Motion Captured Cinematography “Virtual Camera”
After spending some time with the virtual camera system used on Avatar, seeing what worked and what could be improved, I got together with the amazing Motion Capture team at Electronic Arts to build a system which just might be the most sophisticated virtual camera system available. It was first used on the videogame Madden NFL 12, and then extensively for all the cinematics on Need for Speed The Run, for which I was the Director of Photography.
What’s a Virtual Camera system? Virtual cameras are like motion capture for camera performance. Where motion capture technology allows for incredibly realistic human motion to be put into the computer for CGI and video games, virtual camera technology allows for CGI scenes to be shot with ‘real’ cameras.
Here’s how it works. A giant room is filled with dozens – sometimes hundreds – of HD videocameras arranged around a capture space approximately the size of a tennis court. The cameras are calibrated in space and equidistantly arranged. Inside the capture space is a ‘virtual camera’: simply a small structure, about the size of a real video camera, fitted with reflective markers. The motion capture cameras see these markers and can determine the height and orientation of the ‘virtual camera’ and track its movement with incredible precision. The virtual camera has a monitor on it and is often placed on a steadicam system so it can be moved around the capture space just like a real camera. The dataflow goes like this: the motion capture cameras determine the position of the virtual camera, the virtual camera’s view is rendered out – showing whatever is in the CGI world – then this image is broadcast wirelessly to the monitor on the virtual camera 30 times each second. This allows a real person to shoot a CGI scene with a ‘real’ camera. Amazing.
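The tracking step can be sketched in a few lines. This is a minimal illustration, not the EA system's code: it assumes the reflective markers have already been triangulated into 3D positions, and it uses the standard Kabsch (SVD) method to recover the rig's rotation and translation for one frame. The function name and the sample marker layout are hypothetical.

```python
import numpy as np

def rigid_pose_from_markers(reference, observed):
    """Return (R, t) such that observed ~= reference @ R.T + t."""
    reference = np.asarray(reference, dtype=float)
    observed = np.asarray(observed, dtype=float)
    ref_centroid = reference.mean(axis=0)
    obs_centroid = observed.mean(axis=0)
    # Cross-covariance of the two centered marker sets.
    H = (reference - ref_centroid).T @ (observed - obs_centroid)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = obs_centroid - R @ ref_centroid
    return R, t

# Hypothetical marker layout on the rig (metres), in the rig's own frame.
markers = np.array([[0.0, 0.0, 0.0],
                    [0.2, 0.0, 0.0],
                    [0.0, 0.1, 0.0],
                    [0.0, 0.0, 0.3]])

# Simulate one captured frame: rig turned 90 degrees about Z and moved.
R_true = np.array([[0.0, -1.0, 0.0],
                   [1.0,  0.0, 0.0],
                   [0.0,  0.0, 1.0]])
t_true = np.array([1.0, 2.0, 3.0])
captured = markers @ R_true.T + t_true

R_est, t_est = rigid_pose_from_markers(markers, captured)
```

In a loop like the one described above, a pose of this kind would be solved for each frame and handed to the renderer as the camera transform.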
Why would you go to all the effort? The reason is that we have spent over 100 years making cinema and most of our lives watching it, so we all have an inherent subconscious understanding of how a camera moves. Cameras have mass, they move based on the laws of physics, they don’t suddenly stop or start. Steadicams have a feel to them, as do handheld cameras. Interestingly, you know this texture of movement even if you’re not a cinematographer. In the computer, when hand animating cameras, great animators can sometimes get really close – just like animators animating a human character – but there’s some complex mojo going on when you haul a 50 pound steadicam rig around, and I wanted to capture that nuance.
Here’s a clip from NFS ‘The Run’ showing the results. I operated the steadicam in real life, shooting this CGI scene. Note that because the objects are just ‘virtually’ in the camera space, it’s possible to have the camera move through difficult situations with ease. See how the camera glides over the table after they get up to walk into the back room.
Here’s another scene where we come in the front door of a restaurant and land at a 50/50 by the table. Look how the virtual camera sees the 3D set ‘in’ the real world. There’s a screen on the camera and a big screen on the wall behind which is also showing what the camera is shooting. Because we’re capturing the virtual camera on a steadicam, the motion is completely realistic, yet we’re shooting a virtual world.
Here’s a near-final version where we drop out of the clouds and blend into a steadicam shot.
Together with some very talented programmers, I designed a virtual / procedural camera system called Cinebot. It was in development for about 5 years and it’s been added to the amazing game engine, Frostbite. The goal was simple, but the execution was deliciously challenging: Make video games feel more like a movie. Make video games feel cinematic.
What’s a virtual camera system? A virtual camera system is like a robot director of photography. In film, everything is directed and crafted. Stand over there, walk in that direction, we will put the camera here and use this lens and transport mechanism and compose it like this. Boom. Respectfully, it’s relatively easy because you have control over everything. Also, if something goes wrong, you usually get to try it again. With video games, the situation is very different. You want to make the game look as cinematic as possible but there’s this extra dimension: The user has control! They can run and jump and drive and effectively do an infinite number of things. How the heck do you shoot that? You’re not there to do it, so you need to program something to shoot it in real time while they’re playing!
What I did was design a cinematography robot which lives inside the video game and can be instructed to do certain things, but it also figures out a lot of things for itself. Cine-bot.
There were numerous fascinating challenges. How do you make a robot compose pleasing shots? How does a computer ‘see’? How do you emulate the mass of a real camera? How does a lens alter what’s being shot and how do you make a computer render approximate that? Probably the biggest challenge was how to make the game feel cinematic yet not get in the way of good gameplay.
Here’s a debug image showing how the ‘freelook’ cameras can move around a character by user control. Limits can be set which make sense for gameplay while still having decent composition. All camera parameters can be crafted by the game DOP, including how the math blends, camera ‘mass’, overshoot, rotation acceleration, etc. All the user should feel is “Hey, this is a really cool freelook camera, it feels… right”
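Camera ‘mass’ and overshoot can be illustrated with a damped spring. This is a minimal sketch, assuming a one-dimensional camera that chases a target value; SpringCamera and its parameters are hypothetical and are not Cinebot's actual names or math.

```python
from dataclasses import dataclass

@dataclass
class SpringCamera:
    position: float = 0.0
    velocity: float = 0.0
    mass: float = 1.0        # heavier cameras accelerate more slowly
    stiffness: float = 40.0  # how hard the camera is pulled toward the target
    damping: float = 6.0     # low damping lets the camera overshoot

    def step(self, target, dt):
        """Advance the camera by dt seconds using semi-implicit Euler."""
        force = self.stiffness * (target - self.position) - self.damping * self.velocity
        self.velocity += (force / self.mass) * dt
        self.position += self.velocity * dt
        return self.position

# Ten seconds at 60 updates per second, chasing a target that jumped to 1.0.
cam = SpringCamera()
positions = [cam.step(target=1.0, dt=1 / 60) for _ in range(600)]
```

With these values the spring is underdamped, so the camera swings slightly past the target before settling on it. Raising the damping removes the overshoot, which is the kind of trade-off a game DOP would tune by feel.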
This is a relationship nodal graph showing a bunch of Cinebot cameras (green blocks) and how they blend or cut into each other (yellow blocks). Having a nodal view like this helps visualize custom transitions between each and any camera. You can set a blend between cameraA and cameraB, but a cut if you ever go from cameraB to cameraA. This is effectively like having as many cameramen as you want and defining exactly how the edit would occur between each and any of them, yet the actual cut is created in real-time as you play the videogame.
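The directional rule described above (blend from cameraA to cameraB, cut from cameraB to cameraA) can be sketched as a lookup keyed on ordered camera pairs. This is an illustrative sketch only; Transition, TRANSITIONS, the half-second duration and the smoothstep easing are all assumptions, not Cinebot's data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transition:
    kind: str              # "blend" or "cut"
    duration: float = 0.0  # seconds; only meaningful for blends

# Ordered pairs, so A -> B and B -> A can behave differently.
TRANSITIONS = {
    ("cameraA", "cameraB"): Transition("blend", 0.5),
    ("cameraB", "cameraA"): Transition("cut"),
}
DEFAULT = Transition("cut")  # assumed fallback for pairs with no entry

def transition_for(src, dst):
    """Look up how to move from camera src to camera dst."""
    return TRANSITIONS.get((src, dst), DEFAULT)

def blend_weight(transition, elapsed):
    """Weight of the destination camera, from 0.0 to 1.0, after elapsed seconds."""
    if transition.kind == "cut" or transition.duration <= 0:
        return 1.0
    t = min(max(elapsed / transition.duration, 0.0), 1.0)
    return t * t * (3 - 2 * t)  # smoothstep ease in and out

a_to_b = transition_for("cameraA", "cameraB")
b_to_a = transition_for("cameraB", "cameraA")
```

Halfway through the half-second blend, `blend_weight(a_to_b, 0.25)` gives the destination camera a weight of 0.5, while `blend_weight(b_to_a, 0.0)` is 1.0 immediately because that direction is a cut.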
Here’s an example showing traditional animated cameras, Virtual Cameras and Cinebot all working seamlessly together. The playable sections are Cinebot – they’re dynamic procedural cameras which follow the action yet they feel cinematic. I also used some realtime CG depth of field effects on some shots, which is explained below.
Depth of field was implemented to model what a real lens does as accurately as possible. There is a relationship between aperture, lens focal length and circle of confusion – or blur. A number of games don’t quite get the relationship right, which can cause excessive blurring, for example on a wide lens, and that yields a miniaturization effect which is most likely undesirable. You can do anything on the computer, including a lot of incorrect things! If you’re making CG cameras and lenses, you need to really dig into what the real ones do.
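The relationship between aperture, focal length and blur can be made concrete with the standard thin-lens circle of confusion formula. This is a sketch under the thin-lens assumption, not the implementation used in the game; the function name and the example distances are made up for illustration.

```python
def circle_of_confusion_mm(focal_length_mm, f_number, focus_distance_mm, subject_distance_mm):
    """Blur-disc diameter on the sensor (mm) for a thin lens.

    focus_distance_mm is where the lens is focused; subject_distance_mm is
    the object being measured. Both are measured from the lens and must be
    greater than the focal length.
    """
    aperture_mm = focal_length_mm / f_number  # entrance pupil diameter
    magnification = focal_length_mm / (focus_distance_mm - focal_length_mm)
    defocus = abs(subject_distance_mm - focus_distance_mm) / subject_distance_mm
    return aperture_mm * magnification * defocus

# Same f-stop, focus at 3 m, background at 10 m: wide lens vs. long lens.
wide = circle_of_confusion_mm(24, 2.8, 3000, 10000)
tele = circle_of_confusion_mm(85, 2.8, 3000, 10000)
```

With these numbers the 24 mm lens blurs the background by roughly 0.05 mm on the sensor, against roughly 0.62 mm for the 85 mm lens. A game that gives a wide shot the long-lens amount of blur is asking for far more defocus than a real lens would produce at that scale.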
Facial Animation | Electrooculography
I’ve been involved in CG facial animation for over a decade. While at EA, I was one part of a two-man team which was the first to ever implement geometric facial animations in a videogame (NBA Live 98). From there I kept pushing the technology forward, working with and designing numerous systems. Wired Magazine did a little story on me and the technology we used with House of Moves. The project was Need For Speed: The Run, and we used hundreds of HD cameras arranged in a room, giving us sub-millimeter precision with the facial markers.
3D camera animation analysis from 2D video
There are some amazing camera performances in movies. Imagine if you could extract animation information from a movie – turn it into 3D movement information – and put that into a video game or any CG scene? It’s possible. While working on The Need for Speed, I was interested in getting the game cameras to provide a feel like what we’ve seen in movies. How could we make our game feel like the crazy car chases in the movie Ronin? I wanted the car to feel as fast and frenetic as the car chases were in that ridiculous movie.
I extracted a number of key scenes from the movie and prepared the footage. I have some software which you can apply to video footage and with some massaging and manual work, it’s possible to analyze and determine exactly how the actual movie camera moved through the world at the time the footage was captured.
Here is an analysis of one of the street runs in Ronin. The yellow x’s are placed markers. By analyzing how the markers move in relation to each other, it’s possible to extract 3D camera animation. The graphs at the top are the camera’s rotation values. Boom, we have 3D cameras which feel like the car chases in Ronin!
Let’s summarize: 3 dimensional camera animation is extracted from regular video. The actual camera animation from any movie can be determined through this fascinating process.
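The idea behind this kind of camera solve can be sketched with a pinhole projection and a reprojection error. This is not the software used here, only the general match-moving principle: a solver looks for the camera pose on each frame that makes projected 3D points line up with the tracked 2D markers. The function names, focal length and point positions below are hypothetical.

```python
import numpy as np

def project(points_world, R, t, focal_px, center_px):
    """Pinhole projection of 3D world points to 2D pixel coordinates."""
    cam = points_world @ R.T + t  # world space -> camera space
    return focal_px * cam[:, :2] / cam[:, 2:3] + center_px

def reprojection_error(points_world, R, t, tracked_px, focal_px, center_px):
    """RMS pixel distance between tracked markers and projected points."""
    diff = project(points_world, R, t, focal_px, center_px) - tracked_px
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

# Hypothetical scene points (metres) in front of the camera.
points = np.array([[0.0, 0.0, 5.0],
                   [1.0, 0.0, 6.0],
                   [0.0, 1.0, 7.0],
                   [-1.0, -1.0, 8.0]])
center = np.array([960.0, 540.0])  # image centre of a 1920x1080 frame
R = np.eye(3)
t = np.zeros(3)

# Pretend these are the marker positions tracked in one video frame.
tracked = project(points, R, t, 1000.0, center)

err_true = reprojection_error(points, R, t, tracked, 1000.0, center)
err_off = reprojection_error(points, R, t + np.array([0.1, 0.0, 0.0]),
                             tracked, 1000.0, center)
```

The error is zero at the true pose and grows as the guessed camera moves away from it. Minimizing it frame by frame is what turns 2D marker tracks into a 3D camera path.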
Here was another great sequence driving under the bridge. Look at all that juicy animation data! It’s basically impossible to make procedural noise have the texture and life and believability of this kind of data. We’re getting road imperfections, suspension compression, tire compression: all the delicious physics moving the car/camera is being captured.
The cameras inside the car weren’t totally bolted down and there was a little movement. It had a really nice weight to it – the camera weighs something, the car lurching around causes the camera to react… but what’s actually happening? Throw some markers on the dash and analyze. I extracted some really great interior car camera animation data.
Here’s a shot of the CG camera moving in 3D space, extracted from 2D video.
Overall this process works incredibly well. You can never have too much reference. Reference is everything, and if you can find some way to actually extract data from that great reference, all the better!