When you see an animated movie like How to Train Your Dragon 2, you may not think about the technology behind its special effects. But the creators and artists responsible for the movie toiled on it for more than five years to deliver flawless animations that are so lifelike that they don’t catch your attention. You don’t notice small bugs. You just marvel at the realistic water or the emotion in the faces or the shimmering of Hiccup’s leather armor.
But the tech behind the film represents the “absolute pinnacle” of technology and creative media, according to DreamWorks. To make the movie, DreamWorks Animation had to remake its computing infrastructure and create new technologies like Apollo, the platform for making the film, and Premo, a tool that artists could use to build images in real-time. These tools make the artists behind the animations much more efficient, tapping both the infrastructure of multicore computers and cloud computing.
Lincoln Wallen, chief technology officer at DreamWorks Animation, and Pete Baker, vice president of software and services at Intel, talked with VentureBeat about the tech behind the movie and the whole foundation of computing infrastructure that allows the company to work on ten movies at the same time. Those movies lead to the creation of more than 5 billion digital files. Here’s an edited transcript of a section of an interview with Wallen and Baker.
VentureBeat: What’s the tech behind the movie?
Lincoln Wallen: What you’re seeing here is the absolute pinnacle of creative media cultural product. It’s the top of the stack there. In our movies, we want and aspire to have the top of the deck. Dean’s produced a fantastic movie. I know you saw a little bit today. I hope you’ve seen it. If not, absolutely see the rest of it.
We’ve got creative and Hollywood here. We have, again, the archetype of Silicon Valley and the core disruptor for the last century and heading into the next, which is silicon. Moore’s Law has been a key transformer of every business and every consumer product in my lifetime. To have Intel as a substantial part of this achievement is natural.
On the other hand, you’ve seen the relationship with the creator, but we also have a software aspect here. Pete leads the software and software services group at Intel. I sit here in the middle as the sort of CTO, CIO, chief disruption officer that’s trying to bring about radical change within our business. These are the elements. This is the customer and these are the enabling pieces. Software is the thing that makes the difference and makes the silicon shine. Intel has recognized that with the amazing resources and effort put into software tools, software libraries, and software resources.
VB: What’s the investment here?
Wallen: What we’ve done is decided to invest in what I hope you’re now getting a glimpse of, which is a radical new way of putting these elements together. DreamWorks has always been somewhat unique, somewhat at the cutting edge. Kate made a reference to the fact that more than 10 years ago, the company made a decision to own and manage its own production platform. Even at that time, that was a significant decision, to be proprietary and to invest in the engineering resources.
To give you a different perspective on that, 10 to 15 percent of my technology resources are what you would call IT. 85 to 90 percent are what we call AT, or Animation Technology, which is about the delivery of business value, not simply the operation of the enterprise. That’s allowed us to manage a platform and take ourselves to the cloud in the first wave, very early on, working with companies like HP and Red Hat to build key elements of what we today call cloud computing. We were already in a place where we had a sense of one element of the compute continuum, which is scalable data centers and infrastructure-as-service compute platforms.
The other piece was still challenging. It was still single-core, generally, starting to become multi-core. We knew that, to respond to that, we needed to put these two things together in a seamless way and not recognize the boundary between client and data center. We had to architect a platform that allowed us to move data and compute load across the two. That’s where the partnership with Intel was critical, because they know their silicon best, but also getting the best out of that software. Now we’re talking about threaded compute down to the few microseconds, being able to measure, schedule, and allocate resources at that level. We took a view that the client architecture, the IA architecture, was a mini-data center.
We took the cloud computing model and turned to focus on the client side. That’s one of the reasons why, working with some substantial enabling software from Intel, we were able to create a cloud model on the client, on the multi-core system. That’s why, when we add more cores to the workstations, the software just goes faster or can do more. It’s now a highly efficient and effective distributed compute platform in its own right. Completely seamless transition from that into a wider data center. We can put these processes together at will in order to create a different type of user experience, some of which you’ve seen and some of which we’ve talked about as we go forward.
VB: How does the tech affect the artists?
Wallen: The impact of that on the artists was that we were able to go into design processes, as we mentioned, that didn’t start from, “What can we do?” but more, “What do you want?” They gave us incredibly pure and clear ideas about how business processes should be organized, if they took off all the constraints. You think about that process taking place in many other businesses that take a step toward being first of all digital, second of all cloud, and third of all using the compute continuum in this way. Then you can see the enormous changes that this could bring on already-available infrastructure, software, and component pieces. It’s about putting them together in the right way.
What I take away from the movie is the courageousness of the camera and the acting. The animators chose to have sequences and action in the movie — not just fighting action, but subtle, emotional action — that animators would cringe at doing, whether it’s the closeness of expression or the emotional points. One scene I love is with one character mimicking another. It’s incredible, the aliveness about those scenes. The ability to explore was one of the key things the animators got back.
On the enterprise side, we’re able to sit back and move resources around, apply compute at exactly the place where it matters most. The combination leads to better movies done more efficiently with more flexibility, more agility, and ultimately lower cost. It’s an amazing commercial, artistic, and enterprise achievement. Kudos to Intel for recognizing the opportunity and partnering so closely with us.
It’s also enabling some elements of this with very key libraries that either manage threading or manage scheduling – engineering resources that went down to the chipset level so that we could optimize or vectorize multi-core processing. All of these were necessary to pull it off.
VB: Are those libraries specific to animation, or can they be applied anywhere?
Wallen: Anywhere. One of the most interesting things about this is that almost all of these applications, right until you put the user work flow on top of it, they’re highly scaled compute models for doing whatever you like. Putting simulation in there, putting financial calculations, all those sorts of things are computable. Premo is essentially an ensemble compute engine. If you know about how weather prediction or large-scale scientific computation is done, that’s what’s going on there from an architectural point of view.
The libraries are very generic. They’re either scheduling or threading libraries, at the low level. On top of that is an architecture that knows how to put this together as if it were a large-scale database.
Pete Baker: I’m a vice president in the software and services group. I was thinking while I came down on the plane from Portland, hearkening back about six years ago. On paper this is a really curious marriage. We had a collection of silicon chip folks and artists and storytellers. How do we come together in a shared opportunity and get something out of it?
In reality, Intel has a host of software engineers. We have thousands of software engineers. There’s the traditional folks who write BIOS drivers and firmware, but the group I work in also has the privilege of dealing with third parties to make their software better. That could be defined as faster, or as taking advantage of new capabilities, utilizing our tools to do so. We have some of the world’s foremost algorithmic and optimization experts. That makes sense. We know the silicon intimately. That knowledge allows us to convey insights into software, be it our own or others’, that is unique in the industry.
We fast-forward about six years. As I’m reflecting back, I said, “This was a little more than curious. It was fascinating.” We thought we went into a partnership with a collection of artists and animators and storytellers. They’re actually quite a technology company as well. They’re so enthusiastic about the technology and how to use it to convey those stories, those emotional things that people can see, that jump off the screen. We have that shared enthusiasm for technology and bringing stories to life.
The realities are that we could bring to bear, of course, the benefits of the performance and capabilities of our silicon roadmap. That’s table stakes. Beyond that, we also have a host of software tuning, optimization, and creation tools that we’ve been able to bring to bear against the problem, as well as these software and tuning experts. We’ve been working hand in hand now for five-plus years — designing the software, optimizing the software, making sure that the work flows are clean and useful and work best on our silicon. It’s so gratifying to see the fruits of that labor on the screen.
Wallen: They used to be in an analog world, where they drew on paper. That wasn’t so long ago. Then we dumped them into the digital world and made them act like CAD engineers. Now we’ve managed to put them back in a purely creative experience, an analog experience, where they can see a curve and draw it. They can achieve it in the same way. Their graphic skills, their visual skills, are now immediately reflected by the behavior of the digital medium. That all comes from sheer power.
VB: Does everyone use the electronic stylus now? Or does anybody still use paper?
Wallen: Our storyboard artists sometimes do, but generally they start drawing on a Cintiq, so it’s digitized and easily manipulated. Certainly it’s the case that some of the animators have become so used to a mouse and keyboard that it takes them a while to say, “I don’t have to do that anymore.” It’s been an interesting transition. But all animators have these setups now, with adjustable Cintiqs to get the right ergonomic position.
Baker: It’s been very freeing for them. A few people have to make the transition from working with—They know where everything is on the keyboard. They’ve been trained to translate their creative impulses into these numeric entries. But now, once they’ve spent a couple of weeks to learn how to use Premo, for example, they don’t want to go back. There were a couple of instances where animators had to go backward. They did that animated Christmas card, and a couple of animators had to go back into it. They said, “Oh my God!” They’re completely invested in this way of working. It feels so natural now that to go backward felt like a real slog.
Wallen: One of the things that has allowed us to move so fast with such a radical change is working so closely with the animators. My engineering team works daily with the animators, refining the work flows, as well as working on the underlying architecture. The whole software delivery process was pretty much like a movie, with directors at the helm. People like Fred and Jason and Rex. Their vision was realized on the screen once more. But this time from a work flow point of view.
VB: You talked about how the old platform had roots going back to the ‘80s. With the new one, are you trying to build something with lots of headroom? 10 years from now, can you still be using this in some form?
Wallen: This was one of the core motivators. As you can imagine, changing the software underneath a business that’s making maybe 10 movies at any given time, moving artists around across the globe to bring those out, three a year, is not easy. You only want to do radical changes in a very careful manner.
The first radical change I mentioned occurred before this, moving the production platform into an infrastructure-as-service model. That was already in place. The artists across all the movies were using a common platform. That gave us a target, as well as an operating model. The cloud was a natural part of our environment. It wasn’t something new. It was something to be exploited. It also wasn’t something we were using to somehow reduce the data center. It was there as a tool.
Then Intel came along and said, “Yeah, well, it’s four cores at a time now, but we see 60 over here. Let’s start talking about Xeon Phi.” This is an interesting point from a silicon point of view. The question here is, what is the scalability of the computing model? The single-core IA architecture is a very scalable computing model. It’s gotten faster over the years. Multi-core vectorized starts to make you think about other forms of compute. The question is, how easy is it to move code on and off of that, between CPU and anything else? How do you scale it across multiple CPUs? The commonality of the IA chipset across any one of these types of platforms, whether it’s a Xeon Phi with 60-odd cores, or whether it’s a Xeon with eight cores today, is irrelevant from a compute model. We’re able to compile and integrate. We can run Steven’s image on one core, or we can run it on a thousand.
That was critical. But we wanted to put an architecture in place and build an architecture that would naturally scale, regardless of how the platforms changed. We know we can scale the data centers. But we also need to scale at a micro-architecture level.
Baker: We like to think of it N-way parallel, so it’s extensible to the future.
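An editorial aside: the "one core or a thousand" idea can be illustrated with a minimal sketch. This is not DreamWorks' or Intel's code. The function name `partition` and the list of frame numbers are hypothetical, and the sketch only shows the general pattern of splitting independent work into as many near-equal chunks as there are workers, so that the same job description fits any core count.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def partition(items: Sequence[T], n_workers: int) -> List[List[T]]:
    """Split items into contiguous, near-equal chunks, one per worker.

    Hypothetical helper for illustration only. If there are more workers
    than items, only as many chunks as items are produced.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    # Never create more chunks than items; keep at least one chunk.
    n = min(n_workers, len(items)) or 1
    base, extra = divmod(len(items), n)
    chunks: List[List[T]] = []
    start = 0
    for i in range(n):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(list(items[start:start + size]))
        start += size
    return chunks


frames = list(range(10))          # ten hypothetical frame numbers
one_core = partition(frames, 1)   # a single chunk holding all ten frames
three_cores = partition(frames, 3)
```

Here `three_cores` holds chunks of four, three, and three frames. The caller never changes how it describes the work; only `n_workers` changes as hardware grows.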
Wallen: Yeah. We’re in no trouble with that. We’ve been through two generations now of putting machines in, running software, and all the goodness comes through to the artists. We use that as a way of adjusting the speed or expectations in the productions. When we put a new machine in, we know how much more effective that is through the software. It’s like turning up the clock.
VB: There have been complaints in recent years as the demand of complex data kept accelerating. Rendering time wasn’t able to come down. Are you able to solve that now?
Wallen: Computing something in some abstract sense is just going to take the time it takes. You can improve that computation and take out all the inefficiencies in it, but in the end you’ve got some work to do. The question about time to compute on a single-core model is literally about the speed of that core, or how many of those inefficiencies you’ve been able to take out.
When you have a scalable compute model, it’s no longer about the individual efficiency of a given operation. It’s about your orchestration. Can you get the data to a larger and larger number of cores to design just how fast you want that computation to take place? That’s what we have achieved. It’s scalability. We can adjust the time a given task takes.
The reason we can take the proxies, those reduced-complexity characters, out of animation is because it really doesn’t matter how complex it gets. We can always scale to keep the frame rate where it needs to be for the animation process. We can be courageous in designing into the software that ideal workflow. We’re not continuously hedging against whether or not the filmmakers will come up with an idea, like a massive dragon, that will blow the compute budget. That’s a memory question and a compute question.
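As a rough editorial illustration of "designing how fast you want the computation to take place," the sketch below estimates how many cores a perfectly divisible job would need to finish within a target time. It is a simplification under stated assumptions, not a description of Apollo or Premo: the function name `cores_needed`, the `efficiency` parameter, and the example numbers are all hypothetical, and real workloads also pay for data movement and scheduling.

```python
import math


def cores_needed(work_core_seconds: float,
                 target_seconds: float,
                 efficiency: float = 1.0) -> int:
    """Estimate the cores required to finish a job within target_seconds.

    work_core_seconds: total work, measured as seconds on a single core.
    efficiency: assumed fraction of each extra core that does useful
        work (1.0 means ideal, linear scaling). Hypothetical parameter.
    """
    if work_core_seconds < 0 or target_seconds <= 0:
        raise ValueError("work must be >= 0 and target must be > 0")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # Fixed work divided by the time budget gives the required parallelism.
    return max(1, math.ceil(work_core_seconds / (target_seconds * efficiency)))


# A task worth 240 core-seconds, wanted back in 10 seconds:
ideal = cores_needed(240, 10)                     # 24 cores
lossy = cores_needed(240, 10, efficiency=0.5)     # 48 cores
```

The point of the sketch is the shape of the trade-off: with a scalable model, the time budget is the input and the core count is the output, rather than the other way around.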
That’s the feeling that the animators get. I know Dean felt, as he got into the experience of what the production artists were able to give back to him over what period of time, that we can aspire. We really don’t have to think about, “Is it doable?” It’s about what’s best. If there’s one message to take away, if you think about businesses looking at their work flows and saying, “I can decide what’s best? That means I can take massive steps in terms of quality of production, agility, and bottom line.” It’s a big deal.
VB: Are there any tools you dream of that you still don’t have, then?
Wallen: The old way, which is to do that in stages, means the animator isn’t really in control of the final process. But when you can compute on demand, essentially, you can embed those processes inside the animator work flow. The animator still sits at the end of the process. That’s enormous. You can really put your hands inside the final movie and start moving things around. There’s no reason you can’t. It’s just a question of, “Do you want to do that?” Do you want to do it now or do you want to do it when Dean’s in the room? It’s a matter of judgment based on your outcomes.
For our artists, it makes a big difference now, because historically, with the legacy software, as powerful as it was, it was legacy. It was becoming very difficult to do new features with. What we’re seeing now is that even in the life span of one movie, Apollo changed radically in the hands of the artists. What we’re seeing in the next year or two years (film, film, film) is that what you saw today is version one. We’re at the infancy.
That’s why we talk about a platform. Again, in the visual arts and in many other areas, software typically comes to users in storable packages. It’s about an isolated feature set where you think about, how do I get a data set to work on? You do the work and do something with it. As you can see, hopefully, by revealing some of the underlying elements of Torch, we’re really talking about an image creation platform.
What that means is, you’re no longer talking about individual tools installed for individual people. You’re talking about, how does somebody fit within a much wider collaborative process? How can you use that larger resource to make that individual’s work that much more effective? Orders of magnitude more effective.
That’s a very different software paradigm. It’s reminiscent of the consumer paradigm that we’re all using now with apps on cell phones, but that paradigm used with significant compute for value, not commoditized compute where it’s bits of data moving between points. This is a whole design and manufacturing process. That’s why we think it has far wider implications.
Baker: The platform software, too, it has a unique advantage in that it’s designed with the customer right there. The folks who are going to use this look over the engineers’ shoulders saying, “No, I want it this way.” It adds so much to the resulting products, and they’re continuously evolving because the customers are demanding high quality. They want to plug it right into their work flow and get to work.
VB: How has it pushed what you’ve been able to do?
Baker: It’s been fascinating for us. Lincoln talked about some of their data constraints and compute constraints that they had hit up against previously. We look at that as a wonderful opportunity. You guys want to do what? How many MIPS do you need? Let’s go.
Wallen: One of the key things [is that] Intel’s tools around the IA architecture are fantastic. We had one challenge, though. We were scheduling down to the microsecond. And so exposing that data through VTune and other instrumentation tools, it was like, “Okay, guys, this is what we need.”
VB: From Intel’s perspective, are you learning things over time that apply to applications consumers care about? It seems like someday the web might be able to use a lot of this stuff. When we use web applications, we have lots of cores at our command too.
Baker: Yeah, I think so. You asked earlier about applicability to the games environment. Some of the stuff is very similar to how game environments work today. It’s just interactive, right? All this learning we’re developing with pioneer technologists like DreamWorks, we can apply that back to our products that will eventually show up in mainstream users’ hands. It may take a while, but that’s ideally the intent. We need to make better products that are more performance-oriented, more competitive, and cheaper. What we’re talking about in workstations here can end up in your cell phone 10 years from now.
VB: If something did come along and change everything, like virtual reality, could you adapt all of this?
Wallen: We don’t need to adapt it. The beauty here is that we are movie-makers. This is a movie studio. But we make very particular types of movie, in the sense that we generate our images entirely. We have total control of the image, which means that when you talk about something like virtual reality, where you’re playing with depth and projection, that’s trivial. We do that all the time with stereo. This platform is incredibly well-set-up to generate images for consumption environments that have all of these characteristics.
This is very different from live-action movies, where essentially you’re capturing data and that fixes its characteristics in your media. That’s one reason why a lot of live-action directors are moving toward the image-generation paradigm. It’s just that much more flexible. You can go much further.
We have in fact already done this. These technologies are new, emerging. They’re not really in the consumer’s hands yet. But we used the Oculus equipment and worked with that company, producing a dragon experience to promote the movie. It was used in New York and I think in Europe. We’ve done a couple of publicity stunts where our team, with the characters Dean’s team created, built something that can run up to five minutes. You control the whole thing, flying on a dragon.
VB: You integrated the cloud. You integrated multi-core. What about GPU computing? Where are you with something like this kind of change?
Wallen: All of our machines have a GPU, all of our client machines, because they use the GPU to put images on screen. We do not make extensive use of the GPU architecture for compute. It’s fantastic compute, delivering a hell of a lot of FLOPs, but it comes back to the point I made earlier about the flexibility of the compute model. We’re running an enterprise whose data loads are varying wildly from project to project, with a software platform that has to be able to accommodate that type of variation in throughput. The flexibility in the compute model is critical.
Certainly to date, the GPU architecture has not delivered the sort of virtualization, allocation, orchestration capabilities and instruction set flexibility that we need to do this type of scalable compute. That’s why Xeon Phi is so interesting to us, because it gave us the vectorized benefits that you often get in a GPU, but with exactly the same compute model. We can run software on a CPU. We can migrate it to Xeon Phi and back again. That’s all about a business choice, about how much scalable compute for this particular process at this particular time for that particular artist.
VB: How long has DreamWorks been on Linux?
Wallen: Our operating system is Red Hat Linux, that’s right. Part of the cloud platform involves the Red Hat MRG large-scale scheduling solution, which we were pioneers with them in refining and getting into this sort of enterprise. Essentially, in the wave of aerospace moving off of proprietary systems and on to commercial off-the-shelf, that was the end of Silicon Graphics and IRIX. The visual effects, advertising, and animation industries were all on SGI machines, all on IRIX. The question was, what are we going to move to? DreamWorks partnered with HP at that time in saying, “The right operating system is Linux.” We worked with HP to qualify and produce the first Linux workstations, which then became standard in the visual design and production industries and have been ever since. We’ve been on Linux since then, since IRIX.
VB: For the art style, there are a lot of cartoon faces on the humans. Do you ever want to try out more realistic human faces for any of your work, besides maybe this one? Or do you think it’s not possible?
Wallen: The issue isn’t actually the image. It’s the behavior. It’s the ability to direct performance down to that granularity. One hope I have is that with the sort of immediate craftability of the software, with Premo in particular, we’re able to approach that point from a different perspective. Certainly on the rendering side, we’re able to scan flesh. We’re able to reproduce that as a rendered process. But the behavior is the key.
VB: Do you want viewers to not be able to discern any computer-animated quality? Tangled has this combination, something more like hand-drawn versus computer-animated scenes. You could tell when it switched over. This part looks much more computerized than that part.
Wallen: One of the beauties of DreamWorks is that it is one of the few, if not the only, scaled animation production facilities and businesses. What I mean by that is that the same business and the same production house makes multiple movies in any given year. Just look at the range of movies that we make, from a stylistic and creative point of view. We don’t have a constraint in terms of the style that a particular story needs to be told in. We’re not only doing looks that border on the realist. We’re doing very stylized movies like Mr. Peabody & Sherman and the upcoming Penguins movie.
The platform is the same, the tools are the same, but the artists have different visions. They can realize those visions as seamlessly as they can think of them.
VB: Were there any times when you felt like you needed something in addition to Apollo, or different from Apollo? Ubisoft is allowing their game developers in different studios to come up with different engines for creating games across the whole company. They could have half a dozen different engines, kind of like maybe having half a dozen different Apollos. It doesn’t seem very efficient.
Wallen: I came to DreamWorks from Electronic Arts, which I think is still the biggest by development resources, and an environment where different studios have different engines. I can take you back to the first transformation that came to this business, one that for me was one of the main reasons why I was so excited to come and work at DreamWorks. DreamWorks had already taken the step of transforming the business to operate on an enterprise platform.
The power that you see today goes all the way back to achieving that and driving through that type of change in the face of resistance from development or artistic or whatever. People don’t like that. They think of it as standardization. But having achieved that, you’re able to respond to these needs in a far more powerful way. More important, you can deploy those changes across the enterprise in a heartbeat.
All of our movies are on Apollo. We haven’t skipped a beat. All of the values that we’re talking about with Dragon are available to all of our projects. That’s the business reason why: We love new ideas, but our goal is to incorporate them as fast as possible into the entire platform, so that they’re available in many different places, immediately and across the business, so that it has the biggest business impact and product impact.
There’s a bigger story there. If you look at the media industry in particular, this is one of the biggest struggles in the industry, to get that scale and get that efficiency in technological innovation. Often you think, “If I can lock a few people away in the room, I’ll get innovation.” Innovation at enterprise scale comes from a different place. It comes from vision and it comes from partners and it comes from the ability of the platform to execute fast and deploy fast and have your business impact. That’s what justifies large investments.
VB: Is there a scenario in which DreamWorks would have more movies in production at a given time because of Apollo, or are there factors other than technology that mean you’re making enough already?
Wallen: There are other factors, like the marketplace. Jeffrey’s talked a lot about what goes into how many movies we make every year. A lot of it these days is just release date availability. There’s only so many movies that can open on any given weekend to great success. What Apollo does do is allow us to capitalize on that flexibility in a way that we wouldn’t have done in the past. It allows us to say, “Penguins of Madagascar was going to open in March, but we love that movie for November. Let’s put it in November.” It gives us an incredible sense of flexibility.
Business agility and product quality, those two things are the key. The agility speaks to your underlying bottom line of cost. The quality speaks to your opportunity in the marketplace. If you can achieve both together, you’re able to affect your outcomes in the business as a whole, which these days rests on our ability to be flexible in the marketplace.
We’ve talked here about movie creation. Hopefully you have a sense that we’re one of the biggest image creation businesses on the planet. We have a platform that makes that incredibly efficient, whatever the type of image you want to generate. You should think about the elements that have gone into this and what that means for other enterprises, other businesses, other types of consumer experience over the next 10 to 15 years. The young consumer is really only consuming video. Image is everything.
VB: The process of rendering, is that getting faster? How long does it take to render the final movie?
Wallen: If everything was in place, given our compute resources we could probably render a movie in less than a week. It doesn’t take that long, if everything’s in place. Now, how long does it take to produce a movie to that final quality? That’s a whole different ballgame.
In How to Train Your Dragon 2 we used 90 million compute hours to render the film. That’s in total, everything — not just the final images. That’s every iteration. The key question isn’t really how close to a release date you can afford to change things. That’s about quality.
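To put the 90 million figure in perspective, here is a back-of-the-envelope editorial sketch. It assumes a "compute hour" means one core running for one hour, which the interview does not state, and it borrows the five-year span from the introduction even though that covers the whole production rather than rendering alone. The function names and the 10,000-core example are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap days


def average_busy_cores(total_core_hours: float, span_hours: float) -> float:
    """Average number of cores kept busy if the work is spread over span_hours."""
    if span_hours <= 0:
        raise ValueError("span_hours must be positive")
    return total_core_hours / span_hours


def elapsed_hours(total_core_hours: float, cores: int) -> float:
    """Wall-clock hours for the same work on a fixed number of cores,
    assuming ideal scaling (an optimistic simplification)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return total_core_hours / cores


TOTAL = 90_000_000  # compute hours quoted in the interview

# Spread evenly over five years, roughly 2,055 cores stay busy nonstop.
avg = average_busy_cores(TOTAL, 5 * HOURS_PER_YEAR)

# On a hypothetical 10,000-core farm, the same work is 9,000 hours of wall clock.
hours = elapsed_hours(TOTAL, 10_000)
```

Under these assumptions the total represents years of iteration rather than a single final render, which is consistent with Wallen's remark that the figure covers every iteration and not just the final images.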