Five years on from the release of Doom Eternal, Doom: The Dark Ages heralds the arrival of id Tech 8 and a (demonic) host of technological innovations for both consoles and PC. We spoke with id Software’s director of engine technology, Billy Khan, to find out how it was built. Along the way, we discussed ray tracing, shader compilation and traversal stutter, and the engine’s new features, amongst other topics.
It’s rare for developers to share so much behind-the-scenes footage in a public forum, with id providing some fascinating insight into how they managed to balance the classic need for excellent performance with graphical fidelity, while also adding in the physics-based interactions and larger levels that set The Dark Ages apart from its predecessors. It’s clear that a fanatical attention to speed and optimisation is key to the whole endeavour, and it’s interesting to see some specific examples of that in this discussion.
Note that this is a comprehensive, even feature-length interview, only a portion of which is reproduced in text below. As always, small edits have been made to aid clarity and readability. To listen to the full interview, please see the video version embedded below.
What were the goals that you had in mind from the beginning when planning id Tech 8? What was that process like?
Billy Khan: It’s a long-standing process – you don’t finish a game and immediately say “what are we going to do next?” Some of these things you’re seeing now are things we thought about for many years – up to 10 years for myself. We put modular components into the engine that we can expand upon as time goes by. Some of the features you’re seeing in id Tech 8 started off in id Tech 6, thinking about how we need to be able to handle these types of workloads, these types of graphical fidelity, these sizes of maps, these types of complexity in our pixels and lighting, etc.
Whenever we make an engine, it’s purpose-built for the game we’re working on. It’s very important to us that whatever we provide has a direct impact for the player, and that the immersion the player feels is related to the content. So it’s both: it’s about thinking what we need for the game and making sure all of those needs are met, having forward plans, plus some more ad hoc stuff. As you develop, you find some things work and some things don’t.
Those are usually the gems that you get when you collaborate between departments – it’s the symbiotic relationship between the artists, designers, engineers and producers all working towards a common goal. It’s often wrongly attributed to just one department or a few individuals, but that’s not the case – it’s a team effort. That team communicating and working together makes the magic shine, and our job on the engine side is to provide all of the capabilities for the teams to do all of that work.
A lot of games stutter on PC, especially in terms of traversal stutter and shader compilation stutter. It’s a complex issue that varies a lot across engines, but it’s not a problem we’ve encountered in The Dark Ages or prior id Software games. Somehow you’re able to do these massive games with extremely short loading times, no obvious shader precompilation step, and yet no stuttering. How are you achieving this?
Billy Khan: There are a lot of moving parts, but I think one thing is our philosophy when writing code and how we approach our engine. When it comes to shaders, one of the main things we try to do is have a minimal set of shaders with multiple facets – it’s easier for an artist to understand one complex shader than to maintain thousands of shaders. When the vast majority of the content is affected by this minimal set, mistakes make many things slower, but improvements make many things way faster. Artists can understand how it works, find nuances and places for improvement, provide feedback, and we can iterate on those concepts and make them better. If you have thousands of shaders, it’s difficult to polish them all to the same level – some will work really well, some will look really great, and some will be in the middle.
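id hasn’t published the details of its shader system, but the “minimal set of shaders with multiple facets” described here resembles the well-known uber-shader pattern. Purely as an illustrative sketch – every name below is hypothetical, not id’s API – a handful of master shaders can be specialised by feature flags and cached, so all materials draw from the same few heavily polished code paths:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical sketch: a small set of "uber" shaders, each specialised by
// a few feature flags, instead of thousands of hand-written variants.
enum ShaderFeature : uint32_t {
    kFeatureNone        = 0,
    kFeatureNormalMap   = 1u << 0,
    kFeatureEmissive    = 1u << 1,
    kFeatureVertexAnim  = 1u << 2,
    kFeatureTranslucent = 1u << 3,
};

struct ShaderKey {
    uint8_t  baseShader;   // index into a *small* set of master shaders
    uint32_t featureMask;  // bitwise OR of ShaderFeature flags
    bool operator==(const ShaderKey& o) const {
        return baseShader == o.baseShader && featureMask == o.featureMask;
    }
};

struct ShaderKeyHash {
    size_t operator()(const ShaderKey& k) const {
        return std::hash<uint64_t>{}((uint64_t(k.baseShader) << 32) | k.featureMask);
    }
};

class ShaderCache {
public:
    // Returns a compiled pipeline handle for this base shader + feature
    // combination, compiling once and reusing it for every material that
    // shares the same combination.
    uint64_t getOrCompile(ShaderKey key) {
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;
        uint64_t pipeline = compile(key);  // expensive, but happens once
        cache_.emplace(key, pipeline);
        return pipeline;
    }
private:
    uint64_t compile(ShaderKey key) {
        // Stand-in for a platform-specific back end; returns a fake handle.
        return (uint64_t(key.baseShader) << 32) | key.featureMask;
    }
    std::unordered_map<ShaderKey, uint64_t, ShaderKeyHash> cache_;
};
```

Because the permutation space stays tiny, the whole set can plausibly be compiled up front or on first use without a visible hitch, which is consistent with the lack of a long precompilation step.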
Then there’s the business of deciding when these things get executed. The GPU has a big problem when it constantly has to context switch, so we try to pay attention to that. We are also particular about taking latency into account – we don’t want any big variances between frames in our workload passes. For example, if we have a certain workload to do, we want to be doing that same type of work every frame in a way that’s predictable – we don’t want work that sometimes takes 5ms and sometimes takes 100ms. That variance can cause frame bubbles, where you have really fast or really slow frames, then the GPU gets out of sync with the game and that causes issues. It’s important to keep your frame-time in check and as homogeneous as possible. That takes a lot of effort from all departments on all fronts, not only coding but also in terms of artists creating assets and adding them to a level.
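To make that concrete in a hedged way: one common pattern for keeping per-frame cost predictable (not id’s confirmed implementation) is to give variable-sized work a fixed time budget per frame and carry the remainder over, so a burst of work never produces a 100ms frame:

```cpp
#include <chrono>
#include <deque>
#include <functional>

// Hypothetical sketch: amortising a variable-sized workload across frames
// so each frame pays a bounded, predictable cost instead of spiking.
class AmortisedQueue {
public:
    void push(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    // Run tasks until the per-frame budget is exhausted; leftovers wait
    // for the next frame, keeping frame times homogeneous.
    void runBudgeted(std::chrono::microseconds budget) {
        using clock = std::chrono::steady_clock;
        const auto deadline = clock::now() + budget;
        while (!tasks_.empty() && clock::now() < deadline) {
            tasks_.front()();
            tasks_.pop_front();
        }
    }
private:
    std::deque<std::function<void()>> tasks_;
};
```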
One of the big engine features is a focus on ray tracing, with The Dark Ages adding RTGI to the existing ray-traced reflections from Doom Eternal. How did you implement this while ensuring it was still 60fps on consoles, even on Series S?
Billy Khan: Ever since I started programming and graphics, I wanted to do real-time ray tracing. We saw the way the hardware was progressing, made those changes in id Tech 7, and thought it was on the precipice of being really great. We knew we wanted a much larger title, envisioning grandiose battles with many more enemies and gigantic levels – five times bigger than in Eternal, or 10 times bigger in the Atlan levels. That meant we had to create a lot of content quickly, which was a big focus in id Tech 6, 7 and 8. The more quickly you can iterate on content, the more polished and diverse the gameplay encounters can become.
Ray tracing allows us to do WYSIWYG (what you see is what you get), replacing the approximate representation used by artists and designers with exactly what it’s going to look like on PC or console, without bakes that take hours upon hours. You can imagine what kind of speed improvement that is. It’s a 10x if not 100x improvement in iteration time. A lighter can now move a light in the middle of a meeting if we want a spot to be darker, for example. You can imagine what kind of power that gives to the development from a creative side.
The visual fidelity in our cinematics has become much higher, thanks not only to ray tracing and the atmospherics we’ve worked on, but also to the pipeline we have now that allows artists and animators on the cinematic team to work very diligently and quickly to polish these things. The cutscenes are running in real time every time you play the game, so we can replace the content, we can insert things at run time, we can show different player skins. So ray tracing was just a natural way of not only improving the title to make it look really great, but also allowing us to make a better game.
Without ray tracing and with the same design goals, we would have had to extend development by years, because we wouldn’t have had the ability to create the same type of content. From a feature standpoint, we want to provide more feedback to the player and make the Slayer feel really powerful, even more so than in Eternal or 2016, so we added more immersive gore and world destruction. That makes the weapons feel more powerful. If you have a baked lighting solution, some of those things can feel out of place, especially when you’re in a mech smashing into buildings and tearing chunks off. You can’t bake for these dynamic objects, so the ambient lighting would look off without RT; with RT, all of that stuff looks homogeneous and more real.
Is ray tracing included across the board on all console versions?
Billy Khan: Luckily, all the platforms have ray tracing, which is great. There are obviously differences between platforms, but we were able to make sure the majority of the game feels nearly identical across each platform. There are some resolution differences, but the frame-rate is the same. When it comes to RT, everything has RT; there are some nuances to when SSR might kick in for some of the reflections. There are some compromises we have to make, some minor details in the composite materials. But for the most part, everything is rendered through the same process. We didn’t write a PC version versus a console version, it’s just id Tech 8 running everywhere, with some knobs we can adjust to make that work.
Aside from some foliage here and there, The Dark Ages does a fantastic job masking any kind of level-of-detail (LOD) transitions, despite the open maps and long view distances. Is it a virtualised geometry system, like Nanite, or something different?
Billy Khan: It’s not a new geometry system as you’re describing, though we are working on something like that. What we’re doing here is expanding the number of LODs we have in terms of vegetation, and sometimes adding interpolation for some of the LODs. Some of the grass foliage is dynamic GPU grass, which scales with distance via GPU-driven heuristics and tessellates accordingly.
We also have a new system called auto vista LOD generation. In games, you might see a distant castle that you’re going to reach at some point. Back in the day, you’d make a vista model, like a skybox or custom geometry, of what the real thing looks like when you see it from far away. Now, our level designers are able to build these play spaces, populate them with AI and content, and automatically create vista LODs without any hand-crafting. Because we do it automatically, we can create LODs that allow us to fade between them in a way that’s almost imperceptible. As you’re running towards the castle, it may be transitioning from an auto-generated LOD to something that actually formulates all of the individual pieces.
We have this concept of contribution culling, where we have all of these different features that we can dial in based on distance and tune how they fade in and out per platform to gain performance. It might not be moving from one LOD to another every time; it might also be adding some features, going to simpler shaders. We want better pixels, rather than more geometry. If the pixel is lit properly and shows the right data, then it doesn’t really matter how many polygons you have. It’s all these systems working together to give you that effect of being detailed at any distance.
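The actual thresholds and feature sets are internal, per-platform tuning data that id hasn’t shared; purely as a sketch of the concept (with hypothetical distance bands and names), contribution culling can be thought of as a per-object decision that dials LOD, shader complexity and a cross-fade factor from camera distance:

```cpp
#include <algorithm>

// Hypothetical sketch of contribution culling: features fade in and out
// per distance band, with thresholds tunable per platform.
struct ContributionSettings {
    float fullDetailDist;   // inside this: full geometry and shaders
    float simpleShaderDist; // beyond this: cheaper shading path
    float vistaDist;        // beyond this: auto-generated vista LOD
};

struct RenderDecision {
    int   lodIndex;     // 0 = full geometry, higher = coarser
    bool  simpleShader; // drop to a cheaper shading model
    float fade;         // 0..1 cross-fade factor for imperceptible swaps
};

RenderDecision decide(float distance, const ContributionSettings& s) {
    RenderDecision d{};
    d.simpleShader = distance > s.simpleShaderDist;
    d.lodIndex = distance > s.vistaDist ? 2
               : distance > s.fullDetailDist ? 1 : 0;
    // Fade over the last 10% of each band so transitions are gradual
    // rather than popping.
    float bandEnd = d.lodIndex == 0 ? s.fullDetailDist
                  : d.lodIndex == 1 ? s.vistaDist
                  : s.vistaDist * 1.5f;  // arbitrary outer band for the sketch
    float fadeStart = bandEnd * 0.9f;
    d.fade = std::clamp((distance - fadeStart) / (bandEnd - fadeStart), 0.0f, 1.0f);
    return d;
}
```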
Another thing that caught my eye in the demo was the physics system, where you have Crysis-style shacks that are constructed from small segments that have physics and can break apart. How did you get this in the game, and what’s the performance cost? Was this a big part of the design, or just something that someone made?
Billy Khan: All of the things that we do are planned and add meaning for the player in some way. We want you to feel powerful, and if the gigantic projectiles you’re firing don’t have an impact on the world, that diminishes the sense of power. So we wanted to have not only lots of enemies on-screen, but a dynamic gore system and world destruction. We used Havok to build tooling to create content quickly, and worked very diligently on optimisation. We have thousands of objects working together, so they had to be stable and realistic.
We had to do a lot of work on the GPU side, because the more stuff we can push to the GPU, the more CPU time we have for simulation. In fact, because the entire scene graph is on the GPU, the CPU now has the luxury of being able to focus just on physics, sound and gameplay. We have a highly optimised job and scheduling system that takes all of these varying components and arranges them like a jigsaw puzzle to execute as fast as possible. We were able to make sure that the simulation was always stable, and it could also run at a higher frequency to become even more stable.
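id’s scheduler itself isn’t public, but the “jigsaw puzzle” description matches the general shape of a dependency-aware job system. A minimal sketch under that assumption (all types hypothetical): each job counts its outstanding prerequisites, becomes runnable when that count reaches zero, and unlocks its dependents when it finishes, letting physics, sound and gameplay work interleave across worker threads:

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical dependency-aware job system sketch.
struct Job {
    std::function<void()> work;
    std::atomic<int>      pendingDeps{0};  // prerequisites still running
    std::vector<Job*>     dependents;      // jobs unlocked when this finishes
};

class JobSystem {
public:
    explicit JobSystem(unsigned workers) {
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { workerLoop(); });
    }
    ~JobSystem() {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& t : threads_) t.join();
    }
    // Only root jobs (no prerequisites) are submitted directly;
    // dependents are enqueued automatically as their deps complete.
    void submit(Job* j) {
        if (j->pendingDeps.load() == 0) enqueue(j);
    }
private:
    void enqueue(Job* j) {
        { std::lock_guard<std::mutex> lk(m_); ready_.push(j); }
        cv_.notify_one();
    }
    void workerLoop() {
        for (;;) {
            Job* j = nullptr;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !ready_.empty(); });
                if (done_ && ready_.empty()) return;
                j = ready_.front(); ready_.pop();
            }
            j->work();
            // Finishing this job may make its dependents runnable.
            for (Job* d : j->dependents)
                if (d->pendingDeps.fetch_sub(1) == 1) enqueue(d);
        }
    }
    std::vector<std::thread> threads_;
    std::queue<Job*>         ready_;
    std::mutex               m_;
    std::condition_variable  cv_;
    bool                     done_ = false;
};
```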
I also noticed that, once a structure has been knocked down and there’s debris on the ground, the simulation remains active rather than deactivating after destruction to save performance.
Billy Khan: Yeah, instead of being turned off, we put it to rest unless there’s a new impulse coming in, and then we can reactivate it and let it do things, and only simulate it when needed. There are some pieces that always stay at rest, but other things – like smaller pieces – should move if you shoot them.
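This is the standard rigid-body “sleeping” idea, which Havok supports natively; as an illustrative sketch only (not id’s or Havok’s actual API), the per-piece logic might look something like this:

```cpp
// Hypothetical sketch of debris "rest" logic: bodies below an energy
// threshold sleep (no simulation cost) but wake on a new impulse.
struct DebrisBody {
    float vx = 0, vy = 0, vz = 0;  // linear velocity
    bool  asleep = false;
    bool  canWake = true;          // some pieces are allowed to rest for good

    void applyImpulse(float ix, float iy, float iz, float invMass) {
        if (asleep && !canWake) return;  // permanently at rest
        vx += ix * invMass; vy += iy * invMass; vz += iz * invMass;
        asleep = false;                  // a new impulse reactivates it
    }

    void maybeSleep(float sleepSpeedSq) {
        if (vx*vx + vy*vy + vz*vz < sleepSpeedSq)
            asleep = true;               // skip simulation until woken
    }
};
```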
It feels like Doom: The Dark Ages is the first Doom game since Doom 2 to have large numbers of on-screen enemies. How was that accomplished while maintaining performance?
Billy Khan: The physics play a big role in that, with highly optimised ragdolls and a majorly overhauled animation system that runs on the GPU. That freed up more cycles, so we can have more characters, more AIs running around, each with its own thought process governing how it behaves and reacts.
The team also spent a lot of time optimising things like how we access memory. Jacob [Bringas] and his team worked on a new way of taking all the entities that we had in the world and making sure the memory they consumed was aligned in a very particular way, so that most (if not all) of that data is actually meaningful data that you want to use at the current point of the calculation.
In the past, one of the major slowdowns in CPU time was that when you have a bunch of entities with their own data, you have to fetch all of that data when you do something with that entity – and the data might be in 10 different places, which can be very, very slow. That’s how you get cache misses, which is a bit like cooking a recipe, realising you’re out of onions and having to drive to the store to get one before you can start cutting it up and putting it into the dish. What you want is to have all the ingredients you need to make your dish right then and there.
Jacob and the team worked on a factory system and an alignment system where entities are now component-based in a way where the memory they need to work on is right next to each other, so if I need to do AI pathing, I can fetch the information for one entity and have the remaining entities prefetched into memory so they’re ready to be worked on. You can get a 10x speed improvement by doing that, just by fixing your memory fetches. You have to be really clever about it, balancing what data is resident nearby and what’s a little further away. That takes timing and an understanding of the behaviour of the code, so the AI team has done a wonderful job, on top of the wonderful work Oliver [Fallows] did on the scheduler, including scheduling work on heterogeneous CPUs where you have efficiency and performance cores.
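The specifics of id’s factory and alignment system aren’t public, but what’s described is classic data-oriented design: store each component type contiguously so a pass like AI pathing streams linearly through memory instead of chasing pointers. A minimal illustrative sketch, with hypothetical component names:

```cpp
#include <cstddef>
#include <vector>

// Scattered, pointer-chasing layout (slow: each field dereference is a
// potential cache miss, the "drive to the store for an onion" case).
struct EntityAoS {
    struct Transform* transform;   // somewhere on the heap
    struct PathState* pathState;   // somewhere else entirely
};

// Contiguous, component-based layout (fast: linear scans over packed
// arrays; the hardware prefetcher keeps the next entities ready).
struct Transform { float x, y, z; };
struct PathState { int nodeIndex; float distanceToGoal; };

struct EntityStore {
    std::vector<Transform> transforms;   // index i = entity i
    std::vector<PathState> pathStates;   // same index, adjacent in memory

    void updatePathing() {
        // One tight loop over packed data: while entity i is processed,
        // entity i+1's data is already on its way into cache.
        for (std::size_t i = 0; i < pathStates.size(); ++i) {
            pathStates[i].distanceToGoal -= 0.1f;  // placeholder work
        }
    }
};
```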
How does this level of optimisation apply to loading times? I was astounded to see The Dark Ages loading faster than Eternal.
Billy Khan: That’s intentional! I personally just dislike loading times – I don’t like waiting on things. We all have a limited amount of time to play games; I want to maximise that time. Oliver and the team worked really hard on a concept of sector streaming, taking these gigantic worlds and compartmentalising them into cells or sectors – spatial grids of some kind.
Traditionally, we had to load in the entire level, and that takes a lot of time, but now we only need one or two sectors’ worth of content resident at any one time. That means you can have much higher fidelity of data in much more dense worlds. It also means we can predict where you are and stream things in before you get to the next area. So when you first start the level, you just need to load the things in the immediate vicinity.
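The actual sector sizes, prediction window and grid shape are internal details id hasn’t shared; purely as an illustration of the concept (all numbers and names hypothetical), a predictive sector streamer might look like this:

```cpp
#include <cmath>
#include <set>
#include <utility>

// Hypothetical sketch of sector streaming: the world is split into a grid
// of sectors; each frame we load the sectors around the player, plus the
// ones the player's velocity says they will reach next.
struct SectorStreamer {
    float sectorSize = 128.0f;  // metres per grid cell (illustrative)
    std::set<std::pair<int,int>> loaded;

    std::pair<int,int> sectorOf(float x, float z) const {
        return { int(std::floor(x / sectorSize)), int(std::floor(z / sectorSize)) };
    }

    void update(float px, float pz, float vx, float vz, float lookaheadSec) {
        // Stream around the player now, and around where they'll be soon.
        requestRing(sectorOf(px, pz));
        requestRing(sectorOf(px + vx * lookaheadSec, pz + vz * lookaheadSec));
    }

    void requestRing(std::pair<int,int> c) {
        for (int dx = -1; dx <= 1; ++dx)
            for (int dz = -1; dz <= 1; ++dz)
                request({ c.first + dx, c.second + dz });
    }

    void request(std::pair<int,int> s) {
        if (loaded.insert(s).second)
            beginAsyncLoad(s);  // kick an async read; never block the frame
    }

    void beginAsyncLoad(std::pair<int,int>) { /* I/O back end goes here */ }
};
```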
There’s a lot of work on the compression side too. Unlike prior titles, everything is streamed in now: the images in our texture streaming, all our animations, all of the geometry through a world geometry manager, the vista LOD generation, with DirectStorage on Xbox. Our stream database is really optimised to make sure the data that we need at the time is loaded in a particular way. That often leads to sub-second loads. It was so fast that we had to put in a button to let the player know they could actually play, because they were still reading the text. It’s a great problem to have.
Maybe we could do no loading screens for id Tech 9 – this is just hypothetical, but maybe you never see a loading screen, and you just play the game with one long stream. There is a sense of finality and rest when it comes to having loading screens though, a sense of accomplishment that you’ve finished a level, so we might not want to do that even if it was possible.
For even more discussion with Billy Khan, please see the full video embedded above or available here, which goes into the ultra-stable code base that made this rapid iteration possible, the future push for full RT, how cutscenes were improved in The Dark Ages and much, much more.