Sunday, July 5, 2020

Under the Hood: Explaining SSAO

Today we'd like to explain a new graphical feature that we are introducing to our games with update 1.38. The article is very technical - we have asked our programmers to help out, and the explanation is quite complex. However, we felt that it may actually be interesting for at least some of the people in our audience to be exposed to this material - to see that what is happening under the hood of a game engine involves a lot of research and hard work of our programming team. In addition to the technical details, we thought that providing the context and explaining the performance trade-offs may be useful and important for most of the players.

Jaroslav a.k.a. Cim
(one of our brave and skilled programmers working on graphics improvements)

The TLDR summary of the text below is that Screen Space Ambient Occlusion is a cool new, but performance-heavy, technique to enrich the rendering of our game world. You do not have to use it if you feel it lowers performance too much for your liking, or you may like it and can afford to trade a few frames per second for improved shadow and depth perception. The effect may be subtle and works mostly on a subconscious level, but once you get used to it, it may be hard to go back. It is another milestone in the lighting/shadowing improvements plan that we are now executing, to be followed by new HDR light processing and the introduction of more normal-mapped surfaces in upcoming updates.

The technique has its limitations and quirks. It has been used by several AAA games in recent years, and even if it's not perfect, it helps the human perception system to understand the scene better, and we hope that adding it to the technology mix of our truck sims is beneficial. We will no doubt want to introduce additional ways of shadowing computation that will improve or even supersede it.

We are under constant pressure from a vocal subset of our fan base to improve the looks of our game. At the same time, there is always a desire to make the games run faster. On top of these sometimes competing requests coming from the playerbase, our very own art department is ever eager to get hold of new graphical toys to make our game richer and better. Whenever we introduce a new graphical feature into the game, we try to do it in a way that doesn't hurt the performance for players with older computers, because we don't want to make the game incompatible for our existing customers. That's why there is an option to switch SSAO off completely, and several settings for its quality/performance.

The work of our programmers on the new SSAO/HBAO techniques also required changes to our art and art creation pipeline. All the 3D models in our games had to be revisited by the art department, and every instance where an artist had already applied fake shadows or darkening to a model was changed. For some more complex game models, this was a simplification that actually reduced the number of triangles in them enough to improve rendering performance. To some extent, we have traded part of the future manual effort that would be needed to build individual good-looking 3D models for an algorithmic rendering pass that unifies the shadowing look for the whole scene, helping to "root" objects like buildings, lamp-posts, and vegetation to the terrain.

What SSAO stands for and how it works

Before we start - note that SSAO is a general acronym for "screen space ambient occlusion". The name encompasses all of the various ambient occlusion (AO) techniques and their variants that work in screen space (meaning that they obtain all information at runtime from data that are rendered on the computer screen and into related memory buffers). There is SSAO (Crytek 2007 tech that basically gave a general name to all techniques), MSSAO, HBAO, HDAO, GTAO, and many other techniques, each using a differently tuned approach and each having its benefits and downsides. We have based our approach on a horizon-based technique called GTAO that was introduced in a 2016 paper by Activision.

The ambient occlusion (AO) part of the name means that we evaluate how much of the incoming light (predominantly sky light, but sometimes the computed occlusion also gets applied to other lights) gets occluded at a particular place in the game world. Imagine that you are standing on flat ground - you would see the whole sky above, so there is 0% occlusion, and the ground gets fully lit by the sky. Now imagine that you are at the bottom of a well - you would see only a small patch of sky, which means the sky gets occluded almost 100% and contributes only a little to the ambient lighting in that well, and naturally it is quite dark at the bottom of the well. The level of ambient occlusion at a particular place affects lighting computations and creates shadowed areas in creases, holes, and other 'complex' places. It can be anywhere between 0% and 100%, depending on the surroundings of that place.

Computing the occlusion in high detail and precision is resource-intensive; basically, you would need to shoot rays from every evaluated position in all directions, test whether they hit the sky or not, and then average the result. The more rays you shoot, the better information you get, but at a greater computation cost. This computation could be done off-line, for example when the game map gets saved by its designer. Some games and engines use this approach. But that way you are only able to bake ambient occlusion information about static non-moving objects, because no vehicles or animated objects are present at that time.
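To make the ray-shooting idea concrete, here is a minimal Python sketch. It is not code from our engine: the cylindrical well, the function names, and the ray count are all made up for illustration, and the rays are averaged uniformly over the hemisphere, whereas real renderers often weight them by angle.

```python
import math
import random


def sample_hemisphere(rng):
    """Return a direction uniformly distributed over the upper hemisphere (z >= 0)."""
    z = rng.random()                      # uniform cos(theta) gives uniform solid angle
    phi = 2.0 * math.pi * rng.random()
    r = math.sqrt(max(0.0, 1.0 - z * z))
    return (r * math.cos(phi), r * math.sin(phi), z)


def ray_escapes_well(direction, radius, depth):
    """True if a ray from the centre of the well floor reaches the sky.

    The well is a hypothetical vertical cylinder; depth == 0 means flat open ground.
    """
    if depth <= 0.0:
        return True
    dx, dy, dz = direction
    if dz <= 0.0:
        return False
    # Horizontal distance travelled by the time the ray has climbed to the rim.
    horizontal = math.hypot(dx, dy) * (depth / dz)
    return horizontal < radius


def ambient_occlusion(radius, depth, ray_count=20000, seed=1):
    """Fraction of rays (0.0 to 1.0) that fail to reach the sky."""
    rng = random.Random(seed)
    hits_sky = sum(
        ray_escapes_well(sample_hemisphere(rng), radius, depth)
        for _ in range(ray_count)
    )
    return 1.0 - hits_sky / ray_count
```

Calling `ambient_occlusion(0.5, 0.0)` models the flat ground case and `ambient_occlusion(0.5, 10.0)` a deep, narrow well. Note how many rays a single position needs before the estimate settles, which is why doing this for every pixel in every frame is out of the question.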

So instead of baking static information (which would also take a lot of time and storage space given the scale of our world map), we want to compute it on the fly, in run-time. That way we can compute it also for interaction with vehicles, opening bridges, animated objects, and so on. There is a catch though. For such a computation approach, we only have data that are visible on the screen (recall "screen space"), so once some part of the game world gets out of the visible frame, it can't be used for occlusion evaluation. This limitation creates various artifacts such as disappearing occlusion on a wall originally caused by an object that just got behind the edge of the screen and thus became invisible not just for you but also for the algorithm, so it ceased to contribute to occlusion computation.

Ok, now we know what to evaluate (ambient occlusion) and we know what data we have (what we see on screen). What do we do? Well, for each pixel on the screen (that is 2 million pixels in HD resolution, times four(!) in 400% scaling!) our shader code needs to query the z-buffer value of its surrounding pixels, trying to get a notion of the geometric shape of the area surrounding it. We can do only a limited number of these "taps" because the performance cost rises steeply with the tap count; this is an operation that really taxes the 3D accelerator. The limit on the number of taps, in turn, affects the ambient occlusion precision (and in certain situations may create inaccuracies and banding). Imagine that you want to evaluate your surroundings on a 2-meter straight line, and are willing to spend 8 taps to approximate it. You query the line every 25 centimeters, and any detail smaller than that may go totally unnoticed unless you are lucky and hit it spot on (or unlucky, because you may miss it the next frame, so the surroundings would suddenly appear to change between frames and cause flickering). The further your algorithm probes, the less precise it is. So you need to limit the size of the area you analyze around each game pixel, which in turn limits how far the AO 'sees' - that is why it is not suitable for computing occlusion in large spaces like under bridge arches.
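The arithmetic in this paragraph can be checked with a few lines of Python. This is only an illustration: the 10 cm post, its position, and the helper names are invented for the example.

```python
def tap_positions(length_m, tap_count, offset_m=0.0):
    """Evenly spaced sample positions along a line, starting at offset_m."""
    step = length_m / tap_count
    return [offset_m + i * step for i in range(tap_count)]


def taps_hitting(positions, start_m, width_m):
    """Count taps landing inside an obstacle occupying [start_m, start_m + width_m)."""
    return sum(start_m <= p < start_m + width_m for p in positions)


pixels_hd = 1920 * 1080            # about 2 million pixels
pixels_400 = pixels_hd * 4         # 400% scaling renders four times as many

spacing = 2.0 / 8                  # 0.25 m between taps on a 2 m line

# A hypothetical 10 cm wide post standing between 0.30 m and 0.40 m along the line.
missed = taps_hitting(tap_positions(2.0, 8), 0.30, 0.10)
# The same taps shifted by 10 cm (a different frame, say) put one tap on the post.
hit = taps_hitting(tap_positions(2.0, 8, offset_m=0.10), 0.30, 0.10)
```

With the taps at 0, 0.25, 0.5 m and so on, the post falls between two samples and `missed` stays at zero. After the shift, the tap at 0.35 m lands on it. That frame-to-frame difference is exactly the kind of thing that shows up as flickering.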

We have mentioned that the technique of our choosing is horizon-based. This means that we are not probing the environment by shooting rays in the 3D world. Instead, we analyze a hemisphere above/around each pixel to see how far it opens up until it is blocked, that is, how much light is let in by the surrounding geometry, using the z-buffer as our proxy. The hemisphere is actually approximated by several runs along a line rotated around the given pixel. If we can follow along this hemisphere in full, there is no occlusion. If the algorithm taps a value in the z-buffer that would block incoming light, it defines the level of occlusion. The algorithm is optimized for performance, but its limitation is that once it hits anything, even possibly a small object, it stops probing any further. This may cause an "over occlusion" problem and may be spotted as a visual artifact when some relatively thin object such as a traffic sign post causes strong occlusion on a nearby wall. You can try to detect such small objects and skip them, which in turn may produce "under occlusion" on thin ledges. We have opted for the former.
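A heavily simplified sketch of a horizon search follows, in Python rather than shader code. It is not our implementation: it assumes a top-down height map standing in for the z-buffer and a surface normal pointing straight up, it keeps the highest horizon found instead of stopping at the first hit, and the per-direction occlusion is taken as the sine of the horizon angle, as in HBAO-style formulations. All names and default values are hypothetical.

```python
import math


def horizon_occlusion(height, x, y, directions=4, taps=8, radius=8.0, cell_size=1.0):
    """Estimate occlusion (0.0 to 1.0) at cell (x, y) of a height map.

    height is a list of rows; height[y][x] stands in for the z-buffer.
    """
    rows, cols = len(height), len(height[0])
    base = height[y][x]
    total = 0.0
    slices = 0
    for d in range(directions):
        angle = math.pi * d / directions          # rotate the line around the pixel
        dx, dy = math.cos(angle), math.sin(angle)
        for sign in (1.0, -1.0):                  # march both ways along the line
            max_sin = 0.0                         # sine of the highest horizon so far
            for t in range(1, taps + 1):
                dist = radius * t / taps
                sx = int(round(x + sign * dx * dist))
                sy = int(round(y + sign * dy * dist))
                if not (0 <= sx < cols and 0 <= sy < rows):
                    break                         # off screen: no data to use
                rise = height[sy][sx] - base
                if rise > 0.0:
                    run = dist * cell_size
                    max_sin = max(max_sin, rise / math.hypot(run, rise))
            total += max_sin
            slices += 1
    return total / slices
```

On a completely flat map the function returns 0.0. Putting a single tall, one-cell-wide post next to the evaluated cell already raises the horizon for one whole direction, which is a toy version of the "over occlusion" from thin objects described above.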

There is also another interesting and useful property of horizon-based techniques. Depending on how much of a hemisphere above a given pixel is occluded, you can compute the direction that is least occluded. The amount of occlusion can be thought of as an ice cream cone with a varying apex angle, oriented in that direction. This direction is called a "bent normal", and we use it for various light computations, such as occluding a reflection on shiny surfaces. The idea is that if you look at the surface and the mirror-reflected direction falls outside this cone, we consider it (at least partially) occluded and tune down the reflection intensity. The best way to see that effect is to look at bigger, round chrome parts, like the diesel tanks, with SSAO on and off.
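The cone test can be sketched as follows. Again, this is an illustrative Python guess at the general idea and not our shader: the linear falloff around the cone edge, the `softness` parameter, and all function names are assumptions.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reflect(incident, normal):
    """Mirror the incident direction (pointing toward the surface) about the normal."""
    k = 2.0 * dot(incident, normal)
    return tuple(i - k * n for i, n in zip(incident, normal))


def reflection_visibility(view_dir, normal, bent_normal, cone_half_angle, softness=0.2):
    """Return 1.0 when the mirrored direction is well inside the cone, 0.0 well outside.

    Angles are in radians; softness is a hypothetical width of the fade at the cone edge.
    """
    r = normalize(reflect(normalize(view_dir), normalize(normal)))
    cos_angle = max(-1.0, min(1.0, dot(r, normalize(bent_normal))))
    angle = math.acos(cos_angle)
    # Linear falloff across a band of `softness` radians centred on the cone edge.
    t = (cone_half_angle + 0.5 * softness - angle) / softness
    return max(0.0, min(1.0, t))
```

Looking straight down at a surface whose bent normal matches its normal gives full visibility, while a grazing view whose mirrored direction leaves a narrow cone gives zero. The returned factor would then scale the reflection intensity.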

So you see, the idea is not that hard, for an expert graphics programmer anyway ;), but there is a lot of computation involved, putting quite some strain on the 3D accelerator. So we have created several performance profiles, each using a mix of optimization techniques:
  • Using fewer taps per direction - it is faster, but lets AO miss bigger objects than it would with finer sampling.
  • Reprojecting AO results from the previous frame - it lets us hide the artifacts from undersampling, but may create ghosting when reprojection fails (when what you see changes a lot between frames).
  • Rendering in half-resolution - it reduces the number of computations to 1/4 but creates less fine AO, so the result may be slightly blurry shadowing.

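The trade-offs in the list above can be put into rough numbers with a small Python sketch. The direction and tap counts, the blend weight, and the fallback to the fresh value when reprojection fails are all assumptions chosen for illustration and do not describe our actual profiles.

```python
def taps_per_frame(width, height, directions, taps_per_direction, half_resolution=False):
    """Rough count of z-buffer queries needed for one AO pass."""
    if half_resolution:
        width, height = width // 2, height // 2
    return width * height * directions * taps_per_direction


def reproject(previous_ao, current_ao, history_valid, history_weight=0.9):
    """Blend this frame's noisy AO with last frame's result for one pixel."""
    if not history_valid:
        return current_ao          # reprojection failed: fall back to the new value
    return history_weight * previous_ao + (1.0 - history_weight) * current_ao


full = taps_per_frame(1920, 1080, directions=4, taps_per_direction=8)
half = taps_per_frame(1920, 1080, directions=4, taps_per_direction=8, half_resolution=True)
fewer = taps_per_frame(1920, 1080, directions=4, taps_per_direction=4)
```

Here `half` comes out at exactly a quarter of `full`, and halving the taps per direction halves the cost again. The blend in `reproject` shows why a stale history value lingers for several frames if it is not rejected, which is where ghosting comes from.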
We hope that all this info was interesting and useful for you. We're sending you a virtual high-five if you read this article to this point. You deserve a cookie and a big cup of hot chocolate! If you still want to get more details about this topic, feel free to check for example this link.

Thank you for your time and support. We will see you again in one of the next articles from the "Under the Hood" section, which we bring to our #BestCommunityEver from time to time.
