In 2017, we started a cooperation with Arm, the UK-based chip designer whose chips are found in over 95% of the world's smartphones. Initially, our motivation for reaching out was to see whether we could get some help with the profiling and debugging tools that Arm provides freely for its hardware. Needless to say, optimizing games for high FPS on a wide range of mobile devices, especially mid-range ones, is not an easy task, and we were wondering whether we had done all that we could on our side. Any tips or advice from the people who designed the hardware were bound to help.

Since we did not know exactly what we were looking for or whom to contact, our initial talks with Arm engineers were mostly exploratory. However, word quickly reached the right place, and very soon we were talking with Arm's Director of Ecosystems, Pablo Fraile. One of the purposes of the ecosystem is to provide help to all companies involved in the production of software or hardware that uses Arm's chips.

Working with Arm’s ecosystem engineering sub-team, we wanted help in interpreting the data reported by the debug tools and in finding out whether there was room for improvement. We came to the conclusion that their team could help us achieve better visuals in our then soft-launched game, Spellsouls. More precisely, the idea was to optimize parts of the rendering process in order to make room for more visually appealing effects at a high FPS on mid-range phones. As a result of this effort, Arm’s team proposed that together we create a demo of Spellsouls featuring all the optimizations and show it at GDC, Unite and possibly other conferences.

Our big pain point was achieving the visual effect where brightly lit metals shine so intensely that the light appears to spill over. This effect is called bloom. It is usually achieved through post-processing, which is rather costly on mobile. We wanted to maintain high frame rates (the famous 60 FPS) even on mid-range devices, while sticking to the realistic fantasy art style. With our approach, there was no room left for post-processing.

After a while, Arm's engineers proposed a three-part solution to this problem.

Part 1

The first is the well-known billboard technique, which is computationally cheap and therefore mostly intended for low-end devices. See the following pictures.

Figure 1: Spellsouls battle, shiny metal spots represented with billboards.
Figure 2: Detail from the previous picture.
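
To illustrate why billboards are so cheap, here is a minimal sketch in Python of how a camera-facing quad can be placed around a shiny spot. It is not the code used in Spellsouls. The function name `billboard_corners` and its parameters are assumptions made for this example, and the camera's right and up vectors are assumed to be unit length.

```python
def billboard_corners(center, cam_right, cam_up, half_size):
    """Return the four corners of a camera-facing quad around `center`.

    The quad is spanned by the camera's right and up vectors, so it always
    faces the viewer regardless of how the scene is oriented.
    """
    def offset(sign_right, sign_up):
        return tuple(
            c + half_size * (sign_right * r + sign_up * u)
            for c, r, u in zip(center, cam_right, cam_up)
        )

    # Counter-clockwise order, starting at the bottom-left corner.
    return [offset(-1, -1), offset(1, -1), offset(1, 1), offset(-1, 1)]


# Hypothetical values: a glow sprite at the origin, camera looking down -Z.
corners = billboard_corners(
    center=(0.0, 0.0, 0.0),
    cam_right=(1.0, 0.0, 0.0),
    cam_up=(0.0, 1.0, 0.0),
    half_size=0.5,
)
```

The whole effect then comes down to drawing one textured quad per shiny spot, with no extra full-screen pass.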

Part 2

The second technique is for showing bloom on the environment, and it is somewhat more complex. It involves actually computing the view and light directions, and it relies on a lookup table encoded in a texture, which is good because it keeps the result completely under artistic control. Although it is not the real (post-processing) bloom effect, it can give pretty good results, depending on the artistic vision of the game and provided care is taken not to expose the technique's downsides - as can be seen in the following pictures.

Figure 3: Battleground with metallic structures embedded in the building. The technique was not applied here.
Figure 4: After applying the technique. The metallic parts become shiny and glowing.
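
The exact formula used in the demo is not given here, so the following Python sketch only illustrates the general idea under stated assumptions: the light direction is mirrored about the surface normal, its alignment with the view direction is measured, and that value is used to index a 1D lookup table. The table `GLOW_LUT`, the function `fake_bloom` and the choice of the reflection term are all hypothetical. In a real shader the table would be a texture painted by an artist.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reflect(incident, normal):
    """Reflect an incident direction about a unit-length normal."""
    d = dot(incident, normal)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))


# Hypothetical artist-authored table: glow intensity as a function of how
# well the reflected light lines up with the viewer (0 = not at all, 1 = exactly).
GLOW_LUT = [0.0, 0.0, 0.05, 0.2, 0.6, 1.0]


def sample_lut(lut, t):
    """Linearly interpolated lookup, with t clamped to [0, 1]."""
    t = min(1.0, max(0.0, t))
    pos = t * (len(lut) - 1)
    i = min(int(pos), len(lut) - 2)
    frac = pos - i
    return lut[i] * (1.0 - frac) + lut[i + 1] * frac


def fake_bloom(normal, to_light, to_viewer, lut=GLOW_LUT):
    """Glow intensity for one surface point, looked up instead of post-processed."""
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_viewer)
    reflected = reflect(tuple(-c for c in l), n)  # light mirrored about the normal
    alignment = max(0.0, dot(reflected, v))
    return sample_lut(lut, alignment)


# Light and viewer both straight above the surface: full glow.
head_on = fake_bloom((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
# Viewer looking along the surface: no glow.
grazing = fake_bloom((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
```

Because the response curve lives in the table rather than in the code, changing how the glow looks means repainting the texture, not rewriting the shader.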

Part 3

Finally, the third technique was the most successful performance-wise. This technique renders the additive lights in a separate pass, but at a lower resolution. The final output is composed by adding the additive lights pass to the pass that contains only the directional light. The overall cost of the additive lights is thereby dramatically reduced, albeit at the cost of reduced precision. Frankly, the loss of precision is practically unnoticeable, especially during an intense game and on a small phone screen. With this technique, the 60 FPS target with post-processing became achievable for mid-range devices.
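
The composition described above can be sketched on the CPU in a few lines of Python. This is a toy model rather than the actual rendering code: the two shading functions, the 8x8 frame size, the downscale factor of 2 and the nearest-neighbour upsampling are all assumptions chosen to keep the example small.

```python
def render(width, height, shade):
    """Evaluate shade(u, v) at every pixel centre, with u and v in [0, 1]."""
    return [
        [shade((x + 0.5) / width, (y + 0.5) / height) for x in range(width)]
        for y in range(height)
    ]


def upsample_nearest(image, factor):
    """Stretch a low-resolution image by an integer factor."""
    return [
        [value for value in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def composite(base, additive):
    """Add the additive-lights pass onto the base pass, clamping at 1.0."""
    return [
        [min(1.0, b + a) for b, a in zip(base_row, add_row)]
        for base_row, add_row in zip(base, additive)
    ]


def directional_light(u, v):
    # Placeholder for the full-resolution pass with only the directional light.
    return 0.25


def additive_lights(u, v):
    # Placeholder for the expensive additive lights: a soft glow at the centre.
    d2 = (u - 0.5) ** 2 + (v - 0.5) ** 2
    return max(0.0, 0.6 - 2.0 * d2)


WIDTH, HEIGHT, FACTOR = 8, 8, 2

base = render(WIDTH, HEIGHT, directional_light)
low_res = render(WIDTH // FACTOR, HEIGHT // FACTOR, additive_lights)
frame = composite(base, upsample_nearest(low_res, FACTOR))
```

With a factor of 2 in each dimension, the additive lights are evaluated for only a quarter of the pixels (16 instead of 64 in this toy frame), which is where the saving comes from. A GPU would typically upsample with bilinear filtering, which hides the lower resolution better than the nearest-neighbour stretch used here.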

Conclusion

For me, this technique, though simple, was eye-opening. What I learned from it is that code (especially shader code) should not be viewed as a sequence, but rather as a composition of individual operations. Not all operations require the same precision. So instead of just reading the code, a better approach is to draw a diagram and see which parts are independent and can be computed at a different resolution or precision.

Unite Berlin, 2018

Check out my talk with José Emilio Muñoz López, from Arm, at Unite Berlin 2018. In it, we explain how to profile your game using the Arm tools, and how to identify and resolve pipeline bottlenecks with targeted optimization (e.g. of shaders). Using our game Spellsouls: Duel of Legends as a case study, we look at how to optimize the rendering pipeline and budget for efficient post-processing effects, such as bloom, at high FPS.

I’d also encourage you to check out Arm’s detailed blog on the solutions mentioned above - Post-processing Effects on Mobile: Optimization and Alternatives.