hardware, providing facilities for managing computational resources such as processes, memory, files and other devices. Both have a core program (called a kernel in operating systems and informally called the game loop in game engines) that monitors and mediates the processes that do the real work. Both tend to provide auxiliary (tangentially related) tools, libraries, scripting and user interface facilities that people think of as part of the system. And both provide the means to extend the system. Finally, both game engines and operating systems provide an environment in which particular applications exist and operate.
[I vaguely remember that Doom provided something it called something like DoomOS, mention of which briefly scrolled by when you started Doom on a DOS machine. I can't find any mention of this on the internet so maybe I imagined this. If somebody can find a reference, please let me know, and if it turns out to be relevant to this article, I'll mention it here.]
When trying to explain the overarching principles and design of a game engine, I employ analogies. I spend 90 minutes of lecture time expounding the commonalities between game engine architecture and a theater, in an attempt to reveal a perspective on software architecture (in particular, game engine architecture) which includes not only what the audience experiences, but also the actors, scenery builders, prop makers, and facilities people (janitors, electricians, plumbers, security guards, etc.). I present this analogy in a way that emphasizes that only a tiny fraction of the real estate in a theater involves what the audience cognizes; most of the theater complex entails support functionality (seating, plumbing, wires, hallways, parking, docking bays, scenery shops, dressing rooms, lighting catwalks, scenery flies, sound and light booths, lobbies, concession stands, box office, etc.). By analogy I hope to drive home the point that most of the code in a game engine entails making and tuning the game (as opposed to playing the game). By describing these aspects of a theater I hope to ground game engine architecture in a familiar analogy to a physical space.
But still, year after year, I get feedback from students stating they go through the game engine architecture course without a sense of the big picture. Only at the end do students mentally assemble the various parts. Although one could dismiss their criticisms with, "at least they got the point eventually", one would hope to instill into students the tenets of top-down design. (Bottom-up design has its place also, but in a course aimed at teaching game engine architecture based on requirements gathering from stakeholders, top-down is more appropriate.)
Analogies provide a familiar map of conceptual terrain and help people orient themselves. Drawing an analogy between an operating system (which all computer users have used, and all FIEA programming students have studied as undergraduates) and a game engine (which nascent game programmers will write) helps clarify game engine architecture.
Operating system design is a relatively well-understood problem with relatively mature solutions, and since its goals resemble those of game engines, we can use OS design to guide game engine architecture design.
OS's have many parts, and I will compare and contrast the structure of each of these parts in separate articles:
- Process and thread management
- Interprocess communication and synchronization
- Process and CPU scheduling
- Memory management
- Device management
- Storage and IO management
- Access domains and security
- Application development
The list of topics above is vast and would take volumes to exhaust -- but it is not my intent to write blog posts about how to perform (for example) memory and device management -- or at least, not under the heading of comparing OS's with game engines. I intend to focus on structure, i.e. how to set up a game engine such that these pieces can fall into place and result in an organized and elegant framework.
At their worst, game loops tend to be ad-hoc monolithic behemoths with hidden order-of-operation dependencies. They often resemble a novice programmer's first attempts at coding, where all computations go into "main"; game loops often contain an outer "while" loop with a body consisting of a series of function calls. And if you swap the order of some of those calls, the code breaks in various ways, sometimes catastrophically (which actually makes the problem easier to find and fix) and sometimes subtly, like making the frame hitch a few times per second -- yes, that's happened to me. And if you add a new subsystem, often its "update" call gets tossed inside that gigantic loop. Where should it go? Under what situations should it run at all? What if you need to run something during loading screens when everything else is stopped? What if you want to stop running your new pet subsystem during pause screens? Unfortunately, much of the logic implementing these answers goes right into the main loop.
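A caricature of such a loop makes the problem concrete. All subsystem names below are hypothetical, and the stubs merely record their call order so the hidden ordering dependency (documented only in comments, as is typical) is visible:

```python
# Caricature of a monolithic "everything in main" game loop.
# Subsystem names are illustrative; stubs record call order.

call_order = []

def poll_input(state):
    call_order.append("input")
    state["frames"] += 1
    if state["frames"] >= 3:      # quit after three simulated frames
        state["quit"] = True

def update_ai(state):
    call_order.append("ai")       # secretly reads LAST frame's physics results

def update_physics(state):
    call_order.append("physics")  # swap this with update_ai and bots jitter

def render(state):
    call_order.append("render")   # must come last -- nobody wrote that down

def run_game(state):
    # The archetypal outer "while" whose body is a fixed sequence of
    # calls with implicit ordering constraints.
    while not state["quit"]:
        poll_input(state)
        update_ai(state)
        update_physics(state)
        render(state)

run_game({"quit": False, "frames": 0})
```

Every new subsystem means editing `run_game` and guessing where its call belongs, which is exactly the maintenance trap described above.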
But what else would a game loop look like? How can we structure a game loop so that it works for every imaginable game, yet has elegant structure? Operating system kernels face the same problem. Imagine that each time you installed a new program onto your machine (or launched one), you had to modify the kernel, recompile, reboot and cross your fingers in hope that it doesn't panic.
OS's facilitate application processes, so to make a meaningful analogy with a game engine, we should clarify what on the game engine side corresponds to a process or thread on the OS side. The answer depends on scope. Just as an application is a process that can have multiple threads, a game is a process that can have multiple entities and concurrent tasks. So, in this article, the analogous notion of an application or process will include the following:
- simulation entities (e.g. bots, players, resources, assets) -- anything which has data and operations that act on that data
- jobs -- self-contained tasks including code and data
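To make that mapping concrete, here is a minimal sketch (all class and method names are my own, not from any particular engine) in which both simulation entities and self-contained jobs present the same "process" interface to the engine, just as applications present a uniform process interface to an OS:

```python
# Entities and jobs both look like "processes" to the engine.
# All names here are illustrative, not from any real engine.

class Process:
    """Anything the game loop can run: data plus operations on that data."""
    def update(self, dt):
        raise NotImplementedError

class Bot(Process):
    """A simulation entity: persistent state, updated every frame."""
    def __init__(self):
        self.position = 0.0
    def update(self, dt):
        self.position += 1.0 * dt   # trivial stand-in for real behavior

class LoadAssetJob(Process):
    """A job: a self-contained task that runs to completion."""
    def __init__(self):
        self.done = False
    def update(self, dt):
        self.done = True            # pretend the work finished this frame

processes = [Bot(), LoadAssetJob()]
for p in processes:
    p.update(dt=0.016)              # one simulated frame at ~60 Hz
```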
The analogy between OS kernel and game engine main loop can get blurry. What maps to what? In a monolithic OS, the kernel treats internal modules (e.g. device drivers) and user processes quite separately. Likewise, a game engine has "internal" processes like input polling and "external" processes such as the physics, rendering, AI and controller handling for simulation entities. But even within OS concepts this distinction gets blurred; for example, a microkernel runs device drivers in user space -- so for a microkernel, device processing is "external" to the kernel. So the analogy actually works at multiple levels, in that we can treat various kinds of game processes as being either internal or external, where here "internal" could approximately mean game-agnostic and "external" means game-dependent.
This article focuses on kernel structure and how processes interact with the operating system. Maybe game engines can employ similar concepts and techniques -- or at least draw inspiration from OS design -- to provide structure and organization to the game loop.
Kernel structure
In "Operating System Concepts" (2005), Silberschatz, Galvin & Gagne categorized kernel structures as simple, layered, microkernel or modular. Let's consider each one as the basis for a game loop structure.
Simple
A simple game loop has no formal structure. This is a Bad Thing.
Game code call and dependency graphs often look like spaghetti. Code calls other code directly, regardless of what the caller and callee do. I have seen physics code calling AI code and vice-versa. I have seen rendering code call physics code and vice-versa. Everything seems to call UI code, and vice-versa. It's a mess.
As mentioned above, the main loop tends to have implicit and hidden order-of-operation dependencies that were neither intentional nor desirable. Sometimes the order of operations was well understood and planned but simply not documented. In other situations, the order of operations dependencies arose organically and only became apparent when somebody changed the order of operations, usually to try to fix a bug or add a feature, only to find that doing so mysteriously broke some other feature (and sometimes learning that took months, at which point the original cause was long past and therefore difficult to diagnose). Adding new subsystems tends to require hard-coding changes in the main loop.
The benefits of a "simple" main loop include that it's immediately obvious, by looking at the main loop code, what gets called (at least at the top layer). In contrast, more modular systems tend to use levels of indirection which hide such facts. By comparison, some C programmers complain that C++ code, with its method overriding and virtual functions, tends to hide which code really gets called. But as a veteran C and C++ coder I feel comfortable claiming that such complaints come only from the unenlightened, and that once you get used to dealing with indirection habitually, the benefits of a decoupled system outweigh its lack of immediate transparency. So abandon simple main loops.
Layered
In a strictly layered system, code in one layer only has access to code in the layers below it. In principle, this simplifies writing and debugging because code at any given layer does not depend on layers outside of it. So the core layers tend to be smaller, simpler, easier to understand and easier to make correct.
Layered systems are fine once made, but making them requires foresight, such as knowing which layer depends on what. That's not always obvious. Does the file system depend on the memory system, or vice-versa? It seems like the file system needs to store its contents in memory buffers, so file systems depend on memory systems. But if you want to collect statistics about memory usage and dump that information to a file, the dependencies become less clear. One might introduce a "logging" facility here to try to side-step the issue, but that leads to a blind alley. In fact, this problem has a name, the cross-cutting concern, and the aspect-oriented programming paradigm attempts to address it. Layering presumes acyclic dependencies, which, while desirable, are not a reasonable expectation -- especially for fundamental services.
Modular
I worked on a game in which a system was developed to explicate module dependencies, but not order-of-operation dependencies. This system consisted of a collection of finite state automata which (as usual) consisted of a set of states and transitions. Each automaton also had a collection of "modules" and each module could be associated with a state. Modules also contained a list of dependent modules. Modules had startup and shutdown operations. Whenever an automaton transitioned from one state to another, the system made lists: modules to shut down, modules to start up, and modules remaining intact. Also, each automaton had a "process" which executed. The system could and did have multiple automata running "concurrently" (although behind the scenes they ran sequentially). This system organized and explicated dependencies and allowed extension of what ran in the "main loop" without changing main loop code, but did not provide any means to control the order in which the automata processes ran.
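A stripped-down reconstruction of that scheme (my sketch, not the actual shipped system) shows the core mechanism: each state owns a set of modules, modules declare dependencies, and a transition computes which modules to shut down, start up, or leave intact:

```python
# Reconstruction (not the actual system) of the state/module scheme:
# a transition shuts down modules no longer needed, starts up newly
# needed ones, and leaves the intersection intact.

class Module:
    def __init__(self, name, deps=()):
        self.name, self.deps = name, deps
        self.running = False
    def startup(self):  self.running = True
    def shutdown(self): self.running = False

def closure(modules):
    """A module's dependencies must run whenever it does."""
    result = set()
    def visit(m):
        if m not in result:
            result.add(m)
            for d in m.deps:
                visit(d)
    for m in modules:
        visit(m)
    return result

class Automaton:
    def __init__(self, state_modules, initial):
        self.state_modules = state_modules        # state name -> [modules]
        self.state = initial
        for m in closure(state_modules[initial]):
            m.startup()
    def transition(self, new_state):
        old = closure(self.state_modules[self.state])
        new = closure(self.state_modules[new_state])
        for m in old - new: m.shutdown()          # no longer needed
        for m in new - old: m.startup()           # newly needed
        self.state = new_state                    # old & new remain intact

# Hypothetical modules: both renderer and netcode depend on memory.
memory   = Module("memory")
renderer = Module("renderer", deps=(memory,))
netcode  = Module("netcode",  deps=(memory,))

auto = Automaton({"menu": [renderer], "online": [renderer, netcode]}, "menu")
auto.transition("online")   # starts netcode; memory and renderer stay up
```

Note what the sketch shares with the real system: it explicates module dependencies and lets the set of running modules change without touching loop code, but it says nothing about the order in which module processes execute.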
Furthermore, the execution of automata processes had no formal constraints, and code within each process tended to make function calls in all directions. So even though the "main loop" had structure, the rest of the code was still spaghetti.
One could argue that no amount of kernel structure can rescue application code from lacking structure. But can one provide an organization and communication framework that encourages proper modularity?
Microkernel
In operating systems, microkernel architectures isolate the bare minimum of code that must run in protected mode and relegate all other code (e.g. device drivers) to user space. So how do the various pieces (e.g. device drivers) communicate with each other in a microkernel? They use message passing. Game engines do not usually have dual-mode systems (or do they? Lua provides a "protected mode") but they face an analogous problem: the need for lightweight interprocess (or inter-entity) communication that maintains loose coupling yet high performance. I have heard arguments that formalizing and abstracting what could otherwise occur via a simple function call adds unnecessary overhead. I will argue against that simplified view and propose that the message passing formalism differs from its implementation, so that the formalism itself provides benefits without the drawbacks people naively assume come with the package.
The Mach and NT microkernels have a bad reputation for poor performance, but the L4 microkernel family proved that Mach's problems were due to implementation choices, not fundamental to the microkernel paradigm. The performance issues in Mach arose specifically because of its asynchronous interprocess communication scheme, which L4 addressed by passing messages synchronously and without excess copy and context switch overhead. A solution in a game system should take a page from the L4 book.
I advocate the use of a bare-bones game loop inspired by microkernel architectures, providing these features:
- A unified method to register and execute "processes" (configuring the game itself)
- Formal coupling mechanisms that allow processes to communicate
- A formal structure for augmenting the game engine itself
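A minimal sketch of the first feature (all names are mine, purely illustrative): the loop itself knows nothing about physics, rendering, or any other subsystem; it only runs whatever has been registered, so adding a subsystem never touches loop code:

```python
# Microkernel-style game loop sketch: the kernel only dispatches
# registered processes; it has no knowledge of any subsystem.

class Kernel:
    def __init__(self):
        self.processes = []
    def register(self, proc):
        """Add a process (any callable taking dt) without editing the loop."""
        self.processes.append(proc)
    def run_frame(self, dt):
        for proc in self.processes:
            proc(dt)

ticks = []
kernel = Kernel()
# Hypothetical subsystems, registered rather than hard-coded:
kernel.register(lambda dt: ticks.append("physics"))
kernel.register(lambda dt: ticks.append("render"))
kernel.run_frame(0.016)
```

The remaining two features (formal coupling and engine augmentation) then ride on top of whatever message passing scheme the engine adopts.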
System calls and message passing
Operating systems provide services that user-mode applications use, and apps access these services using system calls. How a system call takes place depends on who makes the request and who satisfies it.
Kernels implement system calls from user-space to kernel-space using interrupts (i.e. traps or hardware exceptions), where a process stores information (like the system call identifier and arguments) in registers and invokes a special instruction. The execution mode changes to protected, control passes to the kernel, and the kernel dispatches execution based on the system call identifier. For the most part, the interrupt mechanism has no useful analog in a game engine, but it's useful background information for contrasting with other modes of communication with the OS, which have more relevance to game engines.
Monolithic kernels treat intra-kernel communication (e.g. a device driver using the service of another kernel feature) as a regular function call, and since that has low overhead, they run fast. As mentioned above, however, monolithic kernels are unwieldy and unstable.
Microkernels unify all message passing via interprocess communication. That includes operations that, in a monolithic kernel, would be simple function calls. Message passing can be either asynchronous or synchronous. Asynchronous message passing allows processes to run independently yet still communicate. The kernel buffers all sent messages, providing them when the recipient can receive them. But asynchronous message passing comes with high costs: Context switches between protected and user modes happen twice as often, and message data gets copied twice as much. Synchronous message passing avoids those costs but imposes a restriction: Both threads must be ready to communicate simultaneously. Usually, that is the case, so this restriction does not cause undue limitations.
Event systems were the topic of an earlier post. Suffice it to say that I advocate providing an event system which supports both asynchronous and synchronous operation, and that for performance reasons you should prefer synchronous operation when it is possible.
Also, read up on the Io programming language. It uses event-based message passing abundantly.
Why wrap what could be implemented as a regular function call in an event? This seems especially useless when the operation is synchronous and bidirectional, i.e. when the called function returns a value. Benefits include debugging, logging and preparation for distributed computing. But those are topics for another post. The most immediate benefit comes from decoupling, or at least postponement of coupling; even if the sender and receiver (or caller and callee) have to know each other at run time, they do not need to know each other at compile-time. To send an event to a specific entity, the sender needs to know the entity's identifier, but it can discover that at run-time. Often, however, the sender of an event does not need to know the receiver, and that is where event-based message passing shines. In all cases, event-based message passing loosens coupling and looser coupling paves the way to more modular code, which is easier to understand, use, debug and maintain.
Game engine extension
One benefit of the microkernel architecture is that all other operating system services (e.g. device drivers) share a unified interface for implementation and addition: they resemble any other user-mode process, such as an application. We expect the same benefit in our game engine. Stripping the game loop down to a bare minimum and facilitating inter-entity communication via message passing allows us to treat services (e.g. file, logging, network access, input handling, display, sound and so on) like any other entity. As far as the game loop is concerned, a device and a game entity act the same. Likewise, as far as any game entity is concerned, another game entity looks like a physical device.
Adding new devices and service subsystems therefore requires no changes to the game loop.
This might seem obvious, but rarely does a game engine use such a unified presentation for devices and game entities.
Analogy breakdown
As with all analogies, this one between game engines and operating systems eventually falls apart. (If the analogy held thoroughly then we wouldn't call it an analogy; we'd call the related things synonymous or homomorphic.) This particular analogy crumbles around the edges:
Operating systems have dual-mode operation which game engines lack and, as far as I can see, have no need for. That might change upon deeper consideration of the actor model of concurrency, but I'm not holding my breath.
Operating systems use hardware timers to implement pre-emptive multitasking. Game engines lack such pre-emption, and in any case multitasking is a topic better suited to a future article in this series focusing on process scheduling.
Conclusion
This article establishes the theme of comparing game engines to operating systems and sets up a chain of topics to explore in future articles. We now have a loose mapping between some pieces, such as game-loop / kernel, game-entity / process and a way to implement message passing to facilitate communication between entities. This article, in isolation, provides insufficient detail on how to implement these ideas, but this article is a setup -- a teaser -- for other posts in the same series. We have set the course to study operating systems to see what we can glean for use in game engine architecture design.
Future articles will explore how to structure process management, interprocess communication, process and task scheduling, device and I/O management, memory management, extension and customization, and integrating pieces into coherent wholes.

3 comments:
This was the most interesting article thus far.
I enjoyed contemplating the logical extreme of this design analogy. I don't have much experience with OSes, but it seems like the result of such an extreme might be an engine in which each entity/actor/object is actually its own process/thread (at the OS or even hardware level). This kind of a structure, which I assume is what you meant by the "actor model of concurrency", may lend itself well to a many-core environment, though it would obviously require a much more structured system for communication between the game objects.
It is interesting to note that games and OSes are two of the fields of software that are most resistant to moving up the chain of abstraction in terms of programming languages. I expect that both OSes and games will continue to be written in a close-to-the-metal language for a long time to come.
As engines become more general, it may be unnecessary -- even more so than today -- to share personnel between game teams and central tech teams. It even seems likely that engine teams may be operating in an entirely different language than the "application" or game teams that are building the systems to run atop their engines. I could envision a three-layered approach (layers are from a language standpoint), wherein core tech teams build an engine in a low-level language, like C or C++; application teams build modular systems to sit atop the engine in an application language -- compiled, perhaps, but likely to an intermediate language like CLR bytecode (Java or C#); scripter/technical designers would connect the modules and create game specific glue code in a much higher level scripting language like Python, Lua, or Ruby.
I know programmers and producers that could fit comfortably at every level in that hierarchy, and it seems like it would leverage more developers' talents without creating the unfortunate situation in which someone who could otherwise contribute to a project is not able to because of technical limitations, or -- worse still -- they contribute, but their contributions require work by other members of the team to disentangle their technical faux pas.
Thanks for letting me know that this topic interests you. Ironically, while and after I wrote it, I had the thought that "nobody will find this interesting; why do I bother with this topic?". So I learned that I have a poor ability to predict what people find interesting. So please do feel free to continue to steer the topic.
In your statement, "each entity/actor/object is actually its own process/thread", you correctly anticipate one of the future posts in this miniseries, and furthermore I agree that this might not seem like as much of an extreme when the number of CPU's increases.
Remember that I come from a background where I wrote and used software intended to run on hundreds or thousands of CPU's, so for me, 3 to 8 cores is awkwardly small -- more than 1 but almost not enough to bother multithreading, or at least, requiring a drastically different approach to multithreading than what I'm used to. And experience in the industry right now reflects that; SPU's go unused and cores go idle. People are much more concerned about optimizing every last ounce of the GPU, which effectively has hundreds of parallel computational units.
Your comment about how game engines tend to snuggle up to the metal is a gripe that a few of us in the industry often share. Iestyn Bleasdale-Shepherd and I often bemoan the fact that some people consider a 20% gain in efficiency worth the monumental effort people put into platform-specific optimizations, where I eschew such optimizations and espouse algorithmic changes.
I think your idea of a 3-layer approach is interesting, and I hadn't considered that before. I have thought about the dichotomy of visual languages (like the dataflow paradigm) and wondered whether they had a place alongside other scripting languages like Lua and Python. In fact, I had a conversation with Paul Varcholic and Dan Mapes about this very topic about 1 week ago, and it's a topic likely to continue as the Media Convergence Lab struggles to find ways to make their production team more efficient.
Anyway, I thought I might interleave the "Game Engine as OS" miniseries with other topics to avoid monotony, but maybe next month I'll proceed with the "process management" part of this series to maintain some continuity.
I vote for a continuation of this topic.