Sunday, November 30, 2008

Game Engine as Operating System: Part 1: Game loop versus kernel

Game engines and operating systems share much in common. Both provide a thin layer of abstraction above the bare
hardware
, providing facilities for managing computational resources such as processes, memory, files and other devices. Both have a core program (called a kernel in operating systems and informally called the game loop in game engines) that monitors and mediates processes that do the real work. Both tend to provide auxiliary (tangentially related) tools, libraries, scripting and user interface facilities that people think of being part of the system. And both provide the means to extend the system. Both game engines and operating systems provide an environment in which particular applications exist and operate.

[I vaguely remember that Doom provided something it called something like DoomOS, mention of which briefly scrolled by when you started Doom on a DOS machine. I can't find any mention of this on the internet so maybe I imagined this. If somebody can find a reference, please let me know, and if it turns out to be relevant to this article, I'll mention it here.]

When trying to explain the overarching principles and design of a game engine, I employ analogies. I spend 90 minutes of lecture time expounding the commonalities between game engine architecture and a theater, in an attempt to reveal a perspective on software architecture (in particular, game engine architecture) which includes not only what the audience experiences, but also involves actors, scenery builders, and prop makers and facilities people (janitors, electricians, plumbers, security guards, etc.). I expose this analogy in such a way to emphasize that only a tiny fraction of the real estate in a theater involves what the audience cognizes; most of the theater complex entails support functionality (seating, plumbing, wires, hallways, parking, docking bays, scenery shops, dressing rooms, lighting catwalks, scenery flies, sound and light booths, lobbies, concession stands, box office, etc.). By analogy I hope to drive home the point that most of the code in a game engine entails making and tuning the game (as opposed to playing the game). By describing these aspects of a theater I hope to ground game engine architecture in a familiar analogy to a physical space.

But still, year after year, I get feedback from students stating they go through the game engine architecture course without a sense of the big picture. Only at the end do students mentally assemble the various parts. Although one could dismiss their criticisms with, "at least they got the point eventually", one would hope to instill into students the tenets of top-down design. (Bottom-up design has its place also, but in a course aimed at teaching game engine architecture based on requirements gathering from stakeholders, top-down is more appropriate.)

Analogies provide a familiar map of conceptual terrain and help people orient themselves. Drawing an analogy between an operating system (which all computer users have used, and all FIEA programming students have studied as undergraduates) to a game engine (which nascent game programmers will write) helps clarify game engine architecture.

Operating system design is a relatively well-understood problem with relatively mature solutions, and since its goals resemble those of game engines, we can use OS design to guide game engine architecture design.

OS's have many parts and I will compare and contrast the structure of each of these parts in separate articles:


  • Process and thread management
  • Interprocess communication and synchronization
  • Process and CPU scheduling
  • Memory management
  • Device management
  • Storage and IO management
  • Access domains and security
  • Application development
Throughout these comparisons, we should keep an eye on how we can exploit operating system design principles to solve problems that occur in game engines, especially where game engines have traditionally struggled. Also, pay attention to ways the solutions provided by operating systems are inappropriate for game development, either because the problems lack sufficient overlap or because game development has much more specific problems requiring more focused solutions, and how this insight can help tackle the problems at hand.

The list of topics above is vast and would take volumes to exhaust -- but it is not my intent to write blog posts about how to perform (for example) memory and device management -- or at least, not under the heading of comparing OS's with game engines. I intend to focus on structure, i.e. how to set up a game engine such that these pieces can fall into place and result in an organized and elegant framework.

At their worst, game loops tend to be ad-hoc monolithic behemoths with hidden order-of-operation dependencies. They often resemble a novice programmer's first attempts at coding where all computations go into "main"; game loops often contain an outer "while" loop with a body consisting of a series of function calls. And if you swap the order of some of those calls, the code breaks in various ways, some catastrophic (which actually makes the problem easier to find and fix) or, at worst, subtle (like making the frame hitch a few times per second -- yes, that's happened to me). And if you add a new subsystem, often the "update" call gets tossed inside that gigantic loop. Where should it go? Under what situations should it run at all? What if you need to run something during loading screens when everything else is stopped? What if you want to stop running your new pet subsystem during pause screens? Unfortunately, much of the logic implementing these answers goes right into the main loop.

But what else would a game loop look like? How can we structure a game loop so that it works for every imaginable game, yet has elegant structure? Operating system kernels face the same problem. Imagine that each time you installed a new program onto your machine (or launched one), you had to modify the kernel, recompile, reboot and cross your fingers in hope that it doesn't panic.

OS's facilitate application processes so to make a meaningful analogy with a game engine, we should clarify what on the game engine side corresponds to a process or thread on the OS side. The answer depends on scope. Just as an application is a process that can have multiple threads, a game is a process that can have multiple entities and concurrent tasks. So, in this article, the analogous notion of an application or process will include the following:

  • simulation entities (e.g. bots, players, resources, assets) -- anything which has data and operations that act on that data
  • jobs -- self-contained tasks including code and data

The analogy between OS kernel and game engine main loop can get blurry. What maps to what? In a monolithic OS, the kernel treats internal modules (e.g. device drivers) and user processes quite separately. Likewise, a game engine has "internal" processes like input polling and "external" processes such as the physics, rendering, AI and controller handling for simulation entities. But even within OS concepts this distinction gets blurred; for example microkernels runs device drivers in user space -- so for a microkernel, device processing is "external" to the kernel. So the analogy actually works at multiple levels, in that we can treat various kinds of game processes as being either internal or external, where here "internal" could approximately mean game-agnostic and "external" means game-dependent.

This article focuses on kernel structure and how processes interact with the operating system. Maybe game engines can employ similar concepts and techniques -- or at least draw inspiration from OS design -- to provide structure and organization to the game loop.

Kernel structure

In "Operating System Concepts" (2005), Silberschatz, Galvin & Gagne categorized kernel structures as simple, layered, microkernel or modular. Let's consider each one as the basis for a game loop structure.

Simple

A simple game loop has no formal structure. This is a Bad Thing.

Game code call and dependency graphs often look like spaghetti. Code calls other code directly, regardless of what the caller and callee do. I have seen physics code calling AI code and vice-versa. I have seen rendering code call physics code and vice-versa. Everything seems to call UI code, and vice-versa. It's a mess.

As mentioned above, the main loop tends to have implicit and hidden order-of-operation dependencies that were neither intentional nor desirable. Sometimes the order of operations was well understood and planned but simply not documented. In other situations, the order of operations dependencies arose organically and only became apparent when somebody changed the order of operations, usually to try to fix a bug or add a feature, only to find that doing so mysteriously broke some other feature (and sometimes learning that took months, at which point the original cause was long past and therefore difficult to diagnose). Adding new subsystems tends to require hard-coding changes in the main loop.

The benefits of a "simple" main loop include that it's immediately obvious, by looking at the main loop code, what gets called (at least at the top layer). In contrast, more modular systems tend to use levels of indirection which hide such facts. By comparison, some C programmers complain that C++ code, with its method overriding and virtual functions, tends to hide which code really gets called. But as a veteran C and C++ coder I feel comfortable claiming that such complaints come only from the unenlightened and that after you get used to dealing with indirection habitually, the benefits of a decoupled system outweigh its lack of immediate transparency. So abandon simple main loops.

Layered

In a strictly layered system, code in one layer only has access to code in the layers below it. In principle, this simplifies writing and debugging because code at any given layer does not depend on layers outside of it. So the core layers tend to be smaller, simpler, easier to understand and easier to make correct.

Layered systems are fine once made but making them requires foresight, such as knowing which layer depends on what. That's not always obvious. Does the file system depend on the memory system, or vice-versa? It seems like the file system needs to store its contents in memory buffers, so file systems depend on memory systems. But if you want to collect statistics about memory usage and dump that information to a file, the dependencies become less clear. One might introduce a "logging" facility here to try to side-step the issue but that leads to a blind alley. In fact, this problem has a name, which is called a cross-cutting concern and the aspect-oriented programming paradigm attempts to address it. Layering presumes acyclic dependencies, which, while desirable, is not a reasonable expectation -- especially for fundamental services.

Modular

I worked on a game in which a system was developed to explicate module dependencies, but not order of operation dependencies. This system consisted of a collection of finite state automata which (as usual) consisted of a set of states and transitions. Each automata also had a collection of "modules" and each module could be associated with a state. Modules also contained a list of dependent modules. Modules had startup and shutdown operations. Whenever an automata transitioned from one state to another, the system made lists: Modules to shutdown, modules to startup and modules remaining intact. Also, each automata had a "process" which executed. The system could and did have multiple automata running "concurrently" (although behind the scenes they ran sequentially). This system organized and explicated dependencies and allowed extension of what ran in the "main loop" without changing main loop code, but did not provide any means to control the order the automata processes.

Furthermore, the execution of automata processes had no formal constraints, and code within each process tended to make function calls in all directions. So even though the "main loop" had structure, the rest of the code was still spaghetti.

One could argue that no amount of kernel structure can rescue application code from lacking structure. But can one provide an organization and communication framework that encourages proper modularity?

Microkernel

In operating systems, microkernel architectures isolate the bare minimum of code that must run in protected mode and relegate all other code (e.g. device drivers) to user space. So how do the various pieces (e.g. device drivers) communicate with each other in a microkernel? They use message passing. Game engines do not usually have dual-mode systems (or do they? Lua provides a "protected mode") but they have an analogous problem: the need for lightweight interprocess (or inter-entity) communication that maintains loose coupling but high performance.

I have heard arguments that formalizing and abstracting what could otherwise occur via a simple function call adds unnecessary overhead. I will argue against that simplified view and propose that the message passing formalism differs from its implementation, so that the formalism itself provides benefits without the drawbacks people naively assume come with the package.

The Mach and NT microkernels have a bad reputation for poor performance, but the L4 microkernel family proved that Mach's problems were due to implementation choices, not fundamental to the microkernel paradigm. The performance issues in Mach arose specifically because of its asynchronous interprocess communication scheme, which L4 addressed by passing messages synchronously and without excess copy and context switch overhead. A solution in a game system should take a page from the L4 book.

I advocate the use of a bare-bones game loop inspired by microkernel architectures, which provide these features:


  • Unified method to register and execute "processes" (configuring the game itself)
  • Providing formal coupling mechanisms to allow processes to communicate
  • Formal structure for augmenting the game engine itself

System calls and message passing

Operating systems provide services that user-mode applications use, and apps access these services using system calls. How system calls take place depend on who makes the request and who satisfies it.

Kernels implement system calls from user-space to kernel-space using interrupts (i.e. traps or hardware exceptions), where a process stores information (like the system call identifier and arguments) in registers and invokes a special instruction. The execution mode changes to protected, control passes to the kernel, and the kernel dispatches execution based on the system call identifier. For the most part, the interrupt mechanism has no useful analog in a game engine, but it's useful background information for contrasting with other modes of communication with the OS, which have more relevance to game engines.

Monolithic kernels treat intra-kernel communication (e.g. a device driver using the service of another kernel feature) as a regular function call, and since that has low overhead, they run fast. As mentioned above, however, monolithic kernels are unwieldy and unstable.

Microkernels unify all message passing via interprocess communication. That includes operations that, in a monolithic kernel, would be simple function calls. Message passing can be either asynchronous or synchronous. Asynchronous message passing allows processes to run independently yet still communicate. The kernel buffers all sent messages, providing them when the recipient can receive them. But asynchronous message passing comes with high costs: Context switches between protected and user modes happen twice as often, and message data gets copied twice as much. Synchronous message passing avoids those costs but imposes a restriction: Both threads must be ready to communicate simultaneously. Usually, that is the case, so this restriction does not cause undue limitations.

Event systems were the topic of an earlier post. Suffice it to say that I advocate providing an event system which supports both asynchronous and synchronous operation, and that for performance reasons you should prefer synchronous operation when it is possible.

Also, read up on the Io programming language. It uses event-based message passing abundantly.

Why wrap what could be implemented as a regular function call in an event? This seems especially useless when the operation is synchronous and bidirectional, i.e. when the called function returns a value. Benefits include debugging, logging and preparation for distributed computing. But those are topics for another post. The most immediate benefit comes from decoupling, or at least postponement of coupling; even if the sender and receiver (or caller and callee) have to know each other at run time, they do not need to know each other at compile-time. To send an event to a specific entity, the sender needs to know the entity's identifier, but it can discover that at run-time. Often, however, the sender of an event does not need to know the receiver, and that is where event-based message passing shines. In all cases, event-based message passing loosens coupling and looser coupling paves the way to more modular code, which is easier to understand, use, debug and maintain.

Game engine extension

One benefit of keeping the microkernel architecture is that all other operating system services (e.g. device drivers) have a unified interface for implementation and addition, which resembles any other user-mode process, like applications. We likewise expect the same benefit in our game engine. By stripping the game loop down to a bare minimum and facilitating inter-entity communication by message passing allows us to treat services (e.g. file, logging, network access, input handling, display, sound and so on) as any other entity. As far as the game loop is concerned, a device and a game entity act the same. Likewise, as far as any game entity is concerned, another game entity looks like a physical device.

Adding new devices and service subsystems therefore requires no changes to the game loop.

This might seem obvious, but rarely does a game engine use such a unified presentation for devices and game entities.

Analogy breakdown

As with all analogies, this one between game engines and operating systems eventually falls apart. (If the analogy held thoroughly then we wouldn't call it an analogy; we'd call the related things synonymous or homomorphic.) This particular analogy crumbles around the edges:

Operating systems have dual-mode operation which game engines lack and, as far as I can see, have no need for. That might change upon deeper consideration of the actor model of concurrency, but I'm not holding my breath.

Operating systems use timers to implement pre-emptive multitasking. Aside from timers being absent from game engines, the topic of multitasking is also more appropriate for a future article in this series that will focus on process scheduling.

Discussions of OS design also traditionally distinguish between policy and mechanism, the most common example of which is the notion that the kernel must perform the mechanism behind process and CPU scheduling, but the policies for doing so vary widely, and therefore should exist outside the microkernel. I can provide a strained analogy: Certain operations (such as rendering) require mediation but decisions like the number of rendering passes should depend on the particulars of individual games. But this analogy fails because the game loop itself does not need to perform the mediation. The problem is more apparent in OS kernels because of the protected/user dichotomy. For a game engine, no deep paradoxes exist as they do in an OS, and one could simply make more blanket statements like "make data-driven subsystems" and leave that discussion seperate from the game loop or game engine proper.

Conclusion

This article establishes the theme of comparing game engines to operating systems and sets up a chain of topics to explore in future articles. We now have a loose mapping between some pieces, such as game-loop / kernel, game-entity / process and a way to implement message passing to facilitate communication between entities. This article, in isolation, provides insufficient detail on how to implement these ideas, but this article is a setup -- a teaser -- for other posts in the same series. We have set the course to study operating systems to see what we can glean for use in game engine architecture design.

Future articles will explore how to structure process management, interprocess communication, process and task scheduling, device and I/O management, memory management, extension and customization, and integrating pieces into coherent wholes.

Monday, November 3, 2008

Prototype-based programming language with fast dynamic binding

The game engine FIEA graduate students write each year (as part of their second programming class) uses a prototype-based programming paradigm that shares principles with object-oriented programming (OOP) but the limited version previously assigned lacks several desirable features, partly due to performance considerations. Prototype-based programming readily provides inheritance and encapsulation with no significant performance penalty but polymorphism and even common procedure calls can be expensive due to name resolution. As mentioned in Steve Yegge's article, "The Universal Design Pattern", prototype-based programming comes with some costs: performance and early verification. I want to explore the possibilities of creating a prototype-based programming language, intended for use in a game engine, that addresses the performance issue.

The FIEA game engine implements a very limited prototype-based programming language by using the prototype design pattern and composition to associate attributes with objects (called Entities). Attributes can be either prescribed at compile time or added at run time. The system uses factories to instantiate new objects. Each object implements a "clone" method (a virtual constructor) which duplicates the entity, its attributes and behaviors. Behaviors are something like procedures, implemented using the "strategy" design pattern (and we call these behaviors "Actions"), and objects use composition to contain lists of these behaviors. By cloning objects and modifying them, this simple system provides a simple form of inheritance. (By restricting access to members, we can encapsulate object members, although in some sense this provides no additional functionality, so the assignments omit this property.) The system is simple and somewhat powerful for its simplicity in that it allows game entities to have script-driven properties and behaviors.

The FIEA game engine contains a list of these Entities, and the simulation consists of processing each Entity for each frame by executing a special Action (dubbed the "sim" action). The game engine has a few other features but those are topics for future posts. Essentially (at least for the purpose of this post), the FIEA game engine consists of an ordered list of entities that have attributes and actions, and the "sim" action of each entity runs every frame.

[Note: In a subsequent post, I revisit binding and conclude that we really do want static binding after all. Furthermore, the presentation below suggests (without outright claiming so) that only dynamic binding allows closures, and certain other high-power language features like CPS. That is not the case; such features can be present with static binding. Despite these issues, the expositino below has considerably value and contains no explicit errors. So please read it, but beware that it suggests ideas which might not hold up to scrutiny.]

Past incarnations of this system use early (or static) binding (i.e. names are resolved when parsing configuration files and populating the system), so behavior execution runs relatively quickly, but the configuration language (I hesitate to call it a scripting language) lacks certain rudimentary features such as traditional procedure calls. The system lacks late (or dynamic) binding, i.e. there is no scope tree traversal to resolve names within the runtime environment, so this system lacks polymorphic "method calls" and closures. We could readily add dynamic binding but obvious general and simple solutions would entail run-time name resolution, which could be slow. (The system also has a scriptable event system which allows messages to be passed between Entities, and this provides method invocation somewhat akin to a limited form of how the programming language "Io" passes messages, but those invocations have excessive overhead.) The problem at hand entails adding more power (e.g. procedure calls) but without the performance costs associated with dynamic binding via scope traversal.

Actions in past incarnations of the FIEA game engine differ from traditional procedures in most other languages in that actions do not have positional parameters and cannot be invoked from another action; all parameters are accessed by name, and all actions are called by the core engine (not from script). (Actually, actions inside a "reaction" can be called from script, but ignore that for this post.) It might help to think of the former Entity-and-Action system as being closer to declarative than to imperative, inasmuch as the "language" used to define the system consisted of populating a tree with properties. The description language happens to be in XML and could probably just as well be in JSON. The motivation behind using only named properties and only sequences of predefined Actions has roots in the design goal of allowing a non-technical game developer (e.g. an artist or a producer) to assign attributes and behaviors via a drag-and-drop or other visual content creation user interface where the author does not need to know the syntax of a programming language. (Again, the notion of drag-and-drop programming is a topic for another post.) We therefore wish to preserve this trait of not requiring positional parameters in behavior configuration.

More common and fully-featured programming languages provide the ability to decompose problems into small pieces and solve them separately without the separation introducing undue overhead. In programming terms, it comes down to being able to make "local" function calls cheaply. ("Remote" function calls, which involve synchronizing execution across multiple cores, or accessing memory distributed across multiple cores, is a somewhat separate problem so let's leave that aside for now, except to mention that the existing event system is well-suited to that problem.) Introducing procedure calls into the FIEA game engine (polymorphic or not, late binding or early) should run fast, retain existing features and have a solution that can be implemented in a short amount of time (otherwise FIEA students would not have enough time to write it in a single semester).

Forms of dynamic binding and what they buy you:

The term "dynamic binding" has a variety of meanings depending on context so though I claimed that "the engine should have dynamic binding", the problem statement still requires quite a bit more refinement. Let's review various forms of dynamic binding and features they provide.

Procedure calls bind function arguments in the sense that the formal parameters are mapped to the actual parameters (or arguments) passed at the procedure call. This happens by placing values (in the scope of the caller) into registers or memory that are used locally within the procedure code (in the scope of the callee). This binding could be called static in the sense that the variables passed by the caller are determined at compile time. The only sense in which this binding is "dynamic" is in the limited (perhaps artificial, definitely not conventional) sense that semantically, the arguments refer to different variables at run time. And past FIEA engines lack even this rudimentary form of binding.

If a function can call itself or if a function can be called multiple times "simultaneously" then we say the language supports recursion. This implies that the same formal arguments can be bound to multiple sets of actual arguments (i.e. memory locations) simultaneously. If the mechanism used to pass arguments simply involved storing values into registers or fixed memory locations then a second simultaneous call to the same procedure would overwrite and obliterate the values in those locations, defeating any attempt to call recursively. To support recursion, usually procedure argument values are pushed onto a call stack at the caller side and extracted from the stack at the callee side. Just as with non-recursive function calls, this form of parameter binding does not conventionally get listed as dynamic binding.

Polymorphism via virtual function calls entails dynamic dispatch, i.e. which method gets called (i.e. receives the message passed) depends on the object on which that method is called (i.e. to which the message is passed). In this form of dynamic binding, the compiler cannot resolve which function to call so the decision is delayed until run time. In practice this happens by dereferencing a table of function pointers, and the object holds the address of the table -- so it actually involves a pair of pointer dereferences. Actually it usually involves three pointer dereferences since the object itself is usually referenced by a pointer. (If you had the object itself instead of a pointer to the object, then you wouldn't need polymorphism because you could resolve the method at compile-time.) But despite the fact that the function name gets bound at run time, the arguments are still bound at compile time; in this form of polymorphism, the function call always has the same "signature" or "form" i.e. the formal parameters are always the same, regardless of the object type.

The forms of dynamic binding discussed above apply to C++, Java and C#, but dynamic languages such as Lisp, Io and JavaScript include additional forms of dynamic binding. These include the dynamic lookup of so-called "free variables", the use of "continuations" instead of a call stack, and the use of "duck typing" to decide which methods shall handle messages passed to an object.

Prototype-based languages offer a form of inheritance where each object has a "parent" object used as a prototype to create the new object. The clone operation has at least two choices: Either have the child object hold a pointer to the parent or have the child duplicate all members (including all code and all data) of the parent. The former conserves memory, so seems desirable. If using that method then name resolution (i.e. binding) involves traversing parent pointers and looking for the attributes that match the name of the sought attribute. (Please read Steve Yegge's article for more detail.) Lookup therefore takes at least O(N) time where N is the number of ancestors the child has.

Some programming languages treat functions as first-class objects, meaning (among other things) they can be passed as arguments and returned as any other value. These so-called lambda functions are anonymous procedures which can be defined within the scope of another procedure. They can refer to at least 3 kinds of variables: local, arguments and "free". Free variables are those which are defined outside the context of the procedure body, e.g. inside the procedure containing the definition of the lambda function. (Formal language theory makes a distinction between bound and free variables although both require some form of binding, and in fact binding "free variables" entails a lookup which is more conventionally called "dynamic binding". In this nomenclature, "bound" variables refer to function arguments, which in the current context is admittedly confusing.) With lambda functions comes the problem (called the funarg problem) of deciding how to bind variables which are defined locally within the enclosing scope, after that scope as ostensibly disappeared. An example (in pseudocode) will help explain:


    int procedure(int x) makeIncBy(int x)
    {
      return int lambda(int y)
      {
        return x + y ;
      }
    }
    b = makeIncBy(1) ; // returns a procedure that adds 1 to its argument
    b(2) ; // returns 3



In this example, makeIncBy is a procedure that takes an integer argument (x) and returns a procedure which is defined inside makeIncBy. The procedure it returns (which has no name) takes an integer argument (y) and returns an integer (the sum of x and y). The problem is, when makeIncBy returns, its local variable (x), which traditionally gets allocated on the call stack, would normally no longer exist, yet the lambda function refers to that variable. If x is allocated on the stack then when b(2) is called (which invokes the lambda function, which refers to x, defined outside the lambda), then it seems as though the attempt to reference "x" would yield undefined behavior (a euphemism for saying that some random memory location would be accessed, leading to tragedy). This is the funarg problem, and it has at least two solutions: Disallow free variables or use closures. A closure is a "scope" which is associated with a function, which is used to dynamically bind a variable. Resolving (i.e. binding) a free variable thus entails searching closures for the nearest one which maps a name to a value. This is "real" dynamic binding for variables, and with it comes great power, especially when you remember that in languages with closures, variables themselves can refer to functions. So this technique allows you to write functions that create functions from other functions.

As an alternative to using a call stack, the system can use continuation passing style (CPS) which allows for continuations and coroutines. At first glance, CPS seems like an extraordinarily overwrought notion and excruciatingly difficult to use to write programs. In fact, writing code in CPS style would lead to early insanity. But its purpose is not to be used by humans for authoring code, but as a way to translate code automatically, behind the scenes, to facilitate continuations. It is useful at this stage to note that a "continuation" includes a function and its arguments, and the caller will not know the "signature" of the function until run-time. This contrasts sharply with polymorphism as present in C++ where the function signature is static. This discussion rapidly deviates on a tangential topic so I will leave the rest for another post. Meanwhile, suffice it to say that CPS solves useful problems in interesting ways and the dynamic binding problem solved by closures applies there also.

It might seem like closures and CPS provide solutions in search of arcane problems, especially in the current context of deciding how to implement the ability to bind names in a scripting language for games. After all, since the goal is to make a scripting language that non-technical people can use to author games, then esoteric features like anonymous procedures and concurrent coroutines probably lie well outside the abilities of the intended audience. Perhaps. We should explore some possibilities and see where that leads, but we might eventually return to these more powerful techniques.

There is a form of polymorphism which effectively combines the dynamic binding schemes discussed so far. The examples above include dynamic binding of two kinds of names: functions and variables. C++ and its ilk use "dynamic dispatch", i.e. dynamic binding of functions. Prototype-based languages can use "parent" pointers and dynamic programming languages use "closures" to represent scopes or environments which map names to values, which can be used to bind variables dynamically. The object-oriented programming paradigm refers to "passing messages between objects"; it does not refer to "calling procedures". C++ implements message passing (or a limited form of it) as a procedure call, and the compiler checks to make sure the object being passed the message actually has a method to handle that message. But in dynamic prototype-based languages like JavaScript, you can pass any message to any object, regardless of whether the object has a method that handles that message. This allows a form of type system called duck typing. This provides an extremely flexible form of polymorphism.

Dynamic Binding Options:

What form of dynamic binding should the game engine support, and how should it be implemented? It is possible that "how" has multiple answers depending on the what. As "Io" shows, it is possible to unify all of them, but Io values simplicity and power over performance, whereas the FIEA game engine values simplicity and performance over power. So instead of deciding which of the features above we want, consider the list above to be a menu without prices. Our decision of which items to choose depends on their price. So let us determine the prices: What implementations can we concoct, and which seem both simple and computationally efficient?

The system needs a name-to-value map, which we will call the Environment.

Hierarchical environment

Former version of the FIEA game engine implemented the Environment as a single flat dictionary where names have 2 parts: context and instance. This presented the appearance of a hierarchical namespace but with constant-time lookup on any symbol.

Consider defining a struct called Scope which maps names to values. In principle, each scope contains other scope objects. This would form a tree of scopes, or a hierarchical namespace with an actually hierarchical structure in memory. Traversing this namespace usually happens from the leave nodes upward so instead of (or in addition to) each scope containing children, each child should contain a reference to its parent. Each object (Entity or Action) would contain a pointer to its local scope. As mentioned above, name lookups would require O(N) time where N is the depth of the child.

We can flatten the tree traversal using something akin to "Lambda lifting": Convert each variable that resides in a parent scope into a local variable in the scope associated with the Action. In other words, all non-local attributes become local to the scope of an object.

In a similar spirit, any non-local variables accessed by an Action can be bound using tree traversal, then the results can be cached into the Action's local scope for future accessed. (This resembles a copy-on-write policy, although its goals are different.) But since all the referents are known at compile-time there is no obvious reason to delay this operation until run-time.

Imagine that each Scope does not know its children, and that children only know their parents. Action invocations could include a pointer to another scope object and name resolution would require looking at the passed-in scope (and its parents) prior to using the scope statically associated with the Action. In this way, any variable (free, bound or local) could be redefined by the caller, upon invocation. I will call this notion "scope tree surgery" since it effectively replaces one branch of the tree with another branch from another part of the tree. (The analogy is somewhat strained because the result of the surgery is to have one branch refer to another; the attached limb is never removed from the other part of the tree.) This technique provides immense power, including and exceeding even the most esoteric forms of dynamic variable binding techniques discussed above, including argement passing and closures. For example, even Lisp does not provide a mechanism for rebinding local variables inside a function. How could this seemingly dangerous and atypical property be useful? In the context of a game engine, each variable (including locals) stores properties which control behaviors, and we aim to make all behaviors tunable via script. Phrased differently, using this technique, all variables in an Action act as though they are arguments, all of which have default values. The caller can choose to change those variables or leave them alone. From the caller perspective, there is no distinction between function arguments, locals, and free variables.

Unfortunately, in terms of performance, this raises the cost of name binding considerably. It includes not only the usual cost of lookup associated with a more conventional scope tree used with stack-based or CPS function calling, but also this notion of searching the "passed in" scope prior to the usual scope (and/or closure). That is, usually, function arguments are passed by position, e.g. on a stack, and the callee unpacks those arguments and places the passed-in values into the memory locations of what are tantamount to temporary local variables. That process involves no lookups, just linear scan-and-unpack which can be written and automated prior to run-time.

Regardless of whether dynamic binding requires a tree traversal (i.e. even if we use some flavor of tree flattening), and even if the local scope uses a hash table, name binding will require some sort of lookup that costs more than a pointer dereference, and as it is, even pointer dereferences get scornful looks from optimization czars. Former versions of the FIEA game engine avoided this cost by caching all prescribed parameters (i.e. those known at compile-time) into member variables of the Action class. And (as mentined above) most programming languages make this "lookup" trivial by making it a hard-coded (thought automatically generated) register-to-stack (and vice versa) transfer, which relies on function arguments being fixed in size and order, at compile time. We want to preserve the property that Action variables are assigned by name, not by position, but we want the speed of positional parameters and structures (which both amount to lookups by offset).

Perhaps we can have our cake and eat it too, by prescribing parameter order within scopes. Instead of caching parameter addresses into member variables (as was done in the past), each scope would contain an array of pointers to Datum, and at compile-time the ordering of those would be known. When an Action gets invoked, a "replacement" scope would be populated but its ordering would remain the same as it was at compile-time. This trick would only help for prescribed parameters and it has some fragility issues but most of that noise gets hidden from the end user and could potentially be automated with some clever tools.

There is also a concept of "late static binding" which essentially amounts to rebinding a static member variable used in a method defined in a base object which a derived object inherits. The notion of "static" member variables only applies in languages which separate classes from instances, which is not the case in prototype-based languages.

My analysis here is somewhat bottom-up which conflicts with the usual edict of designing top-down. Top-down design asks first "what do you want to accomplish?" then asks "how do you accomplish that?", so application designers get exacltly what they want. But bottom-up allows for more performance because it first asks "what simple pieces can we make that run fast?" and second asks "what can we make from these pieces?". In truth we need to alternate between these 2 strategies until we find a way to make what we want using pieces we can afford.

I have left several loose ends in this post; what about duck typing, CPS, lambda functions and closures? Do we want them? At what price? Those are open for discussion.

Meanwhile, this post has outlined a way to inject traditional procedure calls into the FIEA game engine, without adding to the cost of accessing prescribed parameters. The current topic of discussion becomes, does this seem sane?