Monday, November 3, 2008

Prototype-based programming language with fast dynamic binding

The game engine FIEA graduate students write each year (as part of their second programming class) uses a prototype-based programming paradigm that shares principles with object-oriented programming (OOP) but the limited version previously assigned lacks several desirable features, partly due to performance considerations. Prototype-based programming readily provides inheritance and encapsulation with no significant performance penalty but polymorphism and even common procedure calls can be expensive due to name resolution. As mentioned in Steve Yegge's article, "The Universal Design Pattern", prototype-based programming comes with some costs: performance and early verification. I want to explore the possibilities of creating a prototype-based programming language, intended for use in a game engine, that addresses the performance issue.

The FIEA game engine implements a very limited prototype-based programming language by using the prototype design pattern and composition to associate attributes with objects (called Entities). Attributes can be either prescribed at compile time or added at run time. The system uses factories to instantiate new objects. Each object implements a "clone" method (a virtual constructor) which duplicates the entity, its attributes and behaviors. Behaviors are something like procedures, implemented using the "strategy" design pattern (and we call these behaviors "Actions"), and objects use composition to contain lists of these behaviors. By cloning objects and modifying them, this simple system provides a simple form of inheritance. (By restricting access to members, we can encapsulate object members, although in some sense this provides no additional functionality, so the assignments omit this property.) The system is simple and somewhat powerful for its simplicity in that it allows game entities to have script-driven properties and behaviors.

The FIEA game engine contains a list of these Entities, and the simulation consists of processing each Entity for each frame by executing a special Action (dubbed the "sim" action). The game engine has a few other features but those are topics for future posts. Essentially (at least for the purpose of this post), the FIEA game engine consists of an ordered list of entities that have attributes and actions, and the "sim" action of each entity runs every frame.

[Note: In a subsequent post, I revisit binding and conclude that we really do want static binding after all. Furthermore, the presentation below suggests (without outright claiming so) that only dynamic binding allows closures, and certain other high-power language features like CPS. That is not the case; such features can be present with static binding. Despite these issues, the expositino below has considerably value and contains no explicit errors. So please read it, but beware that it suggests ideas which might not hold up to scrutiny.]

Past incarnations of this system use early (or static) binding (i.e. names are resolved when parsing configuration files and populating the system), so behavior execution runs relatively quickly, but the configuration language (I hesitate to call it a scripting language) lacks certain rudimentary features such as traditional procedure calls. The system lacks late (or dynamic) binding, i.e. there is no scope tree traversal to resolve names within the runtime environment, so this system lacks polymorphic "method calls" and closures. We could readily add dynamic binding but obvious general and simple solutions would entail run-time name resolution, which could be slow. (The system also has a scriptable event system which allows messages to be passed between Entities, and this provides method invocation somewhat akin to a limited form of how the programming language "Io" passes messages, but those invocations have excessive overhead.) The problem at hand entails adding more power (e.g. procedure calls) but without the performance costs associated with dynamic binding via scope traversal.

Actions in past incarnations of the FIEA game engine differ from traditional procedures in most other languages in that actions do not have positional parameters and cannot be invoked from another action; all parameters are accessed by name, and all actions are called by the core engine (not from script). (Actually, actions inside a "reaction" can be called from script, but ignore that for this post.) It might help to think of the former Entity-and-Action system as being closer to declarative than to imperative, inasmuch as the "language" used to define the system consisted of populating a tree with properties. The description language happens to be in XML and could probably just as well be in JSON. The motivation behind using only named properties and only sequences of predefined Actions has roots in the design goal of allowing a non-technical game developer (e.g. an artist or a producer) to assign attributes and behaviors via a drag-and-drop or other visual content creation user interface where the author does not need to know the syntax of a programming language. (Again, the notion of drag-and-drop programming is a topic for another post.) We therefore wish to preserve this trait of not requiring positional parameters in behavior configuration.

More common and fully-featured programming languages provide the ability to decompose problems into small pieces and solve them separately without the separation introducing undue overhead. In programming terms, it comes down to being able to make "local" function calls cheaply. ("Remote" function calls, which involve synchronizing execution across multiple cores, or accessing memory distributed across multiple cores, is a somewhat separate problem so let's leave that aside for now, except to mention that the existing event system is well-suited to that problem.) Introducing procedure calls into the FIEA game engine (polymorphic or not, late binding or early) should run fast, retain existing features and have a solution that can be implemented in a short amount of time (otherwise FIEA students would not have enough time to write it in a single semester).

Forms of dynamic binding and what they buy you:

The term "dynamic binding" has a variety of meanings depending on context so though I claimed that "the engine should have dynamic binding", the problem statement still requires quite a bit more refinement. Let's review various forms of dynamic binding and features they provide.

Procedure calls bind function arguments in the sense that the formal parameters are mapped to the actual parameters (or arguments) passed at the procedure call. This happens by placing values (in the scope of the caller) into registers or memory that are used locally within the procedure code (in the scope of the callee). This binding could be called static in the sense that the variables passed by the caller are determined at compile time. The only sense in which this binding is "dynamic" is in the limited (perhaps artificial, definitely not conventional) sense that semantically, the arguments refer to different variables at run time. And past FIEA engines lack even this rudimentary form of binding.

If a function can call itself or if a function can be called multiple times "simultaneously" then we say the language supports recursion. This implies that the same formal arguments can be bound to multiple sets of actual arguments (i.e. memory locations) simultaneously. If the mechanism used to pass arguments simply involved storing values into registers or fixed memory locations then a second simultaneous call to the same procedure would overwrite and obliterate the values in those locations, defeating any attempt to call recursively. To support recursion, usually procedure argument values are pushed onto a call stack at the caller side and extracted from the stack at the callee side. Just as with non-recursive function calls, this form of parameter binding does not conventionally get listed as dynamic binding.

Polymorphism via virtual function calls entails dynamic dispatch, i.e. which method gets called (i.e. receives the message passed) depends on the object on which that method is called (i.e. to which the message is passed). In this form of dynamic binding, the compiler cannot resolve which function to call so the decision is delayed until run time. In practice this happens by dereferencing a table of function pointers, and the object holds the address of the table -- so it actually involves a pair of pointer dereferences. Actually it usually involves three pointer dereferences since the object itself is usually referenced by a pointer. (If you had the object itself instead of a pointer to the object, then you wouldn't need polymorphism because you could resolve the method at compile-time.) But despite the fact that the function name gets bound at run time, the arguments are still bound at compile time; in this form of polymorphism, the function call always has the same "signature" or "form" i.e. the formal parameters are always the same, regardless of the object type.

The forms of dynamic binding discussed above apply to C++, Java and C#, but dynamic languages such as Lisp, Io and JavaScript include additional forms of dynamic binding. These include the dynamic lookup of so-called "free variables", the use of "continuations" instead of a call stack, and the use of "duck typing" to decide which methods shall handle messages passed to an object.

Prototype-based languages offer a form of inheritance where each object has a "parent" object used as a prototype to create the new object. The clone operation has at least two choices: Either have the child object hold a pointer to the parent or have the child duplicate all members (including all code and all data) of the parent. The former conserves memory, so seems desirable. If using that method then name resolution (i.e. binding) involves traversing parent pointers and looking for the attributes that match the name of the sought attribute. (Please read Steve Yegge's article for more detail.) Lookup therefore takes at least O(N) time where N is the number of ancestors the child has.

Some programming languages treat functions as first-class objects, meaning (among other things) they can be passed as arguments and returned as any other value. These so-called lambda functions are anonymous procedures which can be defined within the scope of another procedure. They can refer to at least 3 kinds of variables: local, arguments and "free". Free variables are those which are defined outside the context of the procedure body, e.g. inside the procedure containing the definition of the lambda function. (Formal language theory makes a distinction between bound and free variables although both require some form of binding, and in fact binding "free variables" entails a lookup which is more conventionally called "dynamic binding". In this nomenclature, "bound" variables refer to function arguments, which in the current context is admittedly confusing.) With lambda functions comes the problem (called the funarg problem) of deciding how to bind variables which are defined locally within the enclosing scope, after that scope as ostensibly disappeared. An example (in pseudocode) will help explain:


    int procedure(int x) makeIncBy(int x)
    {
      return int lambda(int y)
      {
        return x + y ;
      }
    }
    b = makeIncBy(1) ; // returns a procedure that adds 1 to its argument
    b(2) ; // returns 3



In this example, makeIncBy is a procedure that takes an integer argument (x) and returns a procedure which is defined inside makeIncBy. The procedure it returns (which has no name) takes an integer argument (y) and returns an integer (the sum of x and y). The problem is, when makeIncBy returns, its local variable (x), which traditionally gets allocated on the call stack, would normally no longer exist, yet the lambda function refers to that variable. If x is allocated on the stack then when b(2) is called (which invokes the lambda function, which refers to x, defined outside the lambda), then it seems as though the attempt to reference "x" would yield undefined behavior (a euphemism for saying that some random memory location would be accessed, leading to tragedy). This is the funarg problem, and it has at least two solutions: Disallow free variables or use closures. A closure is a "scope" which is associated with a function, which is used to dynamically bind a variable. Resolving (i.e. binding) a free variable thus entails searching closures for the nearest one which maps a name to a value. This is "real" dynamic binding for variables, and with it comes great power, especially when you remember that in languages with closures, variables themselves can refer to functions. So this technique allows you to write functions that create functions from other functions.

As an alternative to using a call stack, the system can use continuation passing style (CPS) which allows for continuations and coroutines. At first glance, CPS seems like an extraordinarily overwrought notion and excruciatingly difficult to use to write programs. In fact, writing code in CPS style would lead to early insanity. But its purpose is not to be used by humans for authoring code, but as a way to translate code automatically, behind the scenes, to facilitate continuations. It is useful at this stage to note that a "continuation" includes a function and its arguments, and the caller will not know the "signature" of the function until run-time. This contrasts sharply with polymorphism as present in C++ where the function signature is static. This discussion rapidly deviates on a tangential topic so I will leave the rest for another post. Meanwhile, suffice it to say that CPS solves useful problems in interesting ways and the dynamic binding problem solved by closures applies there also.

It might seem like closures and CPS provide solutions in search of arcane problems, especially in the current context of deciding how to implement the ability to bind names in a scripting language for games. After all, since the goal is to make a scripting language that non-technical people can use to author games, then esoteric features like anonymous procedures and concurrent coroutines probably lie well outside the abilities of the intended audience. Perhaps. We should explore some possibilities and see where that leads, but we might eventually return to these more powerful techniques.

There is a form of polymorphism which effectively combines the dynamic binding schemes discussed so far. The examples above include dynamic binding of two kinds of names: functions and variables. C++ and its ilk use "dynamic dispatch", i.e. dynamic binding of functions. Prototype-based languages can use "parent" pointers and dynamic programming languages use "closures" to represent scopes or environments which map names to values, which can be used to bind variables dynamically. The object-oriented programming paradigm refers to "passing messages between objects"; it does not refer to "calling procedures". C++ implements message passing (or a limited form of it) as a procedure call, and the compiler checks to make sure the object being passed the message actually has a method to handle that message. But in dynamic prototype-based languages like JavaScript, you can pass any message to any object, regardless of whether the object has a method that handles that message. This allows a form of type system called duck typing. This provides an extremely flexible form of polymorphism.

Dynamic Binding Options:

What form of dynamic binding should the game engine support, and how should it be implemented? It is possible that "how" has multiple answers depending on the what. As "Io" shows, it is possible to unify all of them, but Io values simplicity and power over performance, whereas the FIEA game engine values simplicity and performance over power. So instead of deciding which of the features above we want, consider the list above to be a menu without prices. Our decision of which items to choose depends on their price. So let us determine the prices: What implementations can we concoct, and which seem both simple and computationally efficient?

The system needs a name-to-value map, which we will call the Environment.

Hierarchical environment

Former version of the FIEA game engine implemented the Environment as a single flat dictionary where names have 2 parts: context and instance. This presented the appearance of a hierarchical namespace but with constant-time lookup on any symbol.

Consider defining a struct called Scope which maps names to values. In principle, each scope contains other scope objects. This would form a tree of scopes, or a hierarchical namespace with an actually hierarchical structure in memory. Traversing this namespace usually happens from the leave nodes upward so instead of (or in addition to) each scope containing children, each child should contain a reference to its parent. Each object (Entity or Action) would contain a pointer to its local scope. As mentioned above, name lookups would require O(N) time where N is the depth of the child.

We can flatten the tree traversal using something akin to "Lambda lifting": Convert each variable that resides in a parent scope into a local variable in the scope associated with the Action. In other words, all non-local attributes become local to the scope of an object.

In a similar spirit, any non-local variables accessed by an Action can be bound using tree traversal, then the results can be cached into the Action's local scope for future accessed. (This resembles a copy-on-write policy, although its goals are different.) But since all the referents are known at compile-time there is no obvious reason to delay this operation until run-time.

Imagine that each Scope does not know its children, and that children only know their parents. Action invocations could include a pointer to another scope object and name resolution would require looking at the passed-in scope (and its parents) prior to using the scope statically associated with the Action. In this way, any variable (free, bound or local) could be redefined by the caller, upon invocation. I will call this notion "scope tree surgery" since it effectively replaces one branch of the tree with another branch from another part of the tree. (The analogy is somewhat strained because the result of the surgery is to have one branch refer to another; the attached limb is never removed from the other part of the tree.) This technique provides immense power, including and exceeding even the most esoteric forms of dynamic variable binding techniques discussed above, including argement passing and closures. For example, even Lisp does not provide a mechanism for rebinding local variables inside a function. How could this seemingly dangerous and atypical property be useful? In the context of a game engine, each variable (including locals) stores properties which control behaviors, and we aim to make all behaviors tunable via script. Phrased differently, using this technique, all variables in an Action act as though they are arguments, all of which have default values. The caller can choose to change those variables or leave them alone. From the caller perspective, there is no distinction between function arguments, locals, and free variables.

Unfortunately, in terms of performance, this raises the cost of name binding considerably. It includes not only the usual cost of lookup associated with a more conventional scope tree used with stack-based or CPS function calling, but also this notion of searching the "passed in" scope prior to the usual scope (and/or closure). That is, usually, function arguments are passed by position, e.g. on a stack, and the callee unpacks those arguments and places the passed-in values into the memory locations of what are tantamount to temporary local variables. That process involves no lookups, just linear scan-and-unpack which can be written and automated prior to run-time.

Regardless of whether dynamic binding requires a tree traversal (i.e. even if we use some flavor of tree flattening), and even if the local scope uses a hash table, name binding will require some sort of lookup that costs more than a pointer dereference, and as it is, even pointer dereferences get scornful looks from optimization czars. Former versions of the FIEA game engine avoided this cost by caching all prescribed parameters (i.e. those known at compile-time) into member variables of the Action class. And (as mentined above) most programming languages make this "lookup" trivial by making it a hard-coded (thought automatically generated) register-to-stack (and vice versa) transfer, which relies on function arguments being fixed in size and order, at compile time. We want to preserve the property that Action variables are assigned by name, not by position, but we want the speed of positional parameters and structures (which both amount to lookups by offset).

Perhaps we can have our cake and eat it too, by prescribing parameter order within scopes. Instead of caching parameter addresses into member variables (as was done in the past), each scope would contain an array of pointers to Datum, and at compile-time the ordering of those would be known. When an Action gets invoked, a "replacement" scope would be populated but its ordering would remain the same as it was at compile-time. This trick would only help for prescribed parameters and it has some fragility issues but most of that noise gets hidden from the end user and could potentially be automated with some clever tools.

There is also a concept of "late static binding" which essentially amounts to rebinding a static member variable used in a method defined in a base object which a derived object inherits. The notion of "static" member variables only applies in languages which separate classes from instances, which is not the case in prototype-based languages.

My analysis here is somewhat bottom-up which conflicts with the usual edict of designing top-down. Top-down design asks first "what do you want to accomplish?" then asks "how do you accomplish that?", so application designers get exacltly what they want. But bottom-up allows for more performance because it first asks "what simple pieces can we make that run fast?" and second asks "what can we make from these pieces?". In truth we need to alternate between these 2 strategies until we find a way to make what we want using pieces we can afford.

I have left several loose ends in this post; what about duck typing, CPS, lambda functions and closures? Do we want them? At what price? Those are open for discussion.

Meanwhile, this post has outlined a way to inject traditional procedure calls into the FIEA game engine, without adding to the cost of accessing prescribed parameters. The current topic of discussion becomes, does this seem sane?

4 comments:

Nathan said...

What are your thoughts on "Game engine as operating system"? It seems to me like there's a school of thought in SW Eng that any sufficiently complicated piece of software will ultimately become so general as to become more of a platform or OS than application. Web browsers are the canonical example.

Is that what you're getting at with "Game engine as operating system"?

Dr. Michael J. Gourlay said...

I think "emacs" is another example of the philosophy you refer to.

But I have something else in mind.

In one sense, at their kernel (pun intended) operating systems actually do very few (yet very important) things, and it is this limited sense that I had in mind when I wrote that as a potential future topic.

So, no, I wasn't implying that game engines should or tend to grow with creeping featurism into all-encompasing, bloated pieces of software.

But since all of this is off the current topic, I won't here go into what I did actually mean.

I will, however, take your question as a vote for the topic of the next blog post and/or lunch discussion.

Dr. Michael J. Gourlay said...

I've recently had another look at Lua and I'm impressed and encouraged. It's a tiny little language that shares much in common with the ideas discussed in this post (prototype-based, coroutines, lambda functions). It also uses a register-based virtual machine (though it used to use a stack-based VM).

As an aside, former FIEA programming students will remember the reverse-polish-notation (RPN) lecture and some remember having to implement ActionExpr which evaluated an RPN expression. That shares much in common with a stack-based virtual machine.

This relates to the current discussion in that although the FIEA game engine configuration language (in its flavor of XML or Nathan's version, ISL) provides a way to define new functions (using ActionList) but provided no way to invoke them from script. (The engine itself always explicitly invoked them.) (Well, actually, reactions allow you to invoke them, but they have limitations and I'll have to return to that in a later post, but I'll mention again that this resembles how Io implements method passing.)

Point being that a natural way to introduce local synchronous function calls would be via the expression evaluator.

It just needs a syntax that is not disgusting.

Dr. Michael J. Gourlay said...

I've been thinking about this for a while and although I have oscillated between 2 possibilities:

() Procedure calls occur within expression evaluation.
() Procedure calls occur only as reactions to events.

In terms of the FIEA game engine as it existed in former incarnations, the former implies extending ActionExpr, and the latter requires no extension; ActionEvent already supplies what we need, and in fact we've already been doing that for a while -- just usually not directed to a specific entity. So actually, to operate efficiently, the event system would need the filtration mechanisms discussed in the earlier post.

I like both of these ideas, for different reasons, but I currently lean toward the latter (reuse Reaction) because it doesn't require additional effort by the students -- only additional insight.

The latter approach also gains some validity from the Io programming language, in which all "function calls" are effectively handled as passing messages via an event system.

One significant difference between the current Reaction system and function calls (as opposed to procedure calls) is that function calls return a value, where as currently Reactions do not. But that could be introduced without much hackery. The solution entails the very change to how the Registry is implemented that I described in the dynamic binding post.

I now regret mixing those topica together. I now feel that the Registry revisitation should have gone into a separate post that preceded the dynamic binding post.

Live and learn.