Tuesday, September 9, 2008

Event system filter mechanism refinement

Event systems aim to mediate communcation between software components while facilitating decoupling those components. Event systems avoid message senders from needing to know message recipients. They establish one-to-many relationships. Event systems tend to build upon the design patterns of "observer" and "mediator". The book "Distributed Event-Based Systems" (Muehl et al.) provides a taxonomy of cooperation models which illustrates the relationships between message senders and receivers, based on who initiates the transaction (consumer or producer) and whether the initial message recipient (addressee) is direct or indirect. Event systems are categorized as systems where the producer initiates communication and the recipient is indirect (i.e. sender does not know or care about the recipient).

Event systems work well when a single message can be used by multiple unrelated subsystems. Consider a simulation of a bunch of moving billiard balls which can collide, rebound from walls, and fall into pockets. An audio subsystem could subscribe to collision events to play a "click" sound when balls collide. Each ball can respond to collision events to apply an impulse to rebound from other balls and from walls. A gameplay system can monitor when balls fall into pockets to tally score and to remove balls from play.

Unfortunately, simple forms of event systems can distribute events to recipients who do not need every message of a given type. The problem of deciding what messages go to what recipients is called “filtering” and that’s the area where a simple event system needs refinement.

Reconsider the example problem. Each subsystem (physics, sound, gameplay) receives all collision events, but sometimes particular events are irrelevant to certain subsystems: A collision between a ball and a pocket should emit no sound. A collision between a ball and another ball should not cause any scoring change or remove balls from play. Also, all balls could receive all collision events between all ball pairs even if the ball is not involved in the pair. Such spurious, irrelevant events can be rejected by recipients after delivery but "post-delivery filtering" introduces avoidable overhead. If the number of billiard balls is large, the overhead can be larger than the legitimate processing time.

How can we augment an event system so that it has enough information to filter which recipients get which messages, in such a way that the event system remains generic and preserves desirable properties, e.g. that the sender does not know nor care about the receipients?

Several filter mechanisms have been implemented for event systems, including type-based, channels, subject-based and content-based.

The "observer" design patterns uses a subscription model tantamount to type-based filtering, where subscribers implement an interface which allows them to be notified of messages and they subscribe to the publisher via a subscription method. Each publisher object keeps a list of subscribers and notifies all subscribers when the appropriate event occurs. An event system built upon the "observer" and "mediator" design patterns therefore effectively uses type-based filtering. As revealed in the example above, type-based filtering still delivers all collision messages to all subsystems who care about particular subsets of collision messages, so we want to refine the system to use a better filtration mechanism.

Channels encompass sets of message types, and subscribers can subscribe to a channel, so they receive any message in the channel. Such a mechanism actually worsens the problem at hand in that we want to narrow the flow of communication, not widen it. Furthermore, the type-based system can already facilitate channels (although somewhat awkwardly) by subscribing a single object to multiple publishers. If it became useful, introducing the notion of channels would require only cosmetic changes to the event system; it would simply wrap the existing subscription interface with a sugar coating. At best, channels solve a different problem -- one not currently under consideration.

Subject-based filtering entails describing events and filters using keywords and filtration entails pattern matching. There are many problems with string-based approaches, including poor performance, ambiguity and rigidity. Read up on the subject if you want to know more, but for the immediate conversation I summarily dismiss this option.

Content-based filtering entails operating on the contents of the message itself. To be generic, the event messages have to be structured in a way that the event system can query, e.g. using name/value pairs. Furthermore, the filter itself has to be in a form the event system mediator can interpret, e.g. an expression language. This technique is quite powerful but also complicated and it's not clear that implementing a powerful, general mechanism like this provides any performance advantage over simply passing all recipients all messages of a given type, then letting the recipients decide whether each particular message is pertinent or not. (In the context where content-based filtering is employed -- distributed systems -- the message data and filtration expressions often reside on a different machine than the recipients, in which case the cost of distributing the messages is potentially much higher than for the present example, which runs within a single machine.)

Consider an extremely limited form of content-based filtering where each event message provides an integer (perhaps with arbitrary length) which we could call its "digest". The meaning of this value is unknown to the event mediator; only the publisher and subscribers assign meaning to this value. For example, it could be a unique identifier (e.g. each billiard ball could have an id), a bit field indicating various information (e.g. subtypes), a combination of these -- whatever. Meanwhile, each subscriber would provide a pair of integers: a mask and a value. When processing an event for delivery, the event mediator would apply the mask to the digest using bitwise-and, then compare the results to the value provided by the subscriber. Using this mechanism, the subscriber can state whether specific bits must be high or low, or whether some bits are irrelevant. Furthermore, the "identifier" and "subtype bit field" approaches can be merged together because, e.g., the upper N bits of the "digest" could correspond to "id" and the lower M bits could be the "subtype bit field". The important idea here is that the event mediator does not know or care what the bits mean -- only how to apply them as a filter. This system constitues an extremely simplified form of content-based filtering because it reduces the set of name/value pairs to bits (where names correspond to bit index ranges and values correspond to the bit values at those indices), and it reduces the filtration expression "language" to a single integer with a bitwise-and operator.

Sounds fast and simple -- elegant. Will it work? Is it sufficiently powerful to justify the addition?

[We'll discuss this at lunch on Friday, September 19, at Crispers in Winter Park Village.]

Update 2008sep19: Summary of lunch discussion:

Gabe, Dan, Aaron, Paul, Nathan and I attended. Everybody contributed to the discussion. Paul and Nathan reported their experiences with other event systems.

Paul described a channel-based system that he was not happy with, for reasons resembling those I pointed out in the blog -- insufficient filtering. The system he uses also has other issues, like massive switch-case filters within subscribers -- which misses one of the major points of using type-based filtering and one of the major benefits of event systems. He described an augmentation he implemented that took some inspiration from C# "delegates". His scheme uses C++ method pointers and uses the special and arcane .* and ->* method pointer invocation syntax. I have often thought about situations where that could be used but rarely (never?) seen it exploited, so this might be a good example of it. Paul also described a hypothetical system which involved layers of subscribers, each of which filter and selectively redistribute messages.

Nathan said his team evaluated Insomniac's Event/Delegate system but decided to write their own which uses type-based filtering, because their needs are simple. Their system suffers from the same problem described in the original post, that multiple subscriber objects receive messages of the proper type but not intended for that particular object. This situation apparently arises when the event system is used to postpone delivery -- i.e. where the publisher knows the subscriber. So to some extent one could claim this was out-of-domain usage for an event system, which is supposedly intended to decouple depencencies between systems, but in practice we all also use event systems to exploit the asynchronous nature, even with tight coupling between publisher and subscriber.

3 comments:

Dr. Michael J. Gourlay said...
This post has been removed by the author.
pdutka said...

So basically, we have a situation where there is a generic event type, like physics collision. If a listener were to subscribe to the "Physics Collision" event, then it will receive all types of collisions. The goal is to filter the event type before it is sent to the listener.

One way would be to have a filtered collection. A ball vs. ball, ball vs. pocket, etc., would all trigger a collision event, passing along an ID (say a bit field) for that collision type. The event mediator would then look through all of the listeners subscribed to the "Physics Collision" event, and only notify the listener if its ID filter marks that event type as desirable. While better, this still seams to suffer from the "for each matching event type, pass along" issue as occurs with channel based systems.

What if instead the inheritance model were used to filter events? With this approach, we could have varying degrees of filtration while maintaining the direct notification of publisher to describer that is the heart of an event system.

If we have "Physics Collisions" as an event, and each subsequent event type inherit from it, we can provide both the filtration we need, along with event specialization (i.e. provide extra data for that type of collision).

For example, the "Ball vs. Ball" and "Ball vs. Pocket" are both unique events that inherit from "Physics Collision", then each can be subscribed to individually. So the audio system can subscribe to "Ball vs. Ball" and only receive that event. Because that event inherits from a parent event, the parent event is also triggered, thereby notifying any subscribers to it has, such as the animation system.

Dr. Michael J. Gourlay said...

That's a good use case, and I believe the bitmask flavor of content-based filtering covers it. The event message can have bitfields for the various collision subcategories. Each subscriber can use a mask to identify which subcategories it cares about, and set its value accordingly. If the subscriber wishes to be more discriminating, then it can set more ones in the mask. If the subscriber wishes to receive a broader spectrum of messages then it can set more zeros in the mask.