Causality and Event Processing

David Luckham and Roy Schulte

28 January 2023

Causality between events has been discussed throughout history. It has been confused with many other concepts, including correlation, order of occurrence, and coincidence. The concept of cause also comes in many types and flavors; necessary cause, sufficient cause, contributory cause, and probabilistic cause are popular examples.

The problem of how to determine causality between events arises immediately with any of the different types of cause and the kinds of events being discussed. This leads to the need to model the environment in which events are happening in order to define their causal relationships. We call this causal modelling.

Bertrand Russell illustrates the need for causal models brilliantly as follows:

“The average married couple has sexual intercourse about two thousand times during their married life and produces 2.1 children. If one were to slam a door two thousand times and hear two bangs, would one conclude that slamming the door produces the bang?”

What Russell has done here is to change the causal model within one breath! The first is the model of human reproduction and the second is the model of slamming doors.

The nature of causal models, and how to define them, has been a subject of perennial discussion and of recent books (see, for example, Judea Pearl’s “The Book of Why” or this video).

Too often we find ourselves doing event processing without formally specifying a causal model. We simply assume, perhaps subconsciously, that we know what causes what. But in fact, if we were to think about it, we would have to admit that this assumption was incorrect.

Causality between events depends on the properties of the environment in which those events happen. In many environments those properties are not obvious. We must define the causal models that specify how events cause one another in the environments in which we want to do event processing.

In Complex Event Processing (CEP) we made certain decisions. The kind of causality we chose to deal with is called computational causality between events. It is defined as follows:

If an event, A, had to happen in order for an event B to happen, then A is a cause of B.

This is a very strong kind of causality, and it models the way computer programs operate (hence its name). Extending CEP to deal with other types of causality, such as probabilistic cause, is considered a task for the future. However, we still need a causal model for any specific environment.

In any environment computational causality obeys some simple properties (or axioms).

If A is a cause of B and B is a cause of C then A is a cause of C.
If A is a cause of B then B cannot occur earlier than A with respect to any clock in the environment.
If A is a cause of B at one time, that does not imply that A will necessarily be a cause of B at another time.

In most environments causality is discrete, which means that an event C will have events that are its immediate causes. That is, A is an immediate cause of C if A is a cause of C and there is no event B that is caused by A and also causes C. Also, if events A and B occur but neither is a cause of the other, we say that they are independent.
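The properties above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of CEP itself: the events A to D, the recorded_causes mapping, the single-clock timestamps, and all function names are hypothetical.

```python
# Hypothetical record of which events were observed to cause which.
# Each event maps to the set of events recorded as its causes; some
# recorded edges may be redundant (A -> C is implied by A -> B -> C).
recorded_causes = {
    "A": set(),
    "B": {"A"},
    "C": {"A", "B"},
    "D": set(),
}

def all_causes(event, causes):
    """Every direct or indirect cause of `event` (the transitivity property)."""
    found = set()
    stack = list(causes[event])
    while stack:
        e = stack.pop()
        if e not in found:
            found.add(e)
            stack.extend(causes[e])
    return found

def immediate_causes(event, causes):
    """Causes of `event` that do not act through some intermediate cause."""
    ancestors = all_causes(event, causes)
    return {
        a for a in ancestors
        if not any(a in all_causes(b, causes) for b in ancestors if b != a)
    }

def independent(x, y, causes):
    """True if neither event is a cause of the other."""
    return x not in all_causes(y, causes) and y not in all_causes(x, causes)

def time_order_violations(causes, timestamps):
    """(cause, effect) pairs where the effect is stamped earlier than the cause.

    Assumes a single clock; the property in the text must hold for every clock.
    """
    return sorted(
        (cause, effect)
        for effect in causes
        for cause in all_causes(effect, causes)
        if timestamps[effect] < timestamps[cause]
    )
```

With this data, all_causes("C", recorded_causes) contains both A and B, but only B is an immediate cause of C, and A and D are independent. A timestamp assignment that places C before A would be reported by time_order_violations as inconsistent with the causal model.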

If we can define the causal relationship between events explicitly, we can make lots of questions about “what happened” and “what is happening” much easier to answer. Consequently, in event processing we always try to define a formal, documented causal model that defines the environment in which the events are happening and their causal relationships.

Causal models can be used in either of two ways:

They may be used for forensic investigation, helping to explain how a real scenario unfolded after the fact. Or they may be used to implement event simulation systems that enable researchers to explore how systems will behave under various hypothetical conditions.

Forensic investigation

The analysis of two Boeing 737 MAX crashes is an example of forensic investigation. Analysts first looked at the events that directly caused the crashes, including failure of an angle-of-attack sensor, the response generated by the Maneuvering Characteristics Augmentation System (MCAS) flight control system, and the subsequent actions of pilots. This involved the use of (1) causal models defining how the aircraft flies and the effects of various flight controls, and (2) causal models specifying flight crew procedures in response to various flight situations. Analysts derived causal models by collecting data from event streams that were produced by sensors, flight recorders and other sources. The timing and relationships between the events were encoded into rules or equations that explained the immediate cause of the crash.

Over two years, analysts traced further back to earlier contributing events including aircraft and engine design decisions and factors such as the corporate culture in aircraft manufacturing, pilot training practices, and regulatory oversight decisions.

Some aspects of a 737 MAX crash, particularly those directly related to the minutes before a crash, could be represented in a formal causal model. However, the overall scenario is complex and involves probabilistic, partial and other types of causation, so an end-to-end causal event model, although useful for guiding corrective responses, is partly subjective and uncertain.

Event simulation

In circumstances where one needs to predict how a system will behave, analysts can build event simulation systems that use CEP computational causal models to model the behavior of the system.

CEP causal models can be executed to predict behavior given a set of input events. A computational causal model in CEP is specified as a set of event pattern rules in which each rule describes the relationship (mapping) between sets of input event patterns and the resulting output (complex) event patterns that are caused by the input. If actual events are fed into a rule, it will specify the resulting output events. So a causal model can be executed to provide a simulation of an event system. Input events may be taken from real-world events, or they may be hypothetical.
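As a rough sketch of what executing such a model might look like, the Python below treats each rule as a mapping from a set of required input event types to one output event type. This is a deliberate simplification: real CEP pattern languages also express timing, ordering and data constraints. The Rule class, the simulate function and the order-processing event names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str
    inputs: frozenset  # event types that must all be present for the rule to fire
    output: str        # complex event type caused by those inputs

def simulate(rules, input_events):
    """Apply the rules repeatedly until no rule produces a new event type."""
    present = set(input_events)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.inputs <= present and rule.output not in present:
                present.add(rule.output)
                changed = True
    return present

# Hypothetical causal model for a small order-handling system.
rules = [
    Rule("ship", frozenset({"OrderConfirmed", "ItemInStock"}), "ShipmentScheduled"),
    Rule("confirm", frozenset({"OrderPlaced", "PaymentReceived"}), "OrderConfirmed"),
]

# Hypothetical input events: what happens if the payment arrives, and if it does not?
with_payment = simulate(rules, {"OrderPlaced", "PaymentReceived", "ItemInStock"})
without_payment = simulate(rules, {"OrderPlaced", "ItemInStock"})
```

Here with_payment contains ShipmentScheduled, because the output of the confirm rule becomes input to the ship rule, while without_payment contains only the two input events. Comparing the two runs is the kind of what-if question a simulation is meant to answer.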

Complex events in CEP simulations typically will include genetic vectors that contain information on their lineage, i.e., a list of the identifiers of the input events that caused them. As a simulation using a causal model continues processing, complex events computed in one stage may become the input for further complex events in the next stage. The result is an event abstraction hierarchy with higher-level events derived by applying the causal model rules to events at lower levels. A more complete explanation of computational causal models is presented in Chapter 11 of David Luckham’s “The Power of Events” book.
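The lineage idea can be illustrated with a minimal sketch in which each event carries the identifiers of its immediate causes and a level in the abstraction hierarchy. The Event fields, the record and lineage helpers, and the sensor-style event names are assumptions made for this example, not a description of any particular CEP product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    id: int
    type: str
    causes: tuple = ()  # identifiers of the input events that caused this event
    level: int = 0      # 0 for raw input events, higher for derived events

def record(log, type_, inputs=()):
    """Create an event caused by `inputs`, store it in `log`, and return it."""
    level = 1 + max(e.level for e in inputs) if inputs else 0
    event = Event(len(log) + 1, type_, tuple(e.id for e in inputs), level)
    log[event.id] = event
    return event

def lineage(event, log):
    """Identifiers of every event that directly or indirectly caused `event`."""
    seen = set()
    stack = list(event.causes)
    while stack:
        eid = stack.pop()
        if eid not in seen:
            seen.add(eid)
            stack.extend(log[eid].causes)
    return seen

# Hypothetical two-stage derivation.
log = {}
s1 = record(log, "SensorReading")
s2 = record(log, "SensorReading")
alarm = record(log, "ThresholdExceeded", [s1, s2])   # stage 1 complex event
ack = record(log, "OperatorAck")
incident = record(log, "IncidentOpened", [alarm, ack])  # stage 2 complex event
```

In this example incident sits at level 2 of the hierarchy, its own vector lists only alarm and ack, and following the vectors back through the log recovers the two sensor readings as well.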

Conclusion

Causal models that are developed in the course of forensic investigation or that are developed to enable running simulations have the same underlying purpose. The models are useful because they provide guidance on what did happen (forensic investigation), or what will happen (business management) in applying possible strategies to achieve outcomes in business situations. Causal models guide us so that we can make better decisions, such as improving business processes.