Real-time intelligence and how it uses complex-event processing (CEP)
by W. Roy Schulte and David Luckham
Frequently Asked Questions
- Where do real-time intelligence and CEP fit in the field of business analytics?
- Where does real-time information come from?
- What is real-time intelligence?
- What are the business benefits of real-time intelligence?
- What does “real-time” mean in real-time intelligence?
- Does fast response time to a query imply real-time intelligence?
- How is CEP related to real-time intelligence?
- What is CEP?
- Are there different kinds of CEP?
Real-time intelligence is part of a broad movement toward increased use of analytics in business. It is directed specifically at the day-to-day and minute-to-minute operational decisions made in the course of the processes that run a business. These decisions are made in real time, in contrast to the more traditional, offline use of business intelligence and analytics to make tactical and strategic decisions. Companies that implement active, continuously running real-time intelligence systems will leverage CEP because it is the only way to extract insights from current data in an event-driven manner. Companies that understand CEP have more and better real-time intelligence than those that don’t. The use of CEP will expand further as the pace of business accelerates, more data becomes available in real time, and business people demand better situation awareness.
Real-time information is digital data that is available to a business from many sources, both internal to the company and external, and is processed to become information. Data sources may include simple monitoring systems such as physical sensors, RFID tag readers, and the like, but more often are sources of higher-level data, including internal application systems, control systems, email and cellular communication systems, market data feeds, web-based news feeds, and social computing platforms such as Twitter and Facebook. Raw data from data feeds becomes intelligence after it has been filtered and abstracted through analytic processes that make it useful for making business decisions.
Real-time intelligence is a discipline that derives relevant information from data received and processed in real time. The resulting intelligence is time-sensitive: it is essential when decisions must be made, and corresponding actions taken, quickly enough to avoid undesirable outcomes (threats) or to take advantage of fleeting opportunities to achieve desirable outcomes.
The benefits of real-time intelligence are obvious (also see Use Intelligent Business Operations to Create Business Advantage). The value of information to improve a decision generally deteriorates over time. The half-life of information value may be measured in hours, minutes or milliseconds, depending on the business situation:
- Call Center Management: When your customer contact center experiences a spike in call volume, you may want to reassign customer service representatives to handle incoming calls, or temporarily switch to shorter call scripts, within a few minutes. You need to know when the number of incoming calls is increasing and the average wait times are growing longer. If you waited for an overnight report on the previous day’s calls to understand what happened, you would fail to achieve your service level targets and miss the opportunity to make more sales. In this example, the raw information is the stream of incoming phone calls, and the real-time intelligence is the key performance indicators (KPIs) on call volume and wait times.
- Dynamic Pricing: Your web site may need to compute a price dynamically, or generate a “best next action” cross-selling offer within a few seconds, so that a customer doesn’t lose attention and abandon their on-line shopping cart. The raw information in this case is a combination of customer history, their behavior on the web site within the past few seconds or minutes, and data on sales made to other people earlier in the day. The intelligence derived from the raw information is the suggested price or offer.
- Fraud Detection: Real-time fraud detection algorithms are applied to streams of electronic data that contain credit card transactions and bank account operations to evaluate the risk of losses in real-time before a fraudulent transaction completes. The systems compute the probability of fraud (intelligence) from raw information that includes the location of the person making the transaction, the recent history of other transactions made against this account, reference data for the account and known fraud patterns.
- Financial Trading: High-frequency trading systems in financial markets are even more sensitive to time. Some buy and sell decisions are made in less than a millisecond after receiving new data from an exchange. An opportunity will be gone when prices change or a competitor has grabbed the deal. A particular calculation can be worth thousands or millions of dollars if action is taken within 0.5 milliseconds, but be worthless one millisecond later. These systems are fully automated, because there is no time to involve a person in an individual trade.
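The call center example above can be sketched as a small sliding-window computation: each incoming call is an event, and the KPIs (call volume, average wait time) are maintained over a recent time window. This is an illustrative Python sketch only; the event shape (`CallEvent`) and the five-minute window are assumptions, not features of any particular product.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class CallEvent:
    timestamp: float      # seconds since some epoch (assumed event field)
    wait_seconds: float   # how long the caller waited before being answered

class CallCenterKpis:
    """Maintain call-volume and average-wait KPIs over a sliding time window."""

    def __init__(self, window_seconds: float = 300.0):
        self.window = window_seconds
        self.events: deque[CallEvent] = deque()

    def on_call(self, event: CallEvent) -> None:
        """Process one incoming call event as it arrives."""
        self.events.append(event)
        # Evict events older than the window so KPIs reflect only recent activity.
        cutoff = event.timestamp - self.window
        while self.events and self.events[0].timestamp < cutoff:
            self.events.popleft()

    @property
    def call_volume(self) -> int:
        return len(self.events)

    @property
    def average_wait(self) -> float:
        if not self.events:
            return 0.0
        return sum(e.wait_seconds for e in self.events) / len(self.events)
```

A monitoring dashboard would read `call_volume` and `average_wait` continuously, alerting when either crosses a service-level threshold.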
Every decision has a “right time.” Intelligence that arrives too soon may be wasted because the decision maker isn’t ready to absorb it, or it may cost too much to implement because systems that run quickly usually have higher IT costs. However, intelligence that is too late may result in losses to the business side of the company, or may cause the company to miss an opportunity to profit.
We are using the term “real time” loosely, to include “near-real-time” or “business real time.” The analytics are performed on fresh data — data about events that have occurred within the past few milliseconds, seconds or minutes. We use the arbitrary guideline that less than 15 minutes is recent enough to be considered “real time,” but there is no bright line between real time and recent history. Some observers use the term “right now time” to mean as fast as makes no difference to the decision being made.
Engineers use a more-demanding definition of real-time. “Hard” real time requires deterministic latency. This means that the process will always complete within a fixed time frame (unless the system is broken, of course). The maximum duration of the process can be determined in advance, and no set of coincidences will cause the process to complete after the predicted deadline. A process that involves a human can’t be hard real time because people’s actions aren’t entirely foreseeable. Even a fast computer system isn’t hard real time unless the application is implemented using a real-time operating system and special software designs, because things like Java garbage collection, network irregularities or internal scheduling anomalies can add small unpredictable delays.
Real-time intelligence is about fast analysis of current data, not fast analysis of old data. Sometimes people wrongly use the term “real-time intelligence” or “real-time analytics” to describe a query made to a data warehouse or data mart because the answer comes back within a few seconds. Fast query responses are a good thing, but they aren’t real-time intelligence if all of the data is more than 15 minutes old. Some of the data used for real-time intelligence is usually old, but at least some of the data must be fresh for the application to be called real-time intelligence.
Most systems that provide real-time intelligence execute some CEP. CEP is inherent in the process of turning raw event data into usable intelligence, although in many cases software architects, developers and users don’t realize that they are using CEP and that their output intelligence is based on complex events.
CEP is a type of computing in which sets of incoming (“base”) events, sometimes hundreds or thousands of events, are distilled into higher-level and more abstract (“complex”) events. Complex events are understandable by human managers and provide insight into what is happening at the business level. Each incoming event is a record of something that happened – a business transaction, a tweet, a sensor reading, an email, a sales report, or other happening. CEP is used to correlate data about events from one or more sources on the basis of having a common value in a key field, and then to find pattern matches and trends. One complex event may be the result of calculations performed on one, two or thousands of incoming base events. CEP can provide different views of incoming events tailored to the needs of different role-players within the company, such as sales, supply chain and financial managers.
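The two core steps described above — correlating base events on a common key field, then matching a pattern to emit a complex event — can be sketched in a few lines. The event shape (dicts with `type` and `order_id` fields) and the pattern itself (a shipment with no matching payment) are hypothetical illustrations, not part of any standard.

```python
from collections import defaultdict

def correlate(base_events, key_field="order_id"):
    """Group base events from one or more sources by a shared key field."""
    groups = defaultdict(list)
    for event in base_events:
        groups[event[key_field]].append(event)
    return groups

def detect_complex_events(groups):
    """Emit a higher-level (complex) event wherever a pattern matches
    within a correlated group. Illustrative pattern: an order that was
    shipped but never paid."""
    complex_events = []
    for key, events in groups.items():
        types = {e["type"] for e in events}
        if "shipped" in types and "paid" not in types:
            complex_events.append({"type": "unpaid_shipment", "order_id": key})
    return complex_events
```

Given a stream containing `ordered`, `paid` and `shipped` base events for many orders, the output is a much smaller set of `unpaid_shipment` complex events that a manager can act on directly.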
The type of CEP that is used depends on the situation. CEP is often used in “active,” event-driven monitoring and alerting systems that run continuously, processing event data as it arrives; detecting anomalies, threats and opportunities; triggering automated responses or sending alerts; and, in many cases, providing a dashboard that provides visibility into what is happening. Continuous intelligence systems vary tremendously:
- Basic monitoring and alerting systems are technically, although barely, using CEP. For example, an incoming simple base event from a sensor may indicate that a door has been unlocked. An alarm system compares the time of the unlocking to a predetermined time interval and then sounds a horn to report that an “after hours unlocking” complex event has occurred. Scenarios such as this use only a fraction of the principles and techniques that are used in more-demanding event processing systems.
- Moderately sophisticated monitoring and alerting systems are commonly used in supply chain management, fleet operations management, real-time cross selling, most fraud detection and other applications. These systems must implement some aspects of CEP to accomplish their mission, although the CEP is limited in scope. The systems deal with a few, specific kinds of events and are designed to compute only the KPIs and other metrics that are relevant to their respective usage scenarios (see Commercial Operational Intelligence Platforms Are Coming to Market).
- Highly demanding real-time intelligence systems often need additional CEP techniques beyond those used in moderately demanding systems. For example, these systems compute incrementally as the data arrives rather than storing the data in a database in memory or on disk before periodically re-calculating the queries. Commercial CEP platforms (sometimes called event stream processing systems) are designed from scratch to support high throughput (thousands or up to millions of events per second) and low latency (sub-millisecond response to new event data). Also see Use Complex-Event Processing to Keep Up With Real-time Big Data.
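The incremental style used by the most demanding systems can be illustrated with a minimal sketch: instead of storing every event and periodically re-running a query over the accumulated data, the system folds each arriving event into an aggregate at constant cost per event. The metric here (a running average) is a hypothetical example, not taken from any commercial platform.

```python
class IncrementalAverage:
    """Maintain a running average, updated as each event arrives.

    Contrast with the store-then-query approach: no event history is
    retained, and each update costs constant time and memory, which is
    what makes very high throughput and low latency achievable.
    """

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, value: float) -> float:
        """Fold one new event value into the aggregate and return the
        current average, available immediately to downstream consumers."""
        self.count += 1
        self.total += value
        return self.total / self.count
```

The same idea extends to counts, sums, rates and sliding-window variants; the essential design choice is that the answer is always up to date the instant an event arrives, rather than recomputed on a schedule.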
Most discussions of CEP have historically focused on the third kind of application – the high-volume/low-latency continuous intelligence systems – although such systems account for only a small share of the CEP that is actually executed in business. For the most part, this is appropriate, because low-end systems don’t need explicit CEP concepts and terminology to succeed. However, an increasing number of moderately demanding, real-time intelligence applications would benefit from a conscious use of more aspects of the CEP discipline (a related article on “Understanding CEP” is forthcoming; see also “about CEP”). By leveraging concepts such as event modeling, event hierarchies, in-memory computing and explicit support for time windows in their development methodologies and software tools, architects, developers and business analysts can design and build monitoring and alerting systems that have fewer defects, and are more efficient, extensible, flexible and reusable.