Stream of consciousness
Complex-event processing engines can make sense of data flowing in from many sources
- By Rutrell Yasin
- Jul 12, 2007
An emerging class of software that can monitor data bombarding an organization's information technology infrastructure from multiple sources is making inroads into the government sector.
Known as complex-event processing, or CEP, the software can detect patterns in intricate situations from multiple sources, giving analysts a deeper understanding of their business processes and events.
First used to analyze trading transactions on Wall Street, CEP engines are being applied to other areas, such as intelligence and surveillance, battlefield command and control, and network monitoring, industry experts say.
That's why In-Q-Tel, the CIA's venture-capital arm and technology incubator, is interested in the technology. The company made a strategic investment in StreamBase Systems, a developer of CEP technology, in February.
'We're seeing the emergence of devices that are flooding businesses with events,' said Troy Pearsall, executive vice president of technology transfer at In-Q-Tel.
'These devices range from your standard network monitoring devices to radio frequency identification tags to sensor networks that are really upping the ante in terms of the volume of data being flowed into organizations,' he said.
Traditional databases and their emphasis on historical data don't give business users and analysts the level of analysis they need to make decisions on the fly, Pearsall said. 'Users want to take action more quickly, and that leads to the need for complex event processing. You want to make decisions as events are flowing into your organizations, not after the fact.'
In-Q-Tel chose StreamBase, a four-year-old company, because of its strong development environment, he said. 'They have a real strong graphical and text-based development environment that allows mere mortals to stand up applications very quickly.'
StreamBase also can connect to many sources of information, and the company's CEP engine can integrate historical data and real-time data to answer questions, Pearsall said.
CEP is an emerging market, and, as in the business intelligence and standard database markets, there will be a lot of complementary technologies coming out in the next few years, he said.
'Certainly, we're always looking for opportunities,' Pearsall said about the possibility of investing in more CEP vendors.
'There are three attributes that really define complex-event processing: high data volumes, instant response and complex analytics,' said Bill Hobbib, vice president of marketing at StreamBase. 'If you have all three together, it's a complex-event processing problem.'
The StreamBase software platform handles high-volume data. The company is working with users in the areas of intelligence and surveillance, intrusion detection, network monitoring and battlefield command and control, he said.
StreamBase runs on Microsoft Windows, Sun Microsystems Solaris and Linux platforms. 'So, it runs fast on commodity platforms,' Hobbib said.
In June, the company released StreamBase 5.0, a new version of its software platform that speeds the development and deployment of real-time applications.
The new release offers built-in support for IBM's DB2 data server, WebSphere Front Office and xSeries hardware.
Other new features include out-of-the-box application development frameworks, end-to-end application development, expanded support for advanced data types, flexible pattern matching, enterprise security, and integration with a broad range of market data and messaging infrastructure systems.
Flexible pattern matching is significant, and StreamBase plans to expand more into this area, Hobbib said.
For example, with pattern matching, a network administrator might say, 'If a network event occurred, and its characteristics look like X, we're going to consider this person an intruder and kick him out of our network.' But with flexible pattern matching, network analysts can say, 'If A occurred followed by B and then C, this is a particular sequence of events or patterns, and this is really significant.' Or they can say, 'A occurred, and we think it is suspicious, but because it wasn't followed by B and C, then maybe it's not so suspicious that we need to give it the highest level of alert. It can be given a middle level,' Hobbib said.
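To make Hobbib's example concrete, here is a minimal Python sketch of that kind of sequence rule. It is not StreamBase code; the function name `classify_alerts`, the event labels and the 60-second window are assumptions chosen only for illustration, and it works on a finished list of events rather than a live stream.

```python
from typing import Iterable, List, Tuple

# An event is (timestamp in seconds, event type). Both fields are illustrative.
Event = Tuple[float, str]


def classify_alerts(events: Iterable[Event], window: float = 60.0) -> List[Tuple[float, str]]:
    """For each 'A' event, return (timestamp of A, alert level).

    'high'   : A was followed by B and then C within `window` seconds.
    'medium' : A occurred, but the B-then-C follow-up did not complete in time.
    """
    ordered = sorted(events)  # process events in time order
    alerts = []
    for i, (t_a, kind) in enumerate(ordered):
        if kind != "A":
            continue
        expected = ["B", "C"]  # what still has to happen, in order
        for t, k in ordered[i + 1:]:
            if t - t_a > window:
                break  # too late to count toward this A
            if expected and k == expected[0]:
                expected.pop(0)
        alerts.append((t_a, "high" if not expected else "medium"))
    return alerts


sample = [(0, "A"), (5, "B"), (9, "C"), (100, "A"), (110, "B")]
print(classify_alerts(sample))  # [(0, 'high'), (100, 'medium')]
```

The first A is followed by B and C and gets the highest alert; the second A is followed only by B, so it is given the middle level. A production engine would evaluate such rules incrementally as events arrive.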
This concept of looking for patterns in real-time events is applicable to the battlefield, intelligence and network monitoring, he said.
Another direction for the company is support for binary large objects, or BLOBs, which would aid battlefield command, control, intelligence and surveillance. This involves working with partners to extract features from video images and audio files, Hobbib said.
Other areas of expansion include the ability to process advanced data types and enterprise security tightly integrated with Lightweight Directory Access Protocol and authentication systems.
Teaming for CEP
Two other companies, ANTs Software and Coral8, have teamed on a next-generation CEP environment. ANTs' high-performance SQL database management system will work in conjunction with Coral8's CEP engine, company officials said. The combination of ANTs Data Server and the Coral8 Engine will boost application performance in the CEP arena, said John Morell, director of product marketing at Coral8.
'It's important to recognize that you just don't process and generate events and throw them out into the ether,' he said. 'When these high-level events happen, you need to be able to store them somewhere, and sometimes in multiple places.' Users need to be able to access data to get at all the information relevant to an event.
'No matter which way you store it, you need a high-speed storage system like the ANTs database that can keep up with all of the events that are flagged through the system,' Morell said. 'You're talking hundreds of thousands, sometimes half a million, events per second.'
'That's what we are doing for Coral8: the ability to do a high number of inserts per second into our database,' said Cesar Rojas, ANTs' director of marketing.
The Coral8 application rapidly presents data, but the ANTs database can easily keep up because of its ability to do as many as one million inserts per second, he said. Also, ANTs lets users quickly process data, he added.
ANTs is no stranger to the defense market. The high-performance database has been used in conjunction with the Navy's DDG 1000 Zumwalt-class destroyer project, said Patrick Moore, an ANTs vice president.
In that capacity, the product has integrated with real-time extensions to Red Hat Linux developed by IBM and IBM's service-oriented product, Real Time WebSphere, to support mission-critical ship operations, he said.
Old school, new school
Another CEP engine used in government is Progress Software's Apama. One could call Apama the godfather of complex-event processing.
The Apama algorithmic engine was introduced in 1999 by two researchers from Cambridge University with some assistance from developers at Stanford University and the California Institute of Technology, said Mark Palmer, vice president and general manager at Progress' Apama division.
Apama was one of three vendors that sold complex-event processing engines at the time, he said. The company and product were acquired by Progress Software in April 2005.
Apama's origins are in the capital trading market, but government is a growing sector.
'We've worked in government applications with radar feeds of intelligence information that come from surveillance planes,' he said. That information is transmitted back to a central, event-driven architecture environment that analyzes it and distributes it to appropriate analysts for defense purposes.
The Progress Apama Event Processing Platform is a complete environment for creating applications that monitor event streams, detect event patterns and take action. The platform includes Business Activity Monitoring capabilities and a sophisticated CEP language, company officials said.
Truviso, formerly Amalgamated Insight, a new player in the CEP arena, uses standard SQL to query multiple heterogeneous data streams, said Mike Trigg, co-founder and executive vice president of marketing and business development at Truviso.
Building on experience
'Compared to some of the earlier products in this space, we think taking a standard approach using SQL, a language with 30-plus years of proven work behind it in the relational database world, is an important part of what we are doing differently,' he said.
Three main things differentiate the company's CEP product from others, he said. The core engine has an adaptive query processor, which lets the product support a wide set of SQL queries, such as user-defined functions, user-defined aggregates and subqueries. Some vendors that have taken an SQL approach have not included that capability, Trigg said.
Adaptive query processing also allows the product to run thousands of concurrent queries against incoming streams of data. This lets Truviso perform sophisticated analytics on the fly, he said.
Another critical feature is visualization. The product has a full Web-based user dashboard, which helps users understand what's going on with the data they are analyzing, Trigg said.
The third piece is a full database embedded within the product, not just hooked on the side. The company took the PostgreSQL open-source database and added Truviso's stream-processing capabilities to the Postgres engine to give users streaming and more traditional relational database capabilities. In the same engine, they can have queries run over streams and tables, and they can do caching and archiving, Trigg said.
'In the real-world use cases that we see, you're not just analyzing incoming streams, but you want to compare those streams to aggregates, historical information, trends and averages,' he said.
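The idea of checking live events against history can be illustrated with a short Python sketch. It is not Truviso's engine or SQL dialect; the class name `BaselineComparator`, the threshold factor and the sample numbers are all assumptions made for the example.

```python
from typing import Iterable


class BaselineComparator:
    """Compare each incoming value with the average of everything seen before it."""

    def __init__(self, historical_values: Iterable[float], threshold: float = 2.0):
        values = list(historical_values)
        self.count = len(values)          # number of values in the history
        self.total = float(sum(values))   # running sum, so the average is cheap to update
        self.threshold = threshold        # flag values above threshold x average

    def observe(self, value: float) -> bool:
        """Return True if `value` exceeds threshold x the historical average,
        then fold the value into the history."""
        unusual = self.count > 0 and value > self.threshold * (self.total / self.count)
        self.total += value
        self.count += 1
        return unusual


monitor = BaselineComparator([10, 10, 10], threshold=2.0)
print(monitor.observe(15))  # False: 15 is not more than twice the average of 10
print(monitor.observe(30))  # True: 30 is more than twice the updated average of 11.25
```

Keeping a running sum and count, rather than rescanning stored rows for every event, is one simple way to combine archived data with a live stream.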