
Event Series Intelligence: Esper & NEsper

Where Complex Event Processing meets Open Source: Esper & NEsper
High-availability and Enterprise Readiness: EsperHA, Enterprise Edition and support services. Embed mainstream CEP in your products and deployments: Learn about our licensing options.

Event Processing with Esper and NEsper

Esper is a component for complex event processing (CEP) and event series analysis, available for Java as Esper, and for .NET as NEsper.

Esper and NEsper enable rapid development of applications that process large volumes of incoming messages or events, regardless of whether incoming messages are historical or real-time in nature. Esper and NEsper filter and analyze events in various ways, and respond to conditions of interest.

Esper and Event Processing Language (EPL) provide a highly scalable, memory-efficient, in-memory computing, SQL-standard, minimal latency, real-time streaming-capable Big Data processing engine for historical data, or medium to high-velocity data and high-variety data.

Esper Provides An Online Application for EPL Learning: Esper EPL Online

Technology Introduction

Complex event processing (CEP) delivers high-speed processing of many events across all the layers of an organization, identifying the most meaningful events within the event cloud, analyzing their impact, and taking subsequent action in real time (source: Wikipedia).

Esper offers a Domain Specific Language (DSL) for processing events. The Event Processing Language (EPL) is a declarative language for dealing with high frequency time-based event data. SQL streaming analytics is another commonly used term for this technology.

Some typical examples of applications are:

  • Business process management and automation (process monitoring, BAM, reporting exceptions, operational intelligence)
  • Finance (algorithmic trading, fraud detection, risk management)
  • Network and application monitoring (intrusion detection, SLA monitoring)
  • Sensor network applications (RFID reading, scheduling and control of fabrication lines, air traffic)

Esper Enterprise Edition Feature Summary

  • GUI for design and management of EPL statements and CEP engine in general (JavaScript and HTML 5)
  • EPL editor and debugger; Detailed breakdown of memory use and metrics for all EPL statement state (data windows, indexes, aggregations etc. memory use)
  • Real-time continuously-updating displays; jQuery plug-in, dashboard builder, JavaScript API
  • REST Web Services for CEP engine and push management
  • Full support for applications that embed Esper (does not require Enterprise Edition server)
  • Hot deployment of EPL modules and event-driven applications
  • Highly scalable, elastic, distributable and fault tolerant event processing
  • Integration with common distributed caches

EsperHA Feature Summary

  • Resiliency of CEP engine state and EPL statement state, as needed; High write performance and fast recovery

Esper Feature Summary

Data windows are for managing fine-grained event expiry. They instruct the engine how long to retain relevant events or under what conditions events can be discarded. Data windows operate on the level of individual queries, streams and subqueries.

Sliding windows: time, length, sorted, ranked, accumulating, time-ordering, externally-timed (value-based windowing), expiry-expression-based with aggregations

Tumbling windows: time, length and multi-policy; first-event; expiry-expression-based with aggregations

Combine windows with intersection and union semantics.

Partitioned windows. Dynamically shrinking or expanding windows.

For example, use a time window to keep arriving events for N seconds. The engine expires (lets go of) events that are older than N seconds.
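The time-window behavior described above can be illustrated in plain Java. This is a minimal sketch and not Esper's implementation: the class name SlidingTimeWindow and its methods are hypothetical, and the current time is passed in by the caller so the example does not depend on the system clock.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sliding time window; illustrates the concept only, not Esper's code. */
class SlidingTimeWindow<E> {
    /** An event together with its arrival time. */
    private static final class Entry<T> {
        final long timeMillis;
        final T event;

        Entry(long timeMillis, T event) {
            this.timeMillis = timeMillis;
            this.event = event;
        }
    }

    private final long retentionMillis;
    private final Deque<Entry<E>> entries = new ArrayDeque<>();

    SlidingTimeWindow(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    /** Adds an event that arrived at nowMillis, then expires events that are too old. */
    void add(long nowMillis, E event) {
        entries.addLast(new Entry<>(nowMillis, event));
        expire(nowMillis);
    }

    /** Removes every event older than the retention period, relative to nowMillis. */
    void expire(long nowMillis) {
        while (!entries.isEmpty()
                && nowMillis - entries.peekFirst().timeMillis > retentionMillis) {
            entries.removeFirst();
        }
    }

    /** Returns the retained events, oldest first. */
    List<E> contents() {
        List<E> result = new ArrayList<>();
        for (Entry<E> entry : entries) {
            result.add(entry.event);
        }
        return result;
    }
}
```

With a retention of 10 seconds, events added at 0, 4 and 12 seconds leave only the last two in the window once the third arrives. Because events arrive in time order, expiry only ever needs to inspect the oldest entry.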

Having a good variety of configurable and combinable data windows available allows you to address more analysis requirements and address common requirements concisely.

Named windows are globally visible data windows that allow sharing sets of events between queries efficiently, removing the need to keep the same events in multiple places.

Define custom criteria for entering events and for expiring events.

Esper supports fire-and-forget (on-demand) queries against named windows including joins.

Esper supports explicit indexes (hash and btree).

Esper supports update-insert-delete (a.k.a. merge or upsert) and select-and-delete in a single atomic operation.

Named windows allow defining event expiry once and applying it across multiple queries.

On-demand (fire-and-forget, execute-once, non-continuous) queries are useful for getting current state once and upon request.

Explicit indexes help in reusing indexes between queries, in performance and in query planning (we use the terms query and statement interchangeably).

Atomic operations allow more concise EPL and can help performance.

This category is event series analysis - analyzing a series, stream or historical events.

Match-recognize is a query model for pattern matching based on regular expressions.

Some people find regular expressions easy to understand. Many patterns can be expressed concisely with match-recognize.
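The idea of matching a regular expression against a series of events can be sketched in plain Java. This is an illustration under stated assumptions, not the match-recognize syntax or Esper's implementation: the class RowPatternSketch, its methods and the U/D/S encoding are hypothetical. Each price move is encoded as one symbol, and java.util.regex then finds the pattern in the symbol string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: regular-expression matching over a series of price events. */
class RowPatternSketch {
    /** Encodes each move between consecutive prices as U (up), D (down) or S (same). */
    static String encodeMoves(double[] prices) {
        StringBuilder moves = new StringBuilder();
        for (int i = 1; i < prices.length; i++) {
            if (prices[i] > prices[i - 1]) {
                moves.append('U');
            } else if (prices[i] < prices[i - 1]) {
                moves.append('D');
            } else {
                moves.append('S');
            }
        }
        return moves.toString();
    }

    /**
     * Returns one {first, last} pair per match, both inclusive indexes into prices.
     * Move i describes the step from prices[i] to prices[i + 1], so a match over
     * moves [start, end) covers prices[start] through prices[end].
     */
    static List<int[]> findMatches(double[] prices, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(encodeMoves(prices));
        List<int[]> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(new int[] {matcher.start(), matcher.end()});
        }
        return matches;
    }
}
```

For example, the expression U{2,}D describes "at least two consecutive rises followed by a drop", which would take noticeably more code to express with hand-written loops.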

Patterns are a pattern language that provides logical and temporal event correlation.

Timer-control is part of patterns and includes a crontab-like 'at' operator.

The lifecycle of patterns can be controlled by timers and via operators such as repeat-number, repeat-until, every-distinct and while.

Patterns offer an expressive way to specify more complex time and/or correlation relationships.

Patterns, for example time-repeating patterns that trigger based on time passing, are often combined in the from-clause with other streams or used as triggers.

Grouping, aggregation, rollup, cube, sorting, filtering, transforming, merging, splitting or duplicating of event series or streams.

These typical operations on a series of events build the foundation of many analysis solutions.

A stream by itself has near-zero cost in terms of memory or CPU use.

Context declarations allow providing the context information of your situation detection. Contexts can control detection lifetime and concurrency aspects.

Context dimensions can, for example, be based on consistent-hashes, keys, categories, or overlapping and non-overlapping.

Context partitions can be initiated by event arrival and patterns and can be terminated based on a correlation, for example.

Context declarations can be nested to provide finer-grained control.

This allows framing the situation to be detected.

The engine processes context partitions concurrently allowing effective use of multiple threads and fine-grained locking.

Output rate limiting and stabilizing, snapshot output

This provides fine-grained control over output frequency and content.

Event consumption controls whether an event remains available for further matching (non-consuming) or is no longer available for further matching (consuming).

Some use cases require consumption and others don’t.

Enumeration methods execute lambda-expressions.

They are useful for analyzing a collection of values or events. They are statelessly executed. The analysis function is passed as a parameter into the enumeration method.
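A rough Java analogy for passing the analysis function as a parameter is shown below. It is a sketch only: the class EnumerationSketch and the helper names where and sumOf are hypothetical and chosen for illustration, and the code uses Java streams rather than EPL.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;
import java.util.stream.Collectors;

/** Hypothetical helpers that take the analysis function as a lambda parameter. */
class EnumerationSketch {
    /** Returns the values for which the condition holds, in their original order. */
    static <T> List<T> where(List<T> values, Predicate<? super T> condition) {
        return values.stream().filter(condition).collect(Collectors.toList());
    }

    /** Sums the result of applying the projection to each value. */
    static <T> double sumOf(List<T> values, ToDoubleFunction<? super T> projection) {
        return values.stream().mapToDouble(projection).sum();
    }
}
```

Both helpers are stateless: they keep nothing between calls and the result depends only on the collection and the lambda passed in.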

Date-time methods provide common date-time operations.

This helps when performing date-time arithmetic, for example.

Allen's interval algebra with support for point-in-time events and events with duration.

This helps when you want to compare events in terms of their interval and time relationships.
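The thirteen relations of the interval algebra can be sketched as a single classification function. This is an illustration in plain Java, not Esper's date-time method API: the class IntervalSketch, the enum constants and the relate method are hypothetical names, and intervals are assumed to be given as start and end timestamps with start not after end.

```java
/** Hypothetical sketch of classifying two intervals by their time relationship. */
class IntervalSketch {
    enum Relation {
        BEFORE, AFTER, MEETS, MET_BY, OVERLAPS, OVERLAPPED_BY,
        STARTS, STARTED_BY, DURING, CONTAINS, FINISHES, FINISHED_BY, EQUALS
    }

    /**
     * Classifies interval A relative to interval B.
     * For zero-duration (point-in-time) events that share an endpoint with the
     * other interval, the result is a convention of this sketch.
     */
    static Relation relate(long aStart, long aEnd, long bStart, long bEnd) {
        if (aEnd < bStart) return Relation.BEFORE;
        if (bEnd < aStart) return Relation.AFTER;
        if (aStart == bStart && aEnd == bEnd) return Relation.EQUALS;
        if (aEnd == bStart) return Relation.MEETS;
        if (bEnd == aStart) return Relation.MET_BY;
        if (aStart == bStart) return aEnd < bEnd ? Relation.STARTS : Relation.STARTED_BY;
        if (aEnd == bEnd) return aStart > bStart ? Relation.FINISHES : Relation.FINISHED_BY;
        if (aStart > bStart && aEnd < bEnd) return Relation.DURING;
        if (aStart < bStart && aEnd > bEnd) return Relation.CONTAINS;
        // Remaining cases: the intervals partially overlap.
        return aStart < bStart ? Relation.OVERLAPS : Relation.OVERLAPPED_BY;
    }
}
```

For instance, an interval from 1 to 6 compared with an interval from 5 to 8 is classified as OVERLAPS, while an interval from 6 to 7 compared with the same interval is classified as DURING.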

Declared expressions allow reusing common expressions within and across queries.

So you don't need to duplicate common expressions.

Script integration for calling external scripting language scripts right within EPL, such as JavaScript, MVEL or other JSR 223 scripts.

This allows specifying code as part of the EPL query.

Joining external data for easy integration with external data sources such as web services, for example.

Relational database access via SQL-query joins with event streams: LRU (least-recently used) and expiry-time query result caches; Keyed cache entries for fast cache lookup; Engine indexes cached rows for fast filtering within a large number of SQL-query result rows; Multiple SQL-queries in one query transparently integrates multiple autonomous database systems.

Method invocation joins.

This provides one common means for integrating relational and non-relational external data.

Variables and constants with guarantees of consistency and atomicity of variable updates within and across queries.

These can occur in any expression and can make EPL dynamically controllable and easier to maintain.

Constants ensure query optimizations for constant values are possible for variables that never change.

Event representation discusses event typing, event objects and event type relationships.

Events can be Java objects or Map interface implementations or Object-array (Object[]) or XML documents.

Freedom of choosing the best object type(s) for your use case considering trade-offs between types, and without requiring transformation.

Freedom to use dynamic types that are not predefined classes.

Power to use existing objects when they are already available.

Esper supports event-type inheritance and polymorphism for all event types including for Map and object-array representations.

Allows modeling event type hierarchy, extension and event behavior.

Event properties can be simple, indexed, mapped or nested.

By supporting nesting as well as key-value properties and multi-value properties the event information model can be rich and more useful. Esper allows querying of deep event object graphs and XML structures, for example, including for Map and object-array event representations.

Relationship between events and related data structures such as reference data can be naturally modeled.

Esper supports dynamic typing of properties, further supported by cast, instanceof and exists functions.

Useful when, at the time of continuous query creation, it is not known whether properties will be present and what the property type may be for any given event that arrives.

Esper supports a create-schema syntax to declare event types from a column-and-type list, from existing classes or from other types, by means of templating for example, with declarative inheritance.

Types can be defined explicitly or implicitly (insert-into).

When un-deploying a module of EPL queries the engine can drop associated types that are no longer used anywhere.

Variant event-typed streams allow treating disparate types of events as the same type, such as when the event type can only be known at runtime, when the event type is expected to vary, or when optional properties are desired.

For use cases that have multiple un-reconciled event types.

Versioned events that update, provide a new version or that revise an existing event.

Allows expressing more concisely when events are actually newer versions of previous events.

EPL syntax for contained events.

Contained-event select syntax for easy handling of coarse-grained, business-level events which themselves contain events or that need to be broken down into rows.

Allows writing EPL directly against “unpacked” data.

SQL standard based

Familiar SQL-standard-based continuous query language using insert into, select, from, where, group-by, having, order-by, limit and distinct clauses.

Inner-joins and outer joins (left, right, full) of an unlimited number of streams or windows.

Sub-queries including “exists” and “in”.

Rollup and Cube with grouping set definition.

SQL provides well-defined semantics, is standardized and can help flatten the learning curve.

The design of EPL is as close as feasible to SQL and extends SQL.

Execution characteristics

Scalability in the face of large numbers of continuous queries.

Let’s say you have 10,000 queries that all read from the same input stream and that either check whether a specific attribute of an event (namely, price) is inside a given random interval or use equals on some event attributes.

Esper detects that many queries have a condition on the same variable(s) and creates a decision tree, so the cost of evaluating an event is typically O(log N) and O(N) only in the worst case.
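The benefit of sharing one index across many queries can be sketched in plain Java. This is a simplified illustration and not Esper's decision tree: the class RangeFilterIndex and its methods are hypothetical. All interval conditions on price are kept in one sorted map keyed by lower bound, so an arriving event locates its candidates with one O(log N) lookup and never visits queries whose lower bound is above the price. The candidate scan itself can still reach O(N) when most intervals start below the price.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical shared index over "low <= price <= high" conditions of many queries. */
class RangeFilterIndex {
    /** One query's interval condition. */
    private static final class Range {
        final double high;
        final String queryId;

        Range(double high, String queryId) {
            this.high = high;
            this.queryId = queryId;
        }
    }

    // Conditions grouped by lower bound and kept in sorted order.
    private final TreeMap<Double, List<Range>> byLow = new TreeMap<>();

    /** Registers a query that matches when low <= price <= high. */
    void register(String queryId, double low, double high) {
        byLow.computeIfAbsent(low, key -> new ArrayList<>()).add(new Range(high, queryId));
    }

    /** Returns the ids of all queries whose interval contains the price. */
    List<String> match(double price) {
        List<String> matched = new ArrayList<>();
        // headMap skips every condition whose lower bound exceeds the price.
        for (List<Range> bucket : byLow.headMap(price, true).values()) {
            for (Range range : bucket) {
                if (price <= range.high) {
                    matched.add(range.queryId);
                }
            }
        }
        return matched;
    }
}
```

The point of the sketch is that the work per event depends on the index structure rather than on evaluating every registered query one by one.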

Allows a high degree of parallelization, both when processing the same query and when processing multiple queries. Stateless queries process lock-free.

The Esper design can help maximize throughput under threading but protect state from concurrent modification. Esper can execute stateless and lock-free where possible.

Thereby Esper can achieve data parallelism and component parallelism.

Multithread-safe.

Create, start and stop queries during operation.

Applications can retain full control over threading; Inbound, outbound and execution threading configurable and none provided by default.

Since Esper doesn’t have a strong opinion on what threads exist and since Esper doesn’t have to queue events, it is suitable to run in any container or process.

Esper can achieve optimal performance by not requiring thread handoffs, context switches or queue synchronization.

API

Full control over the concept of time.

Supports externally-provided time as well as current system time, allowing applications full control over the concept of time within an engine and full control over which thread(s) evaluate timer schedule for queries.

This can be useful for replaying historical data.

Allows using a more precise or accurate time than the JVM might provide.

Allows control over time passing.

Multiple independent engines per process

This is useful when you want separation but want to operate in the same JVM instance.

Add and remove queries at runtime.

New queries can be created at runtime without stopping processing.

Enable/disable continuous queries and/or partitions without losing state.

Control event visibility and the concept of time on a query level.

Esper can disable/enable any query and context partition without the need of removing and adding it again, also during runtime.

This allows loading queries from historical data and merging continuous queries to receive online event streams after the historical load has completed.

Push and pull

Support for both push- or subscription-based delivery to listeners/subscribers/observers and pull- or receive-based access for querying current results. Concurrency-safe and read-write locked for multiple readers.

Sometimes it is more convenient for an application to ask for current results than to constantly receive data.

Increases performance since the engine can skip pushing output.
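The difference between push and pull delivery can be sketched with a small aggregate in plain Java. This is an illustration only and does not use the Esper API: the class RunningTotal and its methods are hypothetical. Registered listeners receive every update (push), while current() lets a caller read the latest result on request (pull) under a read lock shared by multiple readers.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongConsumer;

/** Hypothetical running total that supports both push and pull access to its result. */
class RunningTotal {
    private final List<LongConsumer> listeners = new CopyOnWriteArrayList<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long total;

    /** Push: the listener is called with the new total after every event. */
    void addListener(LongConsumer listener) {
        listeners.add(listener);
    }

    /** Processes one event; with no listeners registered, nothing is pushed. */
    void onEvent(long amount) {
        long snapshot;
        lock.writeLock().lock();
        try {
            total += amount;
            snapshot = total;
        } finally {
            lock.writeLock().unlock();
        }
        for (LongConsumer listener : listeners) {
            listener.accept(snapshot);
        }
    }

    /** Pull: returns the current total on request; many readers may hold the read lock. */
    long current() {
        lock.readLock().lock();
        try {
            return total;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

When no listener is registered, onEvent only updates state, which mirrors the observation above that skipping output can reduce work.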

Mature API

Module parsing and deployment API.

Esper provides a small test framework for unit or regression testing EPL-based applications.

API maturity helps you: between releases you can expect compatibility and few or no code changes.

Esper offers an organization of EPL into modules for convenient deployment management.

Statement (Query) Object Model

A set of classes providing an object-oriented representation of an EPL query.

Full and complete specification of a query via object model.

Round-trip from object model to query text and back to object model.

Build, change or interrogate EPL queries beyond the textual representation.

This feature can make tool development easier. It also makes otherwise opaque EPL strings useful.

Prepared queries and substitution parameters.

Precompile a query with substitution parameters and efficiently execute or start the parameterized queries multiple times, similar to JDBC prepared queries.

Reduces execution time for fire-and-forget queries and creation time for continuous queries.

JSON and XML output event rendering

Easy output formatting for common formats without custom code.

Data Flow-Type Invocation of EPL operators

For highest-performance use cases, custom flows or IO, the dataflow declaration offers lower-level access to and control over EPL select and event bus operations.

Extensibility

Pluggable architecture for event pattern and event stream analysis via user-defined functions, plug-in views, plug-in aggregation functions, plug-in pattern guards, plug-in pattern event observers and event instance methods. Virtual data windows for transparently backing named windows with an external store. Applications can plug in their own event representation and dynamic type resolution.

Extending the EPL grammar allows for application-provided features to seamlessly integrate.

Input and Output Adapters

CSV input adapter reads comma-separated value formats; simulate multiple event streams with timed, coordinated playback via timestamp column; load generation; preloading of reference data

JMS input and output adapter based on Spring JMS templates

DB output adapter for running DML and for keyed update-insert (a.k.a. upsert)

HTTP input+output adapter

Socket input adapter

See Apache Camel and other ESBs for more adapters.

JMX metrics exposure.

Most applications don’t need much input or output adapter code, as the API makes feeding events and receiving events easy.

Examples

Numerous examples, online solution patterns page

To get started and for self-help.

Benchmark kit

The benchmark kit is a possible foundation for performing your own measurements. Note: the documentation includes a chapter on additional performance tips that the benchmark does not necessarily implement.