3.0.0
Copyright © 2009 EsperTech Inc.
Table of Contents
Analyzing and reacting to information in real-time oftentimes requires the development of custom applications. Typically these applications must obtain the data to analyze, filter data, derive information and then indicate this information through some form of presentation or communication. Data may arrive with high frequency requiring high throughput processing. And applications may need to be flexible and react to changes in requirements while the data is processed. Esper is an event stream processor that aims to enable a short development cycle from inception to production for these types of applications.
This document is a resource for software developers who develop event driven applications. It also contains information that is useful for business analysts and system architects who are evaluating Esper.
It is assumed that the reader is familiar with the Java programming language.
This document is relevant in all phases of your software development project: from design to deployment and support.
If you are new to Esper, please follow these steps:
Read the tutorials, case studies and solution patterns available on the Esper public web site at http://esper.codehaus.org
Read Section 1.1, “Introduction to CEP and event stream analysis” if you are new to CEP and ESP (complex event processing, event stream processing)
Read Chapter 2, Event Representations that explains the different ways of representing events to Esper
Read Chapter 3, Processing Model to gain insight into EPL continuous query results
Read Section 4.1, “EPL Introduction” for an introduction to event stream processing via EPL
Read Section 5.1, “Event Pattern Overview” for an overview over event patterns
Then glance over the examples Section 12.1, “Examples Overview”
Finally to test drive Esper performance, read Chapter 13, Performance
The Esper engine has been developed to address the requirements of applications that analyze and react to events. Some typical examples of applications are:
Business process management and automation (process monitoring, BAM, reporting exceptions)
Finance (algorithmic trading, fraud detection, risk management)
Network and application monitoring (intrusion detection, SLA monitoring)
Sensor network applications (RFID reading, scheduling and control of fabrication lines, air traffic)
What these applications have in common is the requirement to process events (or messages) in real-time or near real-time. This is sometimes referred to as complex event processing (CEP) and event stream analysis. Key considerations for these types of applications are throughput, latency and the complexity of the logic required.
High throughput - applications that process large volumes of messages (between 1,000 to 100k messages per second)
Low latency - applications that react in real-time to conditions that occur (from a few milliseconds to a few seconds)
Complex computations - applications that detect patterns among events (event correlation), filter events, aggregate time or length windows of events, join event streams, trigger based on absence of events etc.
The Esper engine was designed to make it easier to build and extend CEP applications.
Relational databases and the standard query language (SQL) are designed for applications in which most data is fairly static and complex queries are less frequent. Also, most databases store all data on disks (except for in-memory databases) and are therefore optimized for disk access.
To retrieve data from a database an application must issue a query. If an application need the data 10 times per second it must fire the query 10 times per second. This does not scale well to hundreds or thousands of queries per second.
Database triggers can be used to fire in response to database update events. However database triggers tend to be slow and often cannot easily perform complex condition checking and implement logic to react.
In-memory databases may be better suited to CEP applications then traditional relational database as they generally have good query performance. Yet they are not optimized to provide immediate, real-time query results required for CEP and event stream analysis.
The Esper engine works a bit like a database turned upside-down. Instead of storing the data and running queries against stored data, the Esper engine allows applications to store queries and run the data through. Response from the Esper engine is real-time when conditions occur that match queries. The execution model is thus continuous rather then only when a query is submitted.
Esper provides two principal methods or mechanisms to process events: event patterns and event stream queries.
Esper offers an event pattern language to specify expression-based event pattern matching. Underlying the pattern matching engine is a state machine implementation. This method of event processing matches expected sequences of presence or absence of events or combinations of events. It includes time-based correlation of events.
Esper also offers event stream queries that address the event stream analysis requirements of CEP applications. Event stream queries provide the windows, aggregation, joining and analysis functions for use with streams of events. These queries are following the EPL syntax. EPL has been designed for similarity with the SQL query language but differs from SQL in its use of views rather then tables. Views represent the different operations needed to structure data in an event stream and to derive data from an event stream.
Esper provides these two methods as alternatives through the same API.
Esper requires the following 3rd-party libraries at runtime:
ANTLR is the parser generator used for parsing and parse tree walking of the pattern and EPL syntax. Credit goes to Terence Parr at http://www.antlr.org. The ANTLR license is in the lib directory. The library is required for compile-time only.
CGLIB is the code generation library for fast method calls. This open source software is under the Apache license. The Apache 2.0 license is in the lib directory.
LOG4J and Apache commons logging are logging components. This open source software is under the Apache license. The Apache 2.0 license is in the lib directory.
Esper requires the following 3rd-party libraries at compile-time and for running the test suite:
JUnit is a great unit testing framework. Its license has also been placed in the lib directory. The library is required for build-time only.
MySQL connector library is used for testing SQL integration and is required for running the automated test suite.
This section outlines the different means to model and represent events.
Please see the Section 9.5, “Event and Event Type” section for APIs.
An event is an immutable record of a past occurrence of an action or state change. Event properties capture the state information for an event.
In Esper, an event can be represented by any of the following underlying Java objects:
Table 2.1. Event Underlying Java Objects
| Java Class | Description |
|---|---|
| java.lang.Object | Any Java POJO (plain-old java object) with getter methods following JavaBean conventions; Legacy Java classes not following JavaBean conventions can also serve as events . |
| java.util.Map | Map events are key-values pairs and can also contain objects, further Map, and arrays thereof. |
| org.w3c.dom.Node | XML document object model (DOM). |
| org.apache.axiom.om.OMDocument or OMElement | XML - Streaming API for XML (StAX) - Apache Axiom (provided by EsperIO package). |
| Application classes | Plug-in event representation via the extension API. |
Esper provides multiple choices for representing an event. There is no absolute need for you to create new Java classes to represent an event.
Event representations have the following in common:
All event representations support nested, indexed and mapped properties (aka. property expression), as explained in more detail below. There is no limitation to the nesting level.
All event representations provide event type metadata. This includes type metadata for nested properties.
All event representations allow transposing the event itself and parts of all of its property graph into new events. The term transposing refers to selecting the event itself or event properties that are themselves nestable property graphs, and then querying the event's properties or nested property graphs in further statements. The Apache Axiom event representation is an exception and does not currently allow transposing event properties but does allow transposing the event itself.
The Java object and Map representations allow supertypes.
The API behavior for all event representations is the same, with minor exceptions noted in this chapter.
The benefits of multiple event representations are:
For applications that already have events in one of the supported representations, there is no need to transform events into a Java object before processing.
Event representations are exchangeable, reducing or eliminating the need to change statements when the event representation changes.
Event representations are interoperable, allowing all event representations to interoperate in same or different statements.
The choice makes its possible to consciously trade-off performance, ease-of-use, the ability to evolve and effort needed to import or externalize events and use existing event type metadata.
Event properties capture the state information for an event. Event properties be simple as well as indexed, mapped and nested event properties. The table below outlines the different types of properties and their syntax in an event expression. This syntax allows statements to query deep JavaBean objects graphs, XML structures and Map events.
Table 2.2. Types of Event Properties
| Type | Description | Syntax | Example |
|---|---|---|---|
| Simple | A property that has a single value that may be retrieved. | name | sensorId |
| Indexed | An indexed property stores an ordered collection of objects (all of the same type) that can be individually accessed by an integer-valued, non-negative index (or subscript). | name[index] | sensor[0] |
| Mapped | A mapped property stores a keyed collection of objects (all of the same type). | name('key') | sensor('light') |
| Nested | A nested property is a property that lives within another property of an event. | name.nestedname | sensor.value |
Combinations are also possible. For example, a valid combination could be person.address('home').street[0].
If your application uses java.util.Map or XML to represent events, then event property names may themselves contain the dot ('.') character. The backslash ('\') character can be used to escape dot characters in property names, allowing a property name to contain dot characters.
For example, the EPL as shown below expects a property by name part1.part2 to exist on event type MyEvent:
select part1\.part2 from MyEvent
Sometimes your event properties may overlap with EPL language keywords. In this case you may use the backwards apostrophe ` character to escape the property name.
The next example assumes a Quote event that has a property by name order, while order is also a reserved keyword:
select `order` from Quote
Dynamic (unchecked) properties are event properties that need not be known at statement compilation time. Such properties are resolved during runtime: they provide duck typing functionality.
The idea behind dynamic properties is that for a given underlying event representation we don't always know all properties in advance. An underlying event may have additional properties that are not known at statement compilation time, that we want to query on. The concept is especially useful for events that represent rich, object-oriented domain models.
The syntax of dynamic properties consists of the property name and a question mark. Indexed, mapped and nested properties can also be dynamic properties:
Table 2.3. Types of Event Properties
| Type | Syntax |
|---|---|
| Dynamic Simple | name? |
| Dynamic Indexed | name[index]? |
| Dynamic Mapped | name('key')? |
| Dynamic Nested | name?.nestedPropertyName |
Dynamic properties always return the java.lang.Object type. Also, dynamic properties return a null value if the dynamic property does not exist on events processed at runtime.
As an example, consider an OrderEvent event that provides an "item" property. The "item" property is of type Object and holds a reference to an instance of either a Service or Product.
Assume that both Service and Product classes provide a property named "price". Via a dynamic property we can specify a query that obtains the price property from either object (Service or Product):
select item.price? from OrderEvent
As a second example, assume that the Service class contains a "serviceName" property that the Product class does not possess. The following query returns the value of the "serviceName" property for Service objects. It returns a null-value for Product objects that do not have the "serviceName" property:
select item.serviceName? from OrderEvent
Consider the case where OrderEvent has multiple implementation classes, some of which have a "timestamp" property. The next query returns the timestamp property of those implementations of the OrderEvent interface that feature the property:
select timestamp? from OrderEvent
The query as above returns a single column named "timestamp?" of type Object.
When dynamic properties are nested, then all properties under the dynamic property are also considered dynamic properties. In the below example the query asks for the "direction" property of the object returned by the "detail" dynamic property:
select detail?.direction from OrderEvent // equivalent to select detail?.direction? from OrderEvent
The functions that are often useful in conjunction with dynamic properties are:
The cast function casts the value of a dynamic property (or the value of an expression) to a given type.
The exists function checks whether a dynamic property exists. It returns true if the event has a property of that name, or false if the property does not exist on that event.
The instanceof function checks whether the value of a dynamic property (or the value of an expression) is of any of the given types.
Dynamic event properties work with all event representations outlined next: Java objects, Map-based and XML DOM-based events.
Sometimes an event can have properties that are itself events. Esper uses the term fragment and fragment type for such event pieces. The best example is a pattern that matches two or more events and the output event contains the matching events as fragments. In other words, output events can be a composite event that consists of further events, the fragments.
Fragments have the same metadata available as their enclosing composite events. The metadata for enclosing composite events contains information about which properties are fragments, or have a property value that can be represented as a fragment and therefore as an event itself.
Fragments and type metadata can allow your application to navigate composite events without the need for using the Java reflection API and reducing the coupling to the underlying event representation. The API is further described in Section 9.5, “Event and Event Type”.
Plain-old Java object events are object instances that expose event properties through JavaBeans-style getter methods. Events classes or interfaces do not have to be fully compliant to the JavaBean specification; however for the Esper engine to obtain event properties, the required JavaBean getter methods must be present or an accessor-style and accessor-methods may be defined via configuration.
Esper supports JavaBeans-style event classes that extend a superclass or implement one or more interfaces. Also, Esper event pattern and EPL statements can refer to Java interface classes and abstract classes.
Classes that represent events should be made immutable. As events are recordings of a state change or action that occurred in the past, the relevant event properties should not be changeable. However this is not a hard requirement and the Esper engine accepts events that are mutable as well.
The hashCode and equals methods do not need to be implemented. The implementation of these methods by a Java event class does not affect the behavior of the engine in any way.
Please see Chapter 10, Configuration on options for naming event types represented by Java object event classes. Java classes that do not follow JavaBean conventions, such as legacy Java classes that expose public fields, or methods not following naming conventions, require additional configuration. Via configuration it is also possible to control case sensitivity in property name resolution. The relevant section in the chapter on configuration is Section 10.4.1.3, “Non-JavaBean and Legacy Java Event Classes”.
As outlined earlier, the different property types are supported by the standard JavaBeans specification, and some of which are uniquely supported by Esper:
Simple properties have a single value that may be retrieved. The underlying property type might be a Java language primitive (such as int, a simple object (such as a java.lang.String), or a more complex object whose class is defined either by the Java language, by the application, or by a class library included with the application.
Indexed - An indexed property stores an ordered collection of objects (all of the same type) that can be individually accessed by an integer-valued, non-negative index (or subscript).
Mapped - As an extension to standard JavaBeans APIs, Esper considers any property that accepts a String-valued key a mapped property.
Nested - A nested property is a property that lives within another Java object which itself is a property of an event.
Assume there is an NewEmployeeEvent event class as shown below. The mapped and indexed properties in this example return Java objects but could also return Java language primitive types (such as int or String). The Address object and Employee can themselves have properties that are nested within them, such as a street name in the Address object or a name of the employee in the Employee object.
public class NewEmployeeEvent {
public String getFirstName();
public Address getAddress(String type);
public Employee getSubordinate(int index);
public Employee[] getAllSubordinates();
}
Simple event properties require a getter-method that returns the property value. In this example, the getFirstName getter method returns the firstName event property of type String.
Indexed event properties require either one of the following getter-methods. A method that takes an integer-type key value and returns the property value, such as the getSubordinate method, or a method that returns an array-type, or a class that implements Iterable. An example is the getSubordinates getter method, which returns an array of Employee but could also return an Iterable. In an EPL or event pattern statement, indexed properties are accessed via the property[index] syntax.
Mapped event properties require a getter-method that takes a String-typed key value and returns the property value, such as the getAddress method. In an EPL or event pattern statement, mapped properties are accessed via the property('key') syntax.
Nested event properties require a getter-method that returns the nesting object. The getAddress and getSubordinate methods are mapped and indexed properties that return a nesting object. In an EPL or event pattern statement, nested properties are accessed via the property.nestedProperty syntax.
All event pattern and EPL statements allow the use of indexed, mapped and nested properties (or a combination of these) anywhere where one or more event property names are expected. The below example shows different combinations of indexed, mapped and nested properties in filters of event pattern expressions:
every NewEmployeeEvent(firstName='myName')
every NewEmployeeEvent(address('home').streetName='Park Avenue')
every NewEmployeeEvent(subordinate[0].name='anotherName')
every NewEmployeeEvent(allSubordinates[1].name='thatName')
every NewEmployeeEvent(subordinate[0].address('home').streetName='Water Street')Similarly, the syntax can be used in EPL statements in all places where an event property name is expected, such as in select lists, where-clauses or join criteria.
select firstName, address('work'), subordinate[0].name, subordinate[1].name
from NewEmployeeEvent
where address('work').streetName = 'Park Ave'Property names follows Java standards: the class java.beans.Introspector and method getBeanInfo returns the property names as derived from the name of getter methods. In addition, Esper configuration provides a flag to turn off case-sensitive property names. A sample list of getter methods and property names is:
Table 2.4. JavaBeans-style Getter Methods and Property Names
| Method | Property Name | Example |
|---|---|---|
| getPrice() | price | select price from MyEvent |
| getNAME() | NAME | select NAME from MyEvent |
| getItemDesc() | itemDesc | select itemDesc from MyEvent |
| getQ() | q | select q from MyEvent |
| getQN() | QN | select QN from MyEvent |
| getqn() | qn | select qn from MyEvent |
| gets() | s | select s from MyEvent |
Constants are public static final fields in Java that may also participate in expressions of all kinds, as this example shows:
select * from MyEvent where property=MyConstantClass.FIELD_VALUE
Event properties that are enumeration values can be compared by their enumeration value:
select * from MyEvent where enumProp=EnumClass.ENUM_VALUE_1
Alternatively, a static method may be employed on a class, such as the enumeration class 'EnumClass' as below:
select * from MyEvent where enumProp=EnumClass.valueOf('ENUM_VALUE_1')If your application does not import, through configuration, the package that contains the enumeration class, then it must also specify the package name of the class. Enumeration classes that are inner classes must be qualified with $ following Java conventions.
For example, the Color enumeration as an inner class to MyEvent in package org.myorg can be referenced as shown:
select * from MyEvent(enumProp=org.myorg.MyEvent$Color.GREEN).std:firstevent()
Instance methods may also be invoked on event instances by specifying a stream name, as shown below:
select myevent.computeSomething() as result from MyEvent as myevent
When your getter methods or accessor fields return a parameterized type, for example Iterable<MyEventData> for an indexed property or Map<String, MyEventData> for a mapped property, then property expressions may refer to the properties available through the class that is the type parameter.
An example event that has properties that are parameterized types is:
public class NewEmployeeEvent {
public String getName();
public Iterable<EducationHistory> getEducation();
public Map<String, Address> getAddresses();
}A sample of valid property expressions for this event is shown next:
select name, education, education[0].date, addresses('home').street
from NewEmployeeEventA given Map event type can have one or more supertypes that must also be Map event types. All properties available on any of the Map supertypes are available on the type itself. In addition, anywhere within EPL that an event type name of a Map supertype is used, any of its Map subtypes and their subtypes match that expression.
Your application can add properties to an existing Map event type during runtime using the configuration operation updateMapEventType. Properties may not be updated or deleted - properties can only be added, and nested properties can be added as well. The method accepts a Map structure that may include previously listed properties or can also contain only the newly added fields.
The engine can process java.util.Map events via the sendEvent(Map map, String eventTypeName) method on the EPRuntime interface. Entries in the Map represent event properties. Keys must be of type java.util.String for the engine to be able to look up event property names specified by pattern or EPL statements.
Events can also be represented by objects that implement the java.util.Map interface. Event properties of Map events are the values in the map accessible through the get method exposed by the java.util.Map interface.
The Map event type is a comprehensive type system that can eliminate the need to use Java classes as event types, thereby making it easier to change types at runtime or generate type information from another source.
A given Map event type can have one or more supertypes that must also be Map event types. All properties available on any of the Map supertypes are available on the type itself. In addition, anywhere within EPL that an event type name of a Map supertype is used, any of its Map subtypes and their subtypes match that expression.
Your application can add properties to an existing Map event type during runtime using the configuration operation updateMapEventType. Properties may not be updated or deleted - properties can only be added, and nested properties can be added as well. The runtime configuration also allows removing Map event types and adding them back with new type information.
After your application configures a Map event type by providing a type name, the type name can be used when defining further Map event types by specifying the type name as a property type or an array property type.
One-to-Many relationships in Map event types are represented via arrays. A property in a Map event type may be an array of primitive, an array of Java object or an array of Map.
The engine can process java.util.Map events via the sendEvent(Map map, String eventTypeName) method on the EPRuntime interface. Entries in the Map represent event properties. Keys must be of type java.util.String for the engine to be able to look up event property names specified by pattern or EPL statements.
Map event properties can be of any type. Map event properties that are Java application objects or that are of type java.util.Map (or arrays thereof) offer additional power:
Properties that are Java application objects can be queried via the nested, indexed, mapped and dynamic property syntax as outlined earlier.
Properties that are of type Map allow Maps to be nested arbitrarily deep and thus can be used to represent complex domain information. The nested, indexed, mapped and dynamic property syntax can be used to query Maps within Maps and arrays of Maps within Maps.
In order to use Map events, the event type name and property names and types must be made known to the engine via Configuration. Please see the examples in Section 10.4.2, “Events represented by java.util.Map”.
The code snippet below creates and processes a Map event. It defines a CarLocationUpdateEvent event type first:
Map event = new HashMap();
event.put("carId", carId);
event.put("direction", direction);
epRuntime.sendEvent(event, "CarLocUpdateEvent");The CarLocUpdateEvent can now be used in a statement:
select carId from CarLocUpdateEvent.win:time(1 min) where direction = 1
The engine can also query Java objects as values in a Map event via the nested property syntax. Thus Map events can be used to aggregate multiple data structures into a single event and query the composite information in a convenient way. The example below demonstrates a Map event with a transaction and an account object.
Map event = new HashMap();
event.put("txn", txn);
event.put("account", account);
epRuntime.sendEvent(event, "TxnEvent");An example statement could look as follows.
select account.id, account.rate * txn.amount from TxnEvent.win:time(60 sec) group by account.id
Your Map event type may declare one or more supertypes when configuring the type at engine initialization time or at runtime through the administrative interface.
Supertypes of a Map event type must also be Map event types. All property names and types of a supertype are also available on a subtype and override such same-name properties of the subtype. In addition, anywhere within EPL that an event type name of a Map supertype is used, any of its Map subtypes also matches that expression (similar to the concept of interface in Java).
This example assumes that the BaseUpdate event type has been declared and acts as a supertype to the AccountUpdate event type (both Map event types):
epService.getEPAdministrator().getConfiguration().
addEventType("AccountUpdate", accountUpdateDef,
new String[] {"BaseUpdate"});Your application EPL statements may select BaseUpdate events and receive both BaseUpdate and AccountUpdate events, as well as any other subtypes of BaseUpdate and their subtypes.
// Receive BaseUpdate and any subtypes including subtypes of subtypes select * from BaseUpdate
Your application Map event type may have multiple supertypes. The multiple inheritance hierarchy between Maps can be arbitrarily deep, however cyclic dependencies are not allowed. If using runtime configuration, supertypes must exist before a subtype to a supertype can be added.
See Section 10.4.2, “Events represented by java.util.Map” for more information on configuring Map event types.
Strongly-typed nested Map-within-Map events can be used to build rich, type-safe event types on the fly. Use the addEventType method on Configuration or ConfigurationOperations for initialization-time and runtime-time type definition.
Noteworthy points are:
JavaBean (POJO) objects can appear as properties in Map-within-Map.
One may represent Map-within-Map and Map-Array within Map using the name of a previously registered Map event type.
There is no limit to the number of nesting levels.
Dynamic properties can be used to query Map-within-Map keys that may not be known in advance.
The engine returns a null value for properties for which the access path into the nested structure cannot be followed where map entries do not exist.
For demonstration, in this example our top-level event type is an AccountUpdate event, which has an UpdatedField structure as a property. Inside the UpdatedField structure the example defines various fields, as well as a property by name 'history' that holds a JavaBean class UpdateHistory to represent the update history for the account. The code snippet to define the event type is thus:
Map<String, Object> updatedFieldDef = new HashMap<String, Object>();
updatedFieldDef.put("name", String.class);
updatedFieldDef.put("addressLine1", String.class);
updatedFieldDef.put("history", UpdateHistory.class);
Map<String, Object> accountUpdateDef = new HashMap<String, Object>();
accountUpdateDef.put("accountId", long.class);
accountUpdateDef.put("fields", updatedFieldDef);
epService.getEPAdministrator().getConfiguration().
addEventType("AccountUpdate", accountUpdateDef);The next code snippet populates a sample event and sends the event into the engine:
Map<String, Object> updatedField = new HashMap<String, Object>();
updatedField.put("name", "Joe Doe");
updatedField.put("addressLine1", "40 Popular Street");
updatedField.put("history", new UpdateHistory());
Map<String, Object> accountUpdate = new HashMap<String, Object>();
accountUpdate.put("accountId", 10009901);
accountUpdate.put("fields", updatedField);
epService.getEPRuntime().sendEvent(accountUpdate, "AccountUpdate");Last, a sample query to interrogate AccountUpdate events is as follows:
select accountId, fields.name, fields.addressLine1, fields.history.lastUpdate from AccountUpdate
Note that type information for nested maps is only available to the immediately selecting stream. For example, the second select-query does not work:
insert into MyStream select fields from NestedMapEvent // this does not work ... instead select the individual fields in the insert-into statement select fields.name from MyStream
Your application may declare a Map event type for reuse within other Map event types or for one-to-many properties represented by an array of Maps.
This example declares a Map event type by name AmountCurrency with amount and currency properties:
Map<String, Object> amountAndCurr = new HashMap<String, Object>();
amountAndCurr.put("amount", double.class);
amountAndCurr.put("currency", String.class);
epService.getEPAdministrator().getConfiguration().
addEventType("AmountCurrency", amountAndCurr);The AmountCurrency type is now available for use as a property type itself. Below code snippet declares OrderItem to hold an item number and AmountCurrency:
Map<String, Object> orderItem = new HashMap<String, Object>();
orderItem.put("itemNum", int.class);
orderItem.put("price", "AmountCurrency"); // The property type is the name itself
epService.getEPAdministrator().getConfiguration().
addEventType("OrderItem", orderItem);To model repeated properties within a Map, you may use arrays as properties in a Map. You may use an array of primitive types or an array of JavaBean objects or an array of a previously declared Map event type.
When using a previously declared Map event type as an array property, the literal [] must be appended after the event type name.
This following example defines a Map event type by name Sale to hold array properties of the various types. It assumes a SalesPerson Java class exists and a Map event type by name OrderItem was declared:
Map<String, Object> sale = new HashMap<String, Object>();
sale.put("userids", int[].class);
sale.put("salesPersons", SalesPerson[].class);
sale.put("items", "OrderItem[]"); // The property type is the name itself appended by []
epService.getEPAdministrator().getConfiguration().
addEventType("SaleEvent", sale);The three properties that the above example declares are:
An integer array of user ids.
An array of SalesPerson Java objects.
An array of Maps for order items.
The next EPL statement is a sample query asking for property values held by arrays:
select userids[0], salesPersons[1].name,
items[1], items[1].price.amount from SaleEventEvents can be represented as org.w3c.dom.Node instances and send into the engine via the sendEvent method on EPRuntime or via EventSender. Please note that configuration is required so the event type name and root element name is known. See Chapter 10, Configuration.
If a XML schema document (XSD file) can be made available as part of the configuration, then Esper can read the schema and appropriately present event type metadata and validate statements that use the event type and its properties. See Section 2.7.1, “Schema-Provided XML Events”.
When no XML schema document is provided, XML events can still be queried, however the return type and return values of property expressions are string-only and no event type metadata is available other then for explicitly configured properties. See Section 2.7.2, “No-Schema-Provided XML Events”.
In all cases Esper allows you to configure explicit XPath expressions as event properties. You can specify arbitrary XPath functions or expressions and provide a property name and type by which result values will be available for use in EPL statements. See Section 2.7.3, “Explicitly-Configured Properties”.
Nested, mapped and indexed event properties are also supported in expressions against org.w3c.dom.Node events. Thus XML trees can conveniently be interrogated via the property expression syntax.
This section uses the following XML document as an example:
<?xml version="1.0" encoding="UTF-8"?>
<Sensor xmlns="SensorSchema">
<ID>urn:epc:1:4.16.36</ID>
<Observation Command="READ_PALLET_TAGS_ONLY">
<ID>00000001</ID>
<Tag>
<ID>urn:epc:1:2.24.400</ID>
</Tag>
<Tag>
<ID>urn:epc:1:2.24.401</ID>
</Tag>
</Observation>
</Sensor>The schema for the example is:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="Sensor">
<xs:complexType>
<xs:sequence>
<xs:element name="ID" type="xs:string"/>
<xs:element ref="Observation" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Observation">
<xs:complexType>
<xs:sequence>
<xs:element name="ID" type="xs:string"/>
<xs:element ref="Tag" maxOccurs="unbounded" />
</xs:sequence>
<xs:attribute name="Command" type="xs:string" use="required" />
</xs:complexType>
</xs:element>
<xs:element name="Tag">
<xs:complexType>
<xs:sequence>
<xs:element name="ID" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>If you have a XSD schema document available for your XML events, Esper can interrogate the schema. The benefits are:
New EPL statements that refer to event properties are validated against the types provided in the schema.
Event type metadata becomes available for retrieval as part of the EventType interface.
The engine reads a XSD schema file from an URL you provide. Make sure files imported by the XSD schema file can also be resolved.
The configuration accepts a schema URL. This is a sample code snippet to determine a schema URL from a file in classpath:
URL schemaURL = this.getClass().getClassLoader().getResource("sensor.xsd");Here is a sample use of the runtime configuration API, please see Chapter 10, Configuration for further examples.
epService = EPServiceProviderManager.getDefaultProvider();
ConfigurationEventTypeXMLDOM sensorcfg = new ConfigurationEventTypeXMLDOM();
sensorcfg.setRootElementName("Sensor");
sensorcfg.setSchemaResource(schemaURL.toString());
epService.getEPAdministrator().getConfiguration()
.addEventType("SensorEvent", sensorcfg);You must provide a root element name. This name is used to look up the event type for the sendEvent(org.w3c.Node node) method. An EventSender is a useful alternative method for sending events if the type lookup based on the root or document element name is not desired.
After adding the event type, you may create statements and send events. Next is a sample statement:
select ID, Observation.Command, Observation.ID, Observation.Tag[0].ID, Observation.Tag[1].ID from SensorEvent
As you can see from the example above, property expressions can query property values held in the XML document's elements and attributes.
There are multiple ways to obtain a XML DOM document instance from a XML string. The next code snippet shows how to obtain a XML DOM org.w3c.Document instance:
InputSource source = new InputSource(new StringReader(xml)); DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance(); builderFactory.setNamespaceAware(true); Document doc = builderFactory.newDocumentBuilder().parse(source);
Send the org.w3c.Node or Document object into the engine for processing:
epService.getEPRuntime().sendEvent(doc);
By default, property expressions such as Observation.Tag[0].ID are evaluated by a fast DOM-walker implementation provided by Esper. This DOM-walker implementation is not namespace-aware.
Should you require namespace-aware traversal of the DOM document, you must set the xpath-property-expr configuration option to true (default is false). This flag causes Esper to generate namespace-aware XPath expressions from each property expression instead of the DOM-walker, as described next. Setting the xpath-property-expr option to true requires that you also configure namespace prefixes as described below.
When matching up the property names with the XSD schema information, the engine determines whether the attribute or element provides values. The algorithm checks attribute names first followed by element names. It takes the first match to the specified property name.
By setting the xpath-property-expr option the engine rewrites each property expression as an XPath expression, effectively handing the evaluation over to the underlying XPath implementation available from classpath. Most JVM have a built-in XPath implementation and there are also optimized, fast implementations such as Jaxen that can be used as well.
Set the xpath-property-expr option if you need namespace-aware document traversal, such as when your schema mixes several namespaces and element names are overlapping.
The below table samples several property expressions and the XPath expression generated for each, without namespace prefixes to keep the example simple:
Table 2.5. Property Expression to XPath Expression
| Property Expression | Equivalent XPath |
|---|---|
| Observeration.ID | /Sensor/Observation/ID |
| Observeration.Command | /Sensor/Observation/@Command |
| Observeration.Tag[0].ID | /Sensor/Observation/Tag[position() = 1]/ID |
For mapped properties that are specified via the syntax name('key'), the algorithm looks for an attribute by name id and generates a XPath expression as mapped[@id='key'].
Finally, here is an example that includes all different types of properties and their XPath expression equivalent in one property expression:
select nested.mapped('key').indexed[1].attribute from MyEventThe equivalent XPath expression follows, this time including n0 as a sample namespace prefix:
/n0:rootelement/n0:nested/n0:mapped[@id='key']/n0:indexed[position() = 2]/@attribute
All elements that are unbound or have max occurs greater then 1 in the XSD schema are represented as indexed properties and require an index for resolution.
For example, the following is not a valid property expression in the sample Sensor document: Observeration.Tag.ID. As no index is provided for Tag, the property expression is not valid.
Repeated elements within a parent element in which the repeated element is a simple type also are represented as an array.
Consider the next XML document:
<item>
<book sku="8800090">
<author>Isaac Asimov</author>
<author>Robert A Heinlein</author>
</book>
</item>Here, the result of the expression book.author is an array of type String and the result of book.author[0] is a String value.
Dynamic properties are not validated against the XSD schema information and their result value is always org.w3c.Node. You may use a user-defined function to process dynamic properties returning Node. As an alternative consider using an explicit property.
An example dynamic property is Origin?.ID which will look for an element by name Origin that contains an element or attribute node by name LocationCode:
select Origin?.LocationCode from SensorEvent
When providing a XSD document, the default configuration allows to transpose property values that are themselves complex elements, as defined in the XSD schema, into a new stream. This behavior can be controlled via the flag auto-fragment.
For example, consider the next query:
insert into ObservationStream select ID, Observation from SensorEvent
The Observation as a property of the SensorEvent gets itself inserted into a new stream by name ObservationStream. The ObservationStream thus consists of a string-typed ID property and a complex-typed property named Observation, as described in the schema.
A further statement can use this stream to query:
select Observation.Command, Observation.Tag[0].ID from ObservationStream
Before continuing the discussion, here is an alternative syntax using the wildcard-select, that is also useful:
insert into TagListStream select ID as sensorId, Observation.* from SensorEvent
The new TagListStream has a string-typed ID and Command property as well as an array of Tag properties that are complex types themselves as defined in the schema.
Next is a sample statement to query the new stream:
select sensorId, Command, Tag[0].ID from TagListStream
Please note the following limitations:
The XPath standard prescribes that XPath expressions against org.w3c.Node are evaluated against the owner document of the Node. Therefore XPath is not relative to the current node but absolute against each node's owner document. Since Esper does not create new document instances for transposed nodes, transposing properties is not possible when the xpath-property-expr flag is set.
Complex elements that have both simple element values and complex child elements are not transposed. This is to ensure their property value is not hidden. Use an explicit XPath expression to transpose such properties.
Esper automatically registers a new event type for transposed properties. It generates the type name of the new XML event type from the XML event type name and the property names used in the expression. The synposis is type_name.property_name[.property_name...]. The type name can be looked up, for example for use with EventSender or can be created in advance.
An EventSender sends events into the engine for a given type, saving a type lookup based on element name.
This brief example sends an event via EventSender:
EventSender sender = epRuntime.getEventSender("SensorEvent");
sender.sendEvent(node);The XML DOM event sender checks the root element name before processing the event. Use the event-sender-validates-root setting to disable validation. This forces the engine to process XML documents according to any predefined type without validation of the root element name.
Without a schema document a XML event may still be queried. However there are important differences in the metadata available without a schema document and therefore the property expression results. These differences are outlined below.
All property expressions against a XML type without schema are assumed valid. There is no validation of the property expression other then syntax validation. At runtime, property expressions return string-type values or null if the expression did not yield a matching element or attribute result.
When asked for property names or property metadata, a no-schema type returns empty array.
In all other aspects the type behaves the same as the schema-provided type described earlier.
Regardless of whether or not you provide a XSD schema for the XML event type, you can always fall back to configuring explicit properties that are backed by XPath expressions.
For further documentation on XPath, please consult the XPath standard or other online material. Consider using Jaxen or Apache Axiom, for example, to provide faster XPath evaluation then your Java VM built-in XPath provider may offer.
Shown below is an example configuration that adds an explicit property backed by a XPath expression and that defines namespace prefixes:
epService = EPServiceProviderManager.getDefaultProvider();
ConfigurationEventTypeXMLDOM sensorcfg = new ConfigurationEventTypeXMLDOM();
sensorcfg.addXPathProperty("countTags", "count(/ss:Sensor/ss:Observation/ss:Tag)",
XPathConstants.NUMBER);
sensorcfg.addNamespacePrefix("ss", "SensorSchema");
sensorcfg.setRootElementName("Sensor");
epService.getEPAdministrator().getConfiguration()
.addEventType("SensorEvent", sensorcfg);The countTags property is now available for querying:
select countTags from SensorEvent
The XPath expression count(...) is a XPath built-in function that counts the number of nodes, for the example document the result is 2.
Esper can parse or cast the result of your XPath expression to the desired type. Your property configuration provides the type to cast to, like this:
sensorcfg.addXPathProperty("countTags", "count(/ss:Sensor/ss:Observation/ss:Tag)",
XPathConstants.NUMBER, "int");The type supplied to the property configuration must be one of the built-in types. Arrays of built-in type are also possible, requiring the XPathConstants.NODESET type returned by your XPath expression, as follows:
sensorcfg.addXPathProperty("idarray", "//ss:Tag/ss:ID",
XPathConstants.NODESET, "String[]");The XPath expression //ss:Tag/ss:ID returns all ID nodes under a Tag node, regardless of where in the node tree the element is located. For the example document the result is 2 array elements urn:epc:1:2.24.400 and urn:epc:1:2.24.40.
An explicit property may return XPathConstants.NODE or XPathConstants.NODESET and can provide the event type name of a pre-configured event type for the property. The method name to add such properties is addXPathPropertyFragment.
This code snippet adds two explicit properties and assigns an event type name for each property:
sensorcfg.addXPathPropertyFragment("tagOne", "//ss:Tag[position() = 1]",
XPathConstants.NODE, "TagEvent");
sensorcfg.addXPathPropertyFragment("tagArray", "//ss:Tag",
XPathConstants.NODESET, "TagEvent");The configuration above references the TagEvent event type. This type must also be configured. Prefix the root element name with "//" to cause the lookup to search the nested schema elements for the definition of the type:
ConfigurationEventTypeXMLDOM tagcfg = new ConfigurationEventTypeXMLDOM();
tagcfg.setRootElementName("//Tag");
tagcfg.setSchemaResource(schemaURL);
epAdministrator.getConfiguration()
.addEventType("TagEvent", tagcfg);The tagOne and tagArray properties are now ready for selection and transposing to further streams:
insert into TagOneStream select tagOne.* from SensorEvent // ... select from the new stream ... select ID from TagOneStream
insert into TagArrayStream select tagArray as mytags from SensorEvent // ... select from the new stream ... select mytags[0].ID from TagArrayStream
Part of the extension and plug-in features of Esper is an event representation API. This set of classes allow an application to create new event types and event instances based on information available elsewhere, statically or dynamically at runtime when EPL statements are created. Please see Section 11.5, “Custom Event Representation” for details.
Creating a plug-in event representation can be useful when your application has existing Java classes that carry event metadata and event property values and your application does not want to (or cannot) extract or transform such event metadata and event data into one of the built-in event representations (POJO Java objects, Map or XML DOM).
Further use of a plug-in event representation is to provide a faster or short-cut access path to event data. For example, access to event data stored in a XML format through the Streaming API for XML (StAX) is known to be very efficient. A plug-in event representation can also provide network lookup and dynamic resolution of event type and dynamic sourcing of event instances.
Currently, EsperIO provides the following additional event representations:
Apache Axiom: Streaming API for XML (StAX) implementation
Please see the EsperIO documentation for details on the above.
The chapter on Section 11.5, “Custom Event Representation” explains how to create your own custom event representation.
To summarize, an event is an immutable record of a past occurrence of an action or state change, and event properties contain useful information about an event.
The length of time an event is of interest to the event processing engine (retention time) depends on your EPL statements, and especially the data window, pattern and output rate limiting clauses of your statements.
During the retention time of an event more information about the event may become available, such as additional properties or changes to existing properties. Esper provides the concept of revision event type to handle such situations. A revision event type instructs the engine that new versions of an existing event are to be processed that update, amend or add to an existing event.
Revision event types work together with the named window feature of the engine: A named window provides an explicit expiration policy for event instances that can be updated, thereby allowing the engine to retain only those versions that are relevant to any retained event and discarding updates to events not retained.
Please see Section 4.17.8, “Updating and Versioning Events in Named Windows” for further instructions and examples.
Note that patterns do not reflect changes to past events. For the temporal nature of patterns, any changes to events that were observed in the past do not reflect upon current pattern state.
Your application events may consist of fairly comprehensive, coarse-grained structures or documents. For example in business-to-business integration scenarios, XML documents or other event objects can be rich deeply-nested graphs of event properties.
To extract information from a coarse-grained event or to perform bulk operations on the rows of the property graph in an event, Esper provides a convenient syntax: When specifying a filter expression in a pattern or in a select clause, it may contain a contained-event selection syntax, as further described in Section 4.19, “Contained-Event Selection”.
The Esper processing model is continuous: Update listeners and/or subscribers to statements receive updated data as soon as the engine processes events for that statement, according to the statement's choice of event streams, views, filters and output rates.
As outlined in Chapter 9, API Reference the interface for listeners is com.espertech.esper.client.UpdateListener. Implementations must provide a single update method that the engine invokes when results become available:

A second, strongly-typed and native, highly-performant method of result delivery is provided: A subscriber object is a direct binding of query results to a Java object. The object, a POJO, receives statement results via method invocation. The subscriber class need not implement an interface or extend a superclass. Please see Section 9.3.3, “Setting a Subscriber Object”.
The engine provides statement results to update listeners by placing results in com.espertech.esper.client.EventBean instances. A typical listener implementation queries the EventBean instances via getter methods to obtain the statement-generated results.

The get method on the EventBean interface can be used to retrieve result columns by name. The property name supplied to the get method can also be used to query nested, indexed or array properties of object graphs as discussed in more detail in Chapter 2, Event Representations and Section 9.5, “Event and Event Type”
The getUnderlying method on the EventBean interface allows update listeners to obtain the underlying event object. For wildcard selects, the underlying event is the event object that was sent into the engine via the sendEvent method. For joins and select clauses with expressions, the underlying object implements java.util.Map.
In this section we look at the output of a very simple EPL statement. The statement selects an event stream without using a data window and without applying any filtering, as follows:
select * from Withdrawal
This statement selects all Withdrawal events. Every time the engine processes an event of type Withdrawal or any sub-type of Withdrawal, it invokes all update listeners, handing the new event to each of the statement's listeners.
The term insert stream denotes the new events arriving, and entering a data window or aggregation. The insert stream in this example is the stream of arriving Withdrawal events, and is posted to listeners as new events.
The diagram below shows a series of Withdrawal events 1 to 6 arriving over time. The number in parenthesis is the withdrawal amount, an event property that is used in the examples that discuss filtering.
The example statement above results in only new events and no old events posted by the engine to the statement's listeners.
A length window instructs the engine to only keep the last N events for a stream. The next statement applies a length window onto the Withdrawal event stream. The statement serves to illustrate the concept of data window and events entering and leaving a data window:
select * from Withdrawal.win:length(5)
The size of this statement's length window is five events. The engine enters all arriving Withdrawal events into the length window. When the length window is full, the oldest Withdrawal event is pushed out the window. The engine indicates to listeners all events entering the window as new events, and all events leaving the window as old events.
While the term insert stream denotes new events arriving, the term remove stream denotes events leaving a data window, or changing aggregation values. In this example, the remove stream is the stream of Withdrawal events that leave the length window, and such events are posted to listeners as old events.
The next diagram illustrates how the length window contents change as events arrive and shows the events posted to an update listener.
As before, all arriving events are posted as new events to listeners. In addition, when event W1 leaves the length window on arrival of event W6, it is posted as an old event to listeners.
Similar to a length window, a time window also keeps the most recent events up to a given time period. A time window of 5 seconds, for example, keeps the last 5 seconds of events. As seconds pass, the time window actively pushes the oldest events out of the window resulting in one or more old events posted to update listeners.
Note: By default the engine only delivers the insert stream to listeners and observers. EPL supports optional istream, irstream and rstream keywords on select-clauses and on insert-into clauses to control which stream to deliver, see Section 4.3.7, “Selecting insert and remove stream events”. There is also a related, engine-wide configuration setting described in Section 10.4.14, “Engine Settings related to Stream Selection”.
Filters to event streams allow filtering events out of a given stream before events enter a data window. The statement below shows a filter that selects Withdrawal events with an amount value of 200 or more.
select * from Withdrawal(amount>=200).win:length(5)
With the filter, any Withdrawal events that have an amount of less then 200 do not enter the length window and are therefore not passed to update listeners. Filters are discussed in more detail in Section 4.4.1, “Filter-based Event Streams” and Section 5.4, “Filter Expressions In Patterns”.
The where-clause and having-clause in statements eliminate potential result rows at a later stage in processing, after events have been processed into a statement's data window or other views.
The next statement applies a where-clause to Withdrawal events. Where-clauses are discussed in more detail in Section 4.5, “Specifying Search Conditions: the Where Clause”.
select * from Withdrawal.win:length(5) where amount >= 200
The where-clause applies to both new events and old events. As the diagram below shows, arriving events enter the window however only events that pass the where-clause are handed to update listeners. Also, as events leave the data window, only those events that pass the conditions in the where-clause are posted to listeners as old events.
The where-clause can contain complex conditions while event stream filters are more restrictive in the type of filters that can be specified. The next statement's where-clause applies the ceil function of the java.lang.Math Java library class in the where clause. The insert-into clause makes the results of the first statement available to the second statement:
insert into WithdrawalFiltered select * from Withdrawal where Math.ceil(amount) >= 200 select * from WithdrawalFiltered
In this section we explain the output model of statements employing a time window view and a time batch view.
A time window is a moving window extending to the specified time interval into the past based on the system time. Time windows enable us to limit the number of events considered by a query, as do length windows.
As a practical example, consider the need to determine all accounts where the average withdrawal amount per account for the last 4 seconds of withdrawals is greater then 1000. The statement to solve this problem is shown below.
select account, avg(amount) from Withdrawal.win:time(4 sec) group by account having amount > 1000
The next diagram serves to illustrate the functioning of a time window. For the diagram, we assume a query that simply selects the event itself and does not group or filter events.
select * from Withdrawal.win:time(4 sec)
The diagram starts at a given time t and displays the contents of the time window at t + 4 and t + 5 seconds and so on.
The activity as illustrated by the diagram:
At time t + 4 seconds an event W1 arrives and enters the time window. The engine reports the new event to update listeners.
At time t + 5 seconds an event W2 arrives and enters the time window. The engine reports the new event to update listeners.
At time t + 6.5 seconds an event W3 arrives and enters the time window. The engine reports the new event to update listeners.
At time t + 8 seconds event W1 leaves the time window. The engine reports the event as an old event to update listeners.
The time batch view buffers events and releases them every specified time interval in one update. Time windows control the evaluation of events, as does the length batch window.
The next diagram serves to illustrate the functioning of a time batch view. For the diagram, we assume a simple query as below:
select * from Withdrawal.win:time_batch(4 sec)
The diagram starts at a given time t and displays the contents of the time window at t + 4 and t + 5 seconds and so on.
The activity as illustrated by the diagram:
At time t + 1 seconds an event W1 arrives and enters the batch. No call to inform update listeners occurs.
At time t + 3 seconds an event W2 arrives and enters the batch. No call to inform update listeners occurs.
At time t + 4 seconds the engine processes the batched events and a starts a new batch. The engine reports events W1 and W2 to update listeners.
At time t + 6.5 seconds an event W3 arrives and enters the batch. No call to inform update listeners occurs.
At time t + 8 seconds the engine processes the batched events and a starts a new batch. The engine reports the event W3 as new data to update listeners. The engine reports the events W1 and W2 as old data (prior batch) to update listeners.
The built-in data windows that act on batches of events are the win:time_batch and the win:length_batch views. The win:time_batch data window collects events arriving during a given time interval and posts collected events as a batch to listeners at the end of the time interval. The win:length_batch data window collects a given number of events and posts collected events as a batch to listeners when the given number of events has collected.
Let's look at how a time batch window may be used:
select account, amount from Withdrawal.win:time_batch(1 sec)
The above statement collects events arriving during a one-second interval, at the end of which the engine posts the collected events as new events (insert stream) to each listener. The engine posts the events collected during the prior batch as old events (remove stream). The engine starts posting events to listeners one second after it receives the first event and thereon.
For statements containing aggregation functions and/or a group by clause, the engine posts consolidated aggregation results for an event batch. For example, consider the following statement:
select sum(amount) as mysum from Withdrawal.win:time_batch(1 sec)
Note that output rate limiting also generates batches of events following the output model as discussed here.
Statements that aggregate events via aggregation functions also post remove stream events as aggregated values change.
Consider the following statement that alerts when 2 Withdrawal events have been received:
select count(*) as mycount from Withdrawal having count(*) = 2
When the engine encounters the second withdrawal event, the engine posts a new event to update listeners. The value of the "mycount" property on that new event is 2. Additionally, when the engine encounters the third Withdrawal event, it posts an old event to update listeners containing the prior value of the count. The value of the "mycount" property on that old event is also 2.
The istream or rstream keyword can be used to eliminate either new events or old events posted to listeners. The next statement uses the istream keyword causing the engine to call the listener only once when the second Withdrawal event is received:
select istream count(*) as mycount from Withdrawal having count(*) = 2
Following SQL (Standard Query Language) standards for queries against relational databases, the presence or absence of aggregation functions and the presence or absence of the group by clause dictates the number of rows posted by the engine to listeners. The next sections outline the output model for batched events under aggregation and grouping. The examples also apply to data windows that don't batch events and post results continously as events arrive or leave data windows. The examples also apply to patterns providing events when a complete pattern matches.
In summary, as in SQL, if your query only selects aggregation values, the engine provides one row of aggregated values. It provides that row every time the aggregation is updated (insert stream), which is when events arrive or a batch of events gets processed, and when the events leave a data window or a new batch of events arrives. The remove stream then consists of prior aggregation values.
Also as in SQL, if your query selects non-aggregated values along with aggregation values in the select clause, the engine provides a row per event. The insert stream then consists of the aggregation values at the time the event arrives, while the remove stream is the aggregation value at the time the event leaves a data window, if any is defined in your query.
The documentation provides output examples for query types in Appendix A, Output Reference and Samples, and the next sections outlines each query type.
An example statement for the un-aggregated and un-grouped case is as follows:
select * from Withdrawal.win:time_batch(1 sec)
At the end of a time interval, the engine posts to listeners one row for each event arriving during the time interval.
The appendix provides a complete example including input and output events over time at Section A.2, “Output for Un-aggregated and Un-grouped Queries”
If your statement only selects aggregation values and does not group, your statement may look as the example below:
select sum(amount) from Withdrawal.win:time_batch(1 sec)
At the end of a time interval, the engine posts to listeners a single row indicating the aggregation result. The aggregation result aggregates all events collected during the time interval.
The appendix provides a complete example including input and output events over time at Section A.3, “Output for Fully-aggregated and Un-grouped Queries”
If your statement selects non-aggregated properties and aggregation values, and does not group, your statement may be similar to this statement:
select account, sum(amount) from Withdrawal.win:time_batch(1 sec)
At the end of a time interval, the engine posts to listeners one row per event. The aggregation result aggregates all events collected during the time interval.
The appendix provides a complete example including input and output events over time at Section A.4, “Output for Aggregated and Un-grouped Queries”
If your statement selects aggregation values and all non-aggregated properties in the select clause are listed in the group by clause, then your statement may look similar to this example:
select account, sum(amount) from Withdrawal.win:time_batch(1 sec) group by account
At the end of a time interval, the engine posts to listeners one row per unique account number. The aggregation result aggregates per unique account.
The appendix provides a complete example including input and output events over time at Section A.5, “Output for Fully-aggregated and Grouped Queries”
If your statement selects non-aggregated properties and aggregation values, and groups only some properties using the group by clause, your statement may look as below:
select account, accountName, sum(amount) from Withdrawal.win:time_batch(1 sec) group by account
At the end of a time interval, the engine posts to listeners one row per event. The aggregation result aggregates per unique account.
The appendix provides a complete example including input and output events over time at Section A.6, “Output for Aggregated and Grouped Queries”
The Event Processing Language (EPL) is a SQL-like language with SELECT, FROM, WHERE, GROUP BY, HAVING and ORDER BY clauses. Streams replace tables as the source of data with events replacing rows as the basic unit of data. Since events are composed of data, the SQL concepts of correlation through joins, filtering and aggregation through grouping can be effectively leveraged. The INSERT INTO clause is recast as a means of forwarding events to other streams for further downstream processing. External data accessible through JDBC may be queried and joined with the stream data. Additional clauses such as the PATTERN and OUTPUT clauses are also available to provide the missing SQL language constructs specific to event processing.
EPL statements are used to derive and aggregate information from one or more streams of events, and to join or merge event streams. This section outlines EPL syntax. It also outlines the built-in views, which are the building blocks for deriving and aggregating information from event streams.
EPL statements contain definitions of one or more views. Similar to tables in a SQL statement, views define the data available for querying and filtering. Some views represent windows over a stream of events. Other views derive statistics from event properties, group events or handle unique event property values. Views can be staggered onto each other to build a chain of views. The Esper engine makes sure that views are reused among EPL statements for efficiency.
The built-in set of views is:
Data window views: win:length, win:length_batch, win:time, win:time_batch, win:time_length_batch, win:time_accum, win:ext_timed, ext:sort_window, ext:time_order, std:unique, std:groupby, std:lastevent, std:firstevent, std:firstunique, win:firstlength, win:firsttime.
Views that derive statistics: std:size, stat:uni, stat:linest, stat:correl, stat:weighted_avg.
EPL provides the concept of named window. Named windows are data windows that can be inserted-into and deleted-from by one or more statements, and that can queried by one or more statements. Named windows have a global character, being visible and shared across an engine instance beyond a single statement. Use the CREATE WINDOW clause to create named windows. Use the INSERT INTO clause to insert data into a named window, the ON DELETE clause to remove events from a named window, and the ON SELECT clause to perform a non-continuous fire-once query on a named window. Finally, the name of the named window can occur in a statement's FROM clause to query a named window or include the named window in a join or subquery.
Variables can come in handy to parameterize statements and change parameters on-the-fly and in response to events. Variables can be used in an expression anywhere in a statement as well as in the output clause for dynamic control of output rates.
Esper can be extended by plugging-in custom developed views and aggregation functions.
EPL queries are created and stored in the engine, and publish results to listeners as events are received by the engine or timer events occur that match the criteria specified in the query. Events can also be obtained from running EPL queries via the safeIterator and iterator methods that provide a pull-data API.
The select clause in an EPL query specifies the event properties or events to retrieve. The from clause in an EPL query specifies the event stream definitions and stream names to use. The where clause in an EPL query specifies search conditions that specify which event or event combination to search for. For example, the following statement returns the average price for IBM stock ticks in the last 30 seconds.
select avg(price) from StockTick.win:time(30 sec) where symbol='IBM'
EPL queries follow the below syntax. EPL queries can be simple queries or more complex queries. A simple select contains only a select clause and a single stream definition. Complex EPL queries can be build that feature a more elaborate select list utilizing expressions, may join multiple streams, may contain a where clause with search conditions and so on.
[insert into insert_into_def] select select_list from stream_def [as name] [, stream_def [as name]] [,...] [where search_conditions] [group by grouping_expression_list] [having grouping_search_conditions] [output output_specification] [order by order_by_expression_list] [limit num_rows]
Time-based windows as well as pattern observers and guards take a time period as a parameter. Time periods follow the syntax below.
time-period : [day-part] [hour-part] [minute-part] [seconds-part] [milliseconds-part] day-part : (number|variable_name) ("days" | "day") hour-part : (number|variable_name) ("hours" | "hour") minute-part : (number|variable_name) ("minutes" | "minute" | "min") seconds-part : (number|variable_name) ("seconds" | "second" | "sec") milliseconds-part : (number|variable_name) ("milliseconds" | "millisecond" | "msec")
Some examples of time periods are:
10 seconds 10 minutes 30 seconds 20 sec 100 msec 1 day 2 hours 20 minutes 15 seconds 110 milliseconds 0.5 minutes
Variable names and substitution parameters '?' for prepared statements are also allowed as part of a time period expression.
Comments can appear anywhere in the EPL or pattern statement text where whitespace is allowed. Comments can be written in two ways: slash-slash (// ...) comments and slash-star (/* ... */) comments.
Slash-slash comments extend to the end of the line:
// This comment extends to the end of the line. // Two forward slashes with no whitespace between them begin such comments. select * from MyEvent // this is a slash-slash comment // All of this text together is a valid statement.
Slash-star comments can span multiple lines:
/* This comment is a "slash-star" comment that spans multiple lines. * It begins with the slash-star sequence with no space between the '/' and '*' characters. * By convention, subsequent lines can begin with a star and are aligned, but this is * not required. */ select * from MyEvent /* this also works */
Comments styles can also be mixed:
select field1, // first comment /* second comment*/ field2 from MyEvent
Certain words such as select, delete or set are reserved and may not be used as identifiers. Please consult Appendix B, Reserved Keywords for the list of reserved keywords and permitted keywords.
Names of built-in functions and certain auxiliary keywords are permitted as event property names and in the rename syntax of the select clause. For example, count is acceptable.
Consider the example below, which assumes that 'last' is an event property of MyEvent:
// valid select last, count(*) as count from MyEvent
This example shows an incorrect use of a reserved keyword:
// invalid select insert from MyEvent
EPL offers an escape syntax for reserved keywords: Event properties as well as event or stream names may be escaped via the backwards apostrophe ` (ASCII 96) character.
The next example queries an event type by name Order (a reserved keyword) that provides a property by name insert (a reserved keyword):
// valid select `insert` from `Order`
EPL honors all Java built-in primitive and boxed types, including java.math.BigInteger and java.math.BigDecimal.
EPL also follows Java standards in terms of widening, performing widening automatically in cases where widening type conversion is allowed without loss of precision, for both boxed and primitive types and including BigInteger and BigDecimal:
byte to short, int, long, float, double, BigInteger or BigDecimal
short to int, long, float, or double, BigInteger or BigDecimal
char to int, long, float, or double, BigInteger or BigDecimal
int to long, float, or double, BigInteger or BigDecimal
long to float or double, BigInteger or BigDecimal
float to double or BigDecimal
double to BigDecimal
In cases where loss of precision is possible because of narrowing requirements, EPL compilation outputs a compilation error.
EPL supports casting via the cast function.
EPL returns double-type values for division regardless of operand type. EPL can also be configured to follow Java rules for integer arithmetic instead as described in Section 10.4.18, “Engine Settings related to Expression Evaluation”.
Division by zero returns positive or negative infinity. Division by zero can be configured to return null instead.
An EPL constant is a number or a character string that indicates a fixed value. Constants can be used as expressions in many EPL statements, including variable assignment and case-when statements. They can also be used as parameter values for many built-in objects and clauses. Constants are also called literals.
EPL supports the standard SQL constant notation as well as Java data type literals.
The following are types of EPL constants:
Table 4.1. Types of EPL constants
| Type | Description | Examples |
|---|---|---|
| string | A single character to an unlimited number of characters. Valid delimiters are the single quote (') or double quote ("). | select 'volume' as field1, "sleep" as field2, "\u0041" as unicodeA |
| boolean | A boolean value. | select true as field1, false as field2 |
| integer | An integer value (4 byte). | select 1 as field1, -1 as field2, 1e2 as field3 |
| long | A long value (8 byte). Use the "L" or "l" (lowercase L) suffix. | select 1L as field1, 1l as field2 |
| double | A double-precision 64-bit IEEE 754 floating point. | select 1.67 as field1, 167e-2 as field2, 1.67d as field3 |
| float | A single-precision 32-bit IEEE 754 floating point. Use the "f" suffix. | select 1.2f as field1, 1.2F as field2 |
| byte | A 8-bit signed two's complement integer. | select 0x10 as field1 |
EPL does not have a single-byte character data type for its literals. Single character literals are treated as string.
Internal byte representation and boundary values of constants follow the Java standard.
EPL automatically performs widening of numbers to BigInteger and BigDecimal as required, and employs the respective equals, compareTo and arithmetic methods provided by BigInteger and BigDecimal.
To explicitly create BigInteger and BigDecimal constants in EPL, please use the cast syntax : cast(value, BigInteger).
Note that since BigDecimal.valueOf(1.0) is not the same as BigDecimal.valueOf(1) (in terms of equality through equals), care should be taken towards the consistent use of scale.
When using aggregation functions for BigInteger and BigDecimal values, please note these limitations:
The median, stddev and avedev aggregation functions operate on the double value of the object and return a double value.
All other aggregation functions return BigDecimal or BigInteger values (except count).
The select clause is required in all EPL statements. The select clause can be used to select all properties via the wildcard *, or to specify a list of event properties and expressions. The select clause defines the event type (event property names and types) of the resulting events published by the statement, or pulled from the statement via the iterator methods.
The select clause also offers optional istream, irstream and rstream keywords to control whether input stream, remove stream or input and remove stream events are posted to UpdateListener instances and observers to a statement. By default, the engine provides only the insert stream to listener and observers. See Section 10.4.14, “Engine Settings related to Stream Selection” on how to change the default.
The syntax for the select clause is summarized below.
select [istream | irstream | rstream] * | expression_list ... The istream keyword is the default, and indicates that the engine only delivers insert stream events to listeners and observers. The irstream keyword indicates that the engine delivers both insert and remove stream. Finally, the rstream keyword tells the engine to deliver only the remove stream.
The syntax for selecting all event properties in a stream is:
select * from stream_defThe following statement selects StockTick events for the last 30 seconds of IBM stock ticks.
select * from StockTick(symbol='IBM').win:time(30 sec)
The * wildcard and expressions can also be combined in a select clause. The combination selects all event properties and in addition the computed values as specified by any additional expressions that are part of the select clause. Here is an example that selects all properties of stock tick events plus a computed product of price and volume that the statement names 'pricevolume':
select *, price * volume as pricevolume from StockTick(symbol='IBM')
When using wildcard (*), Esper does not actually copy your event properties out of your event or events. It simply wraps your native type in an EventBean interface. Your application has access to the underlying event object through the getUnderlying method and has access to the property values through the get method.
In a join statement, using the select * syntax selects one event property per stream to hold the event for that stream. The property name is the stream name in the from clause.
To choose the particular event properties to return:
select event_property [, event_property] [, ...] from stream_def
The following statement simply selects the symbol and price properties of stock ticks, and the total volume for stock tick events in a 60-second time window.
select symbol, price, sum(volume) from StockTick(symbol='IBM').win:time(60 sec)
The following statement declares a further view onto the event stream of stock ticks: the univariate statistics view (stat:uni). The statement selects the properties that this view derives from the stream, for the last 100 events of IBM stock ticks in the length window.
select datapoints, total, average, variance, stdev, stdevpa from StockTick(symbol='IBM').win:length(100).stat:uni(volume)
The select clause can contain one or more expressions.
select expression [, expression] [, ...] from stream_def
The following statement selects the volume multiplied by price for a time batch of the last 30 seconds of stock tick events.
select volume * price from StockTick.win:time_batch(30 sec)
Event properties and expressions can be renamed using below syntax.
select [event_property | expression] as identifier [, ...]
The following statement selects volume multiplied by price and specifies the name volPrice for the resulting column.
select volume * price as volPrice from StockTick.win:length(100)
Identifiers cannot contain the "." (dot) character, i.e. "vol.price" is not a valid identifier for the rename syntax.
If your statement is joining multiple streams, your may specify property names that are unique among the joined streams, or use wildcard (*) as explained earlier.
In case the property name in your select or other clauses is not unique considering all joined streams, you will need to use the name of the stream as a prefix to the property.
This example is a join between the two streams StockTick and News, respectively named as 'tick' and 'news'. The example selects from the StockTick event the symbol value using the 'tick' stream name as a prefix:
select tick.symbol from StockTick.win:time(10) as tick, News.win:time(10) as news
Use the wildcard (*) selector in a join to generate a property for each stream, with the property value being the event itself. The output events of the statement below have two properties: the 'tick' property holds the StockTick event and the 'news' property holds the News event:
select * from StockTick.win:time(10) as tick, News.win:time(10) as news
The following syntax can also be used to specify what stream's properties to select:
select stream_name.* [as name] from ...
The selection of tick.* selects the StockTick stream events only:
select tick.* from StockTick.win:time(10) as tick, News.win:time(10) as news where tick.symbol = news.symbol
The next example uses the as keyword to name each stream's joined events. This instructs the engine to create a property for each named event:
select tick.* as stocktick, news.* as news from StockTick.win:time(10) as tick, News.win:time(10) as news where stock.symbol = news.symbol
The output events of the above example have two properties 'stocktick' and 'news' that are the StockTick and News events.
The stream name itself, as further described in Section 4.4.5, “Using the Stream Name”, may be used within expressions or alone.
This example passes events to a user-defined function named compute and also shows insert-into to populate an event stream of combined events:
insert into TickNewStream select tick, news, MyLib.compute(news, tick) as result from StockTick.win:time(10) as tick, News.win:time(10) as news where tick.symbol = news.symbol // second statement that uses the TickNewStream stream select tick.price, news.text, result from TickNewStream
In summary, the stream_name.* streamname wildcard syntax can be used to select a stream as the underlying event or as a property, but cannot appear within an expression. While the stream_name syntax (without wildcard) always selects a property (and not as an underlying event), and can occur anywhere within an expression.
If your statement employs pattern expressions, then your pattern expression tags events with a tag name. Each tag name becomes available for use as a property in the select clause and all other clauses.
For example, here is a very simple pattern that matches on every StockTick event received within 30 seconds after start of the statement. The sample selects the symbol and price properties of the matching events:
select tick.symbol as symbol, tick.price as price from pattern[every tick=StockTick where timer:within(10 sec)]
The use of the wildcard selector, as shown in the next statement, creates a property for each tagged event in the output. The next statement outputs events that hold a single 'tick' property whose value is the event itself:
select * from pattern[every tick=StockTick where timer:within(10 sec)]
You may also select the matching event itself using the tick.* syntax. The engine outputs the StockTick event itself to listeners:
select tick.* from pattern[every tick=StockTick where timer:within(10 sec)]
A tag name as specified in a pattern is a valid expression itself. This example uses the insert into clause to make available the events matched by a pattern to further statements:
// make a new stream of ticks and news available
insert into StockTickAndNews
select tick, news from pattern [every tick=StockTick -> news=News(symbol=tick.symbol)]
// second statement to select from the stream of ticks and news
select tick.symbol, tick.price, news.text from StockTickAndNewsThe optional istream, irstream and rstream keywords in the select clause control the event streams posted to listeners and observers to a statement.
If neither keyword is specified, and in the default engine configuration, the engine posts only insert stream events via the newEvents parameter to the update method of UpdateListener instances listening to the statement. The engine does not post remove stream events, by default.
The insert stream consists of the events entering the respective window(s) or stream(s) or aggregations, while the remove stream consists of the events leaving the respective window(s) or the changed aggregation result. See Chapter 3, Processing Model for more information on insert and remove streams.
The engine posts remove stream events to the oldEvents parameter of the update method only if either the irstream or the rstream keyword occurs in the select clause. This behavior can be changed via engine-wide configuration as described in Section 10.4.14, “Engine Settings related to Stream Selection”.
By specifying the istream keyword you can instruct the engine to only post insert stream events via the newEvents parameter to the update method on listeners. The engine will then not post any remove stream events, and the oldEvents parameter is always a null value.
By specifying the irstream keyword you can instruct the engine to post both insert stream and remove stream events.
By specifying the rstream keyword you can instruct the engine to only post remove stream events via the newEvents parameter to the update method on listeners. The engine will then not post any insert stream events, and the oldEvents parameter is also always a null value.
The following statement selects only the events that are leaving the 30 second time window.
select rstream * from StockTick.win:time(30 sec)
The istream and rstream keywords in the select clause are matched by same-name keywords available in the insert into clause. While the keywords in the select clause control the event stream posted to listeners to the statement, the same keywords in the insert into clause specify the event stream that the engine makes available to other statements.
Property or column names can optionally be qualified by a stream name and the provider URI. The syntax is:
[[provider_URI.]stream_name.]property_name
The provider_URI is the URI supplied to the EPServiceProviderManager class, or the string default for the default provider.
This example assumes the provider is the default provider:
select MyEvent.myProperty from MyEvent // ... equivalent to ... select default.MyEvent.myProperty from MyEvent
Stream names can also be qualified by the provider URI. The syntax is:
[provider_URI.]stream_name
The next example assumes a provider URI by name of Processor:
select Processor.MyEvent.myProperty from Processor.MyEvent
The from clause is required in all EPL statements. It specifies one or more event streams or named windows. Each event stream or named window can optionally be given a name by means of the as keyword.
from stream_def [as name] [unidirectional] [retain-union | retain-intersection] [, stream_def [as stream_name]] [, ...]
The event stream definition stream_def as shown in the syntax above can consists of either a filter-based event stream definition or a pattern-based event stream definition.
For joins and outer joins, specify two or more event streams. Joins between pattern-based and filter-based event streams are also supported. Joins and the unidirectional keyword are described in more detail in Section 4.11, “Joining Event Streams”.
Esper supports joins against relational databases for access to historical or reference data as explained in Section 4.15, “Accessing Relational Data via SQL”. Esper can also join results returned by an arbitrary method invocation, as discussed in Section 4.16, “Accessing Non-Relational Data via Method Invocation”.
The stream_name is an optional identifier assigned to the stream. The stream name can itself occur in any expression and provides access to the event itself from the named stream. Also, a stream name may be combined with a method name to invoke instance methods on events of that stream.
For all streams with the exception of historical sources your query may employ data window views as outlined below. The retain-intersection (the default) and retain-union keywords build a union or intersection of two or more data windows as described in Section 4.4.4, “Multiple Data Window Views”.
The stream_def syntax for a filter-based event stream is as below:
event_stream_name [(filter_criteria)] [contained_selection] [.view_spec] [.view_spec] [...]
The event_stream_name is either the name of an event type or name of an event stream populated by an insert into statement or the name of a named window.
The filter_criteria is optional and consists of a list of expressions filtering the events of the event stream, within parenthesis after the event stream name.
The contained_selection is optional and is for use with coarse-grained events that have properties that are themselves one or more events, see Section 4.19, “Contained-Event Selection” for the synopsis and examples.
The view_spec are optional view specifications, which are combinable definitions for retaining events and for deriving information from events.
The following EPL statement shows event type, filter criteria and views combined in one statement. It selects all event properties for the last 100 events of IBM stock ticks for volume. In the example, the event type is the fully qualified Java class name org.esper.example.StockTick. The expression filters for events where the property symbol has a value of "IBM". The optional view specifications for deriving data from the StockTick events are a length window and a view for computing statistics on volume. The name for the event stream is "volumeStats".
select * from org.esper.example.StockTick(symbol='IBM').win:length(100).stat:uni(volume) as volumeStats
Esper filters out events in an event stream as defined by filter criteria before it sends events to subsequent views. Thus, compared to search conditions in a where clause, filter criteria remove unneeded events early. In the above example, events with a symbol other then IBM do not enter the time window.
The simplest form of filter is a filter for events of a given type without any conditions on the event property values. This filter matches any event of that type regardless of the event's properties. The example below is such a filter.
select * from com.mypackage.myevents.RfidEvent
Instead of the fully-qualified Java class name any other event name can be mapped via Configuration to a Java class, making the resulting statement more readable:
select * from RfidEvent
Interfaces and superclasses are also supported as event types. In the below example IRfidReadable is an interface class.
select * from org.myorg.rfid.IRfidReadable
The filtering criteria to filter for events with certain event property values are placed within parenthesis after the event type name:
select * from RfidEvent(category="Perishable")
All expressions can be used in filters, including static methods that return a boolean value:
select * from com.mycompany.RfidEvent(MyRFIDLib.isInRange(x, y) or (x < 0 and y < 0))
Filter expressions can be separated via a single comma ','. The comma represents a logical AND between filter expressions:
select * from RfidEvent(zone=1, category=10) ...is equivalent to... select * from RfidEvent(zone=1 and category=10)
The following operators are highly optimized through indexing and are the preferred means of filtering in high-volume event streams:
equals =
not equals !=
comparison operators < , > , >=, <=
ranges
use the between keyword for a closed range where both endpoints are included
use the in keyword and round () or square brackets [] to control how endpoints are included
for inverted ranges use the not keyword and the between or in keywords
list-of-values checks using the in keyword or the not in keywords followed by a comma-separated list of values
At compile time as well as at run time, the engine scans new filter expressions for sub-expressions that can be indexed. Indexing filter values to match event properties of incoming events enables the engine to match incoming events faster. The above list of operators represents the set of operators that the engine can best convert into indexes. The use of comma or logical and in filter expressions does not impact optimizations by the engine.
Ranges come in the following 4 varieties. The use of round () or square [] bracket dictates whether an endpoint is included or excluded. The low point and the high-point of the range are separated by the colon : character.
Open ranges that contain neither endpoint (low:high)
Closed ranges that contain both endpoints [low:high]. The equivalent 'between' keyword also defines a closed range.
Half-open ranges that contain the low endpoint but not the high endpoint [low:high)
Half-closed ranges that contain the high endpoint but not the low endpoint (low:high]
The next statement shows a filter specifying a range for x and y values of RFID events. The range includes both endpoints therefore uses [] hard brackets.
mypackage.RfidEvent(x in [100:200], y in [0:100])
The between keyword is equivalent for closed ranges. The same filter using the between keyword is:
mypackage.RfidEvent(x between 100 and 200, y between 0 and 50)
The not keyword can be used to determine if a value falls outside a given range:
mypackage.RfidEvent(x not in [0:100])
The equivalent statement using the between keyword is:
mypackage.RfidEvent(x not between 0 and 100)
The in keyword for filter criteria determines if a given value matches any value in a list of values.
In this example we are interested in RFID events where the category matches any of the given values:
mypackage.RfidEvent(category in ('Perishable', 'Container'))By using the not in keywords we can filter events with a property value that does not match any of the values in a list of values:
mypackage.RfidEvent(category not in ('Household', 'Electrical'))The following restrictions apply to filter criteria:
Range and comparison operators require the event property to be of a numeric type.
Aggregation functions are not allowed within filter expressions.
The prev previous event function and the prior prior event function cannot be used in filter expressions.
Event pattern expressions can also be used to specify one or more event streams in an EPL statement. For pattern-based event streams, the event stream definition stream_def consists of the keyword pattern and a pattern expression in brackets []. The syntax for an event stream definition using a pattern expression is below. As in filter-based event streams, an optional list of views that derive data from the stream can be supplied.
pattern [pattern_expression] [.view_spec] [.view_spec] [...]
The next statement specifies an event stream that consists of both stock tick events and trade events. The example tags stock tick events with the name "tick" and trade events with the name "trade".
select * from pattern [every tick=StockTickEvent or every trade=TradeEvent]
This statement generates an event every time the engine receives either one of the event types. The generated events resemble a map with "tick" and "trade" keys. For stock tick events, the "tick" key value is the underlying stock tick event, and the "trade" key value is a null value. For trade events, the "trade" key value is the underlying trade event, and the "tick" key value is a null value.
Lets further refine this statement adding a view the gives us the last 30 seconds of either stock tick or trade events. Lets also select prices and a price total.
select tick.price as tickPrice, trade.price as tradePrice,
sum(tick.price) + sum(trade.price) as total
from pattern [every tick=StockTickEvent or every trade=TradeEvent].win:time(30 sec)Note that in the statement above tickPrice and tradePrice can each be null values depending on the event processed. Therefore, an aggregation function such as sum(tick.price + trade.price)) would always return null values as either of the two price properties are always a null value for any event matching the pattern. Use the coalesce function to handle null values, for example: sum(coalesce(tick.price, 0) + coalesce(trade.price, 0)).
Views are used to derive or aggregate data. Views can be staggered onto each other. See the section Chapter 8, EPL Reference: Views on the views available that also outlines the different types of views: Data Window views and Derived-Value views.
Views can optionally take one or more parameters. These parameters are expressions themselves that may consists of any combination of variables, arithmatics, user-define function or substituion parameters for prepared statements, as example.
The below example serves to show views and staggering of views. It uses a car location event that contains information about the location of a car on a highway.
select * from CarLocEvent.std:groupby(carId).win:length(4). std:groupby(expressway, direction, segment).std:size()
The first view std:groupby(carId) groups car location events by car id. The second view win:length(4) keeps a length window of the 4 last events, with one length window for each car id. The next view std:groupby(expressway, direction, segment) groups each event by its expressway, direction and segment property values. Again, the grouping is done for each car id considering the last 4 events only. The last view std:size() is used to report the number of events. Thus the below example reports the number of events per car id and per expressway, direction and segment considering the last 4 events for each car id only.
When views are staggered onto each other as a chain of views, then the insert and remove stream received by each view is the insert and remove stream made available by the view (or stream) earlier in the chain.
The special keep-all view keeps all events: It does not provide a remove stream, i.e. events are not removed from the keep-all view unless by means of the on-delete syntax or by revision events.
Data window views provide an expiry policy that indicates when to remove events from the data window, with the exception of the keep-all data window which has no expiry policy and the group-by view for allocating a new data window per group.
EPL allows the freedom to use multiple data window views onto a stream and thus combine expiry policies. Combining data windows into a intersection (the default) or a union can achieve a useful strategy for retaining events and expiring events that are no longer of interest. Named windows and the on-delete syntax provide an additional degree of freedom.
In order to combine two or more data window views there is no keyword required. The retain-intersection keyword is the default and the retain-union keyword may instead be provided for a stream.
The concept of union and intersection come from Set mathematics. In the language of Set mathematics, two sets A and B can be "added" together: The intersection of A and B is the set of all things which are members of both A and B, i.e. the members two sets have "in common". The union of A and B is the set of all things which are members of either A or B.
Use the retain-intersection (the default) keyword to retain an intersection of all events as defined by two or more data windows. All events removed from any of the intersected data windows are entered into the remove stream. This is the default behavior if neither retain keyword is specified.
Use the retain-union keyword to retain a union of all events as defined by two or more data windows. Only events removed from all data windows are entered into the remove stream.
As you can see, it is the remove stream that the combined multiple data windows provide which differs when retaining an intersection and retaining a union, the insert stream is the same to all data windows and their staggered views.
The next example statement totals the price of OrderEvent events in a union of the last 30 seconds and unique by product name:
select sum(price) from OrderEvent.win:time(30 sec).std:unique(productName) retain-union
In the above statement, all OrderEvent events that are either less then 30 seconds old or that are the last event for the product name are considered.
Here is an example statement totals the price of OrderEvent events in an intersection of the last 30 seconds and unique by product name:
select sum(price) from OrderEvent.win:time(30 sec).std:unique(productName) retain-intersection
In the above statement, only those OrderEvent events that are both less then 30 seconds old and are the last event for the product name are considered.
For advanced users and for backward compatibility, it is possible to configure Esper to allow multiple data window views without either of the retain keywords, as described in Section 10.4.11.2, “Configuring Multi-Expiry Policy Defaults”.
Your from clause may assign a name to each stream. This assigned stream name can serve any of the following purposes.
First, the stream name can be used to disambiguate property names. The stream_name.property_name syntax uniquely identifies which property to select if property names overlap between streams. Here is an example:
select prod.productId, ord.productId from ProductEvent as prod, OrderEvent as ord
Second, the stream name can be used with a wildcard (*) character to select events in a join, or assign new names to the streams in a join:
// Select ProductEvent only select prod.* from ProductEvent as prod, OrderEvent // Assign column names 'product' and 'order' to each event select prod.* as product, ord.* as order from ProductEvent as prod, OrderEvent as ord
Further, the stream name by itself can occur in any expression: The engine passes the event itself to that expression. For example, the engine passes the ProductEvent and the OrderEvent to the user-defined function 'checkOrder':
select prod.productId, MyFunc.checkOrder(prod, ord) from ProductEvent as prod, OrderEvent as ord
Last, you may invoke an instance method on each event of a stream, and pass parameters to the instance method as well. Instance method calls are allowed anywhere in an expression.
The next statement demonstrates this capability by invoking a method 'computeTotal' on OrderEvent events and a method 'getMultiplier' on ProductEvent events:
select ord.computeTotal(prod.getMultiplier()) from ProductEvent as prod, OrderEvent as ord
The where clause is an optional clause in EPL statements. Via the where clause event streams can be joined and events can be filtered.
Comparison operators =, < , > , >=, <=, !=, <>, is null, is not null and logical combinations via and and or are supported in the where clause. The where clause can also introduce join conditions as outlined in Section 4.11, “Joining Event Streams”. where clauses can also contain expressions. Some examples are listed below.
...where fraud.severity = 5 and amount > 500 ...where (orderItem.orderId is null) or (orderItem.class != 10) ...where (orderItem.orderId = null) or (orderItem.class <> 10) ...where itemCount / packageCount > 10
The aggregate functions are sum, avg, count, max, min, median, stddev, avedev. You can use aggregate functions to calculate and summarize data from event properties. For example, to find out the total price for all stock tick events in the last 30 seconds, type:
select sum(price) from StockTickEvent.win:time(30 sec)
Here is the syntax for aggregate functions:
aggregate_function( [all | distinct] expression)
You can apply aggregate functions to all events in an event stream window or other view, or to one or more groups of events. From each set of events to which an aggregate function is applied, Esper generates a single value.
Expression is usually an event property name. However it can also be a constant, function, or any combination of event property names, constants, and functions connected by arithmetic operators.
For example, to find out the average price for all stock tick events in the last 30 seconds if the price was doubled:
select avg(price * 2) from StockTickEvent.win:time(30 seconds)
You can use the optional keyword distinct with all aggregate functions to eliminate duplicate values before the aggregate function is applied. The optional keyword all which performs the operation on all events is the default.
You can use aggregation functions in a select clause and in a having clause. You cannot use aggregate functions in a where clause, but you can use the where clause to restrict the events to which the aggregate is applied. The next query computes the average and sum of the price of stock tick events for the symbol IBM only, for the last 10 stock tick events regardless of their symbol.
select 'IBM stats' as title, avg(price) as avgPrice, sum(price) as sumPrice from StockTickEvent.win:length(10) where symbol='IBM'
In the above example the length window of 10 elements is not affected by the where clause, i.e. all events enter and leave the length window regardless of their symbol. If we only care about the last 10 IBM events, we need to add filter criteria as below.
select 'IBM stats' as title, avg(price) as avgPrice, sum(price) as sumPrice from StockTickEvent(symbol='IBM').win:length(10) where symbol='IBM'
You can use aggregate functions with any type of event property or expression, with the following exceptions:
You can use sum, avg, median, stddev, avedev with numeric event properties only
Esper ignores any null values returned by the event property or expression on which the aggregate function is operating, except for the count(*) function, which counts null values as well. All aggregate functions return null if the data set contains no events, or if all events in the data set contain only null values for the aggregated expression.
The group by clause is optional in all EPL statements. The group by clause divides the output of an EPL statement into groups. You can group by one or more event property names, or by the result of computed expressions. When used with aggregate functions, group by retrieves the calculations in each subgroup. You can use group by without aggregate functions, but generally that can produce confusing results.
For example, the below statement returns the total price per symbol for all stock tick events in the last 30 seconds:
select symbol, sum(price) from StockTickEvent.win:time(30 sec) group by symbol
The syntax of the group by clause is:
group by arregate_free_expression [, arregate_free_expression] [, ...]
Esper places the following restrictions on expressions in the group by clause:
Expressions in the group by cannot contain aggregate functions
Event properties that are used within aggregate functions in the select clause cannot also be used in a group by expression
You can list more then one expression in the group by clause to nest groups. Once the sets are established with group by the aggregation functions are applied. This statement posts the median volume for all stock tick events in the last 30 seconds per symbol and tick data feed. Esper posts one event for each group to statement listeners:
select symbol, tickDataFeed, median(volume) from StockTickEvent.win:time(30 sec) group by symbol, tickDataFeed
In the statement above the event properties in the select list (symbol, tickDataFeed) are also listed in the group by clause. The statement thus follows the SQL standard which prescribes that non-aggregated event properties in the select list must match the group by columns.
Esper also supports statements in which one or more event properties in the select list are not listed in the group by clause. The statement below demonstrates this case. It calculates the standard deviation for the last 30 seconds of stock ticks aggregating by symbol and posting for each event the symbol, tickDataFeed and the standard deviation on price.
select symbol, tickDataFeed, stddev(price) from StockTickEvent.win:time(30 sec) group by symbol
The above example still aggregates the price event property based on the symbol, but produces one event per incoming event, not one event per group.
Additionally, Esper supports statements in which one or more event properties in the group by clause are not listed in the select list. This is an example that calculates the mean deviation per symbol and tickDataFeed and posts one event per group with symbol and mean deviation of price in the generated events. Since tickDataFeed is not in the posted results, this can potentially be confusing.
select symbol, avedev(price) from StockTickEvent.win:time(30 sec) group by symbol, tickDataFeed
Expressions are also allowed in the group by list:
select symbol * price, count(*) from StockTickEvent.win:time(30 sec) group by symbol * price
If the group by expression resulted in a null value, the null value becomes its own group. All null values are aggregated into the same group. If you are using the count(expression) aggregate function which does not count null values, the count returns zero if only null values are encountered.
You can use a where clause in a statement with group by. Events that do not satisfy the conditions in the where clause are eliminated before any grouping is done. For example, the statement below posts the number of stock ticks in the last 30 seconds with a volume larger then 100, posting one event per group (symbol).
select symbol, count(*) from StockTickEvent.win:time(30 sec) where volume > 100 group by symbol
Use the having clause to pass or reject events defined by the group-by clause. The having clause sets conditions for the group by clause in the same way where sets conditions for the select clause, except where cannot include aggregate functions, while having often does.
This statement is an example of a having clause with an aggregate function. It posts the total price per symbol for the last 30 seconds of stock tick events for only those symbols in which the total price exceeds 1000. The having clause eliminates all symbols where the total price is equal or less then 1000.
select symbol, sum(price) from StockTickEvent.win:time(30 sec) group by symbol having sum(price) > 1000
To include more then one condition in the having clause combine the conditions with and, or or not. This is shown in the statement below which selects only groups with a total price greater then 1000 and an average volume less then 500.
select symbol, sum(price), avg(volume) from StockTickEvent.win:time(30 sec) group by symbol having sum(price) > 1000 and avg(volume) < 500
Esper places the following restrictions on expressions in the having clause:
Any expressions that contain aggregate functions must also occur in the select clause
A statement with the having clause should also have a group by clause. If you omit group-by, all the events not excluded by the where clause return as a single group. In that case having acts like a where except that having can have aggregate functions.
The having clause can also be used without group by clause as the below example shows. The example below posts events where the price is less then the current running average price of all stock tick events in the last 30 seconds.
select symbol, price, avg(price) from StockTickEvent.win:time(30 sec) having price < avg(price)
When you include filters, the where condition, the group by clause and the having condition in an EPL statement the sequence in which each clause affects events determines the final result:
The event stream's filter condition, if present, dictates which events enter a window (if one is used). The filter discards any events not meeting filter criteria.
The where clause excludes events that do not meet its search condition.
Aggregate functions in the select list calculate summary values for each group.
The having clause excludes events from the final results that do not meet its search condition.
The following query illustrates the use of filter, where, group by and having clauses in one statement with a select clause containing an aggregate function.
select tickDataFeed, stddev(price) from StockTickEvent(symbol='IBM').win:length(10) where volume > 1000 group by tickDataFeed having stddev(price) > 0.8
Esper filters events using the filter criteria for the event stream StockTickEvent. In the example above only events with symbol IBM enter the length window over the last 10 events, all other events are simply discarded. The where clause removes any events posted by the length window (events entering the window and event leaving the window) that do not match the condition of volume greater then 1000. Remaining events are applied to the stddev standard deviation aggregate function for each tick data feed as specified in the group by clause. Each tickDataFeed value generates one event. Esper applies the having clause and only lets events pass for tickDataFeed groups with a standard deviation of price greater then 0.8.
The group by clause as well as the built-in std:groupby view are similar in their ability to group events. This section explains the key differences in their behavior and use.
The group by clause works together with aggregation functions in your statement to produce an aggregation result per group. In greater detail, this means that when a new event arrives, the engine applies the expressions in the group by clause to determine a grouping key. If the engine has not encountered that grouping key before (a new group), the engine creates a set of new aggregation results for that grouping key and performs the aggregation changing that new set of aggregation results. If the grouping key points to an existing set of prior aggregation results (an existing group), the engine performs the aggregation changing the prior set of aggregation results for that group.
The std:groupby view is a built-in view that also groups events. The view is described in greater detail in Section 8.2.2, “Group-By (std:groupby)”. Its primary use is to create a separate data window per group, or more generally to create separate instances of all its sub-views for each grouping key encountered.
The next example shows two queries that produce equivalent results. The query using the group by clause is generally preferable as is easier to read. The second form introduces the stat:uni view which computes univariate statistics for a given property:
select symbol, avg(price) from StockTickEvent group by symbol // ... is equivalent to ... select symbol, average from StockTickEvent.std:groupby(symbol).stat:uni(price)
The next example shows two queries that are NOT equivalent as the length window is ungrouped in the first query, and grouped in the second query:
select symbol, sum(price) from StockTickEvent.win:length(10) group by symbol // ... NOT equivalent to ... select symbol, sum(price) from StockTickEvent.std:groupby(symbol).win:length(10)
The key difference between the two statements is that in the first statement the length window is ungrouped and applies to all events regardless of group. While in the second query each group gets its own instance of a length window. For example, in the second query events arriving for symbol "ABC" get a length window of 10 events, and events arriving for symbol "DEF" get their own length window of 10 events.
The output clause is optional in Esper and is used to control or stabilize the rate at which events are output. The EPL language provides for several different ways to control output rate.
Here is the syntax for the output clause that specifies a rate in time interval or number of events:
output [all | first | last | snapshot] every output_rate [minutes | seconds | events]An alternate syntax specifies the time period between output as outlined in Section 4.2.1, “Specifying Time Periods” :
output [all | first | last | snapshot] every time_periodA crontab-like schedule can also be specified. The schedule parameters follow the pattern observer parameters and are further described in Section 5.6.2.2, “timer:at” :
output [all | first | last | snapshot] at
(minutes, hours, days of month, months, days of week [, seconds])Last, output can be controlled by an expression that may contain variables, user-defined functions and information about the number of collected events. Output that is controlled by an expression is discussed in detail below.
For example, the following statement outputs, every 60 seconds, the total price for all orders in the 30-minute time window:
select sum(price) from OrderEvent.win:time(30 min) output snapshot every 60 seconds
The all keyword is the default and specifies that all events in a batch should be output, each incoming row in the batch producing an output row. Note that for statements that group via the group by clause, the all keyword provides special behavior as below.
The first keyword specifies that only the first event in an output batch is to be output. Using the first keyword instructs the engine to output the first matching event as soon as it arrives, and then ignores matching events for the time interval or number of events specified. After the time interval elapsed, or the number of matching events has been reached, the next first matching event is output again and the following interval the engine again ignores matching events.
The last keyword specifies to only output the last event at the end of the given time interval or after the given number of matching events have been accumulated. Again, for statements that group via the group by clause the last keyword provides special behavior as below.
The snapshot keyword indicates that the engine output current computation results considering all events as per views specified and/or current aggregation results. While the other keywords control how a batch of events between output intervals is being considered, the snapshot keyword outputs all current state of a statement independent of the last batch. It's output is equivalent to the iterator method provided by a statement.
The output_rate is the frequency at which the engine outputs events. It can be specified in terms of time or number of events. The value can be a number to denote a fixed output rate, or the name of a variable whose value is the output rate. By means of a variable the output rate can be controlled externally and changed dynamically at runtime.
Please consult the Appendix A, Output Reference and Samples for detailed information on insert and remove stream output for the various output clause keywords.
The time interval can also be specified in terms of minutes; the following statement is identical to the first one.
select * from StockTickEvent.win:length(5) output every 1.5 minutes
A second way that output can be stabilized is by batching events until a certain number of events have been collected. The next statement only outputs when either 5 (or more) new or 5 (or more) old events have been batched.
select * from StockTickEvent.win:time(30 sec) output every 5 events
Additionally, event output can be further modified by the optional last keyword, which causes output of only the last event to arrive into an output batch.
select * from StockTickEvent.win:time(30 sec) output last every 5 events
Using the first keyword you can be notified at the start of the interval. The allows to watch for situations such as a rate falling below a threshold and only be informed every now and again after the specified output interval, but be informed the moment it first happens.
select * from TickRate.win:time(30 seconds) where rate<100 output first every 60 seconds
A sample statement using the Unix "crontab"-command schedule is shown next. See Section 5.6.2.2, “timer:at” for details on schedule syntax. Here, output occurs every 15 minutes from 8am to 5:45pm (hours 8 to 17 at 0, 15, 30 and 45 minutes past the hour):
select symbol, sum(price) from StockTickEvent group by symbol output at (*/15, 8:17, *, *, *)
Output can also be controlled by an expression that may check variable values, use user-defined functions and query built-in properties that provide additional information. The synopsis is as follows:
output [all | first | last | snapshot] when trigger_expression [then set variable_name = assign_expression [, variable_name = assign_expression [,...]]]
The when keyword must be followed by a trigger expression returning a boolean value of true or false, indicating whether to output. Use the optional then keyword to change variable values after the trigger expression evaluates to true. An assignment expression assigns a new value to variable(s).
Lets consider an example. The next statement assumes that your application has defined a variable by name OutputTriggerVar of boolean type. The statement outputs rows only when the OutputTriggerVar variable has a boolean value of true:
select sum(price) from StockTickEvent output when OutputTriggerVar = true
The engine evaluates the trigger expression when streams and data views post one or more insert or remove stream events after considering the where clause, if present. It also evaluates the trigger expression when any of the variables used in the trigger expression, if any, changes value. Thus output occurs as follows:
When there are insert or remove stream events and the when trigger expression evaluates to true, the engine outputs the resulting rows.
When any of the variables in the when trigger expression changes value, the engine evaluates the expression and outputs results. Result output occurs within the minimum time interval of timer resolution (100 milliseconds).
By adding a then part to the EPL, we can reset any variables after the trigger expression evaluated to true:
select sum(price) from StockTickEvent output when OutputTriggerVar = true then set OutputTriggerVar = false
Expressions in the when and then may, for example, use variables, user defined functions or any of the built-in named properties that are described in the below list.
The following built-in properties are available for use:
Table 4.2. Built-In Properties for Use with Output When
| Built-In Property Name | Description |
|---|---|
| last_output_timestamp | Timestamp when the last output occurred for the statement; Initially set to time of statement creation |
| count_insert | Number of insert stream events |
| count_remove | Number of remove stream events |
The values provided by count_insert and count_remove are non-continues: The number returned for these properties may 'jump' up rather then count up by 1. The counts reset to zero upon output.
The following restrictions apply to expressions used in the output rate clause:
Event property names cannot be used in the output clause.
Aggregation functions cannot be used in the output clause.
The prev previous event function and the prior prior event function cannot be used in the output clause.
Remove stream events can also useful in conjunction with aggregation and the output clause: When the engine posts remove stream events for fully-aggregated queries, it presents the aggregation state before the expiring event leaves the data window. Your application can thus easily obtain a delta between the new aggregation value and the prior aggregation value.
The engine evaluates the having-clause at the granularity of the data posted by views. That is, if you utilize a time window and output every 10 events, the having clause applies to each individual event or events entering and leaving the time window (and not once per batch of 10 events).
The output clause interacts in two ways with the group by and having clauses. First, in the output every n events case, the number n refers to the number of events arriving into the group by clause. That is, if the group by clause outputs only 1 event per group, or if the arriving events don't satisfy the having clause, then the actual number of events output by the statement could be fewer than n.
Second, the last and all keywords have special meanings when used in a statement with aggregate functions and the group by clause:
When no keyword is specified, the engine produces an output row for each row in the batch.
The all keyword (the default) specifies that the most recent data for all groups seen so far should be output, whether or not these groups' aggregate values have just been updated
The last keyword specifies that only groups whose aggregate values have been updated with the most recent batch of events should be output.
Please consult the Appendix A, Output Reference and Samples for detailed information on insert and remove stream output for aggregation and group-by.
By adding an output rate limiting clause to a statement that contains a group by clause we can control output of groups to obtain one row for each group, generating an event per group at the given output frequency:
select symbol, sum(price) from StockTickEvent group by symbol output all every 5 seconds
Output rate limiting provides output events to your application in regular intervals. Between intervals, the engine uses a buffer to hold events until the output condition is reached. If your application has high-volume streams, you may need to be mindful of the memory needs for output rates.
The output clause with the snapshot keyword does not require a buffer, all other output keywords do consume memory until the output condition is reached.
The order by clause is optional. It is used for ordering output events by their properties, or by expressions involving those properties. .
For example, the following statement outputs batches of 5 or more stock tick events that are sorted first by price ascending and then by volume ascending:
select symbol from StockTickEvent.win:time(60 sec) output every 5 events order by price, volume
Here is the syntax for the order by clause:
order by expression [asc | desc] [, expression [asc | desc]] [, ...]
If the order by clause is absent then the engine still makes certain guarantees about the ordering of output:
If the statement is not a join, does not group via group by clause and does not declare grouped data windows via std:groupby view, the order in which events are delivered to listeners and through the iterator pull API is the order of event arrival.
If the statement is a join or outer join, or groups, then the order in which events are delivered to listeners and through the iterator pull API is not well-defined. Use the order by clause if your application requires events to be delivered in a well-defined order.
Esper places the following restrictions on the expressions in the order by clause:
All aggregate functions that appear in the order by clause must also appear in the select expression.
Otherwise, any kind of expression that can appear in the select clause, as well as any name defined in the select clause, is also valid in the order by clause.
By default all sort operations on string values are performed via the compare method and are thus not locale dependent. To account for differences in language or locale, see Section 10.4.17, “Engine Settings related to Language and Locale” to change this setting.
The limit clause is typically used together with the order by and output clause to limit your query results to those that fall within a specified range. You can use it to receive the first given number of result rows, or to receive a range of result rows.
There are two syntaxes for the limit clause, each can be parameterized by integer constants or by variable names. The first syntax is shown below:
limit row_count [offset offset_count]
The required row_count parameter specifies the number of rows to output. The row_count can be an integer constant and can also be the name of the integer-type variable to evaluate at runtime.
The optional offset_count parameter specifies the number of rows that should be skipped (offset) at the beginning of the result set. A variable can also be used for this parameter.
The next sample EPL query outputs the top 10 counts per property 'uri' every 1 minute.
select uri, count(*) from WebEvent group by uri output snapshot every 1 minute order by count(*) desc limit 10
The next statement demonstrates the use of the offset keyword. It outputs ranks 3 to 10 per property 'uri' every 1 minute:
select uri, count(*) from WebEvent group by uri output snapshot every 1 minute order by count(*) desc limit 8 offset 2
The second syntax for the limit clause is for SQL standard compatibility and specifies the offset first, followed by the row count:
limit offset_count[, row_count]
The following are equivalent:
limit 8 offset 2 // ...equivalent to limit 2, 8
A negative value for row_count returns an unlimited number or rows, and a zero value returns no rows. If variables are used, then the current variable value at the time of output dictates the row count and offset. A variable returning a null value for row_count also returns an unlimited number or rows.
A negative value for offset is not allowed. If your variable returns a negative or null value for offset then the value is assumed to be zero (i.e. no offset).
The iterator pull API also honors the limit clause, if present.
The insert into clause is optional in Esper. The clause can be specified to make the results of a statement available as an event stream for use in further statements, or to insert events into a named window. The clause can also be used to merge multiple event streams to form a single stream of events.
The syntax for the insert into clause is as follows:
insert [istream | rstream] into event_stream_name [ (property_name [, property_name] ) ]
The istream (default) and rstream keywords are optional. If no keyword or the istream keyword is specified, the engine supplies the insert stream events generated by the statement. The insert stream consists of the events entering the respective window(s) or stream(s). If the rstream keyword is specified, the engine supplies the remove stream events generated by the statement. The remove stream consists of the events leaving the respective window(s).
The event_stream_name is an identifier that names the event stream (and also implicitly names the types of events in the stream) generated by the engine. The identifier can be used in further statements to filter and process events of that event stream. The insert into clause can consist of just an event stream name, or an event stream name and one or more property names.
The engine also allows listeners to be attached to a statement that contain an insert into clause. Listeners receive all events posted to the event stream.
To merge event streams, simply use the same event_stream_name identifier in all EPL statements that merge their result event streams. Make sure to use the same number and names of event properties and event property types match up.
Esper places the following restrictions on the insert into clause:
The number of elements in the select clause must match the number of elements in the insert into clause if the clause specifies a list of event property names
If the event stream name has already been defined by a prior statement or configuration, and the event property names and/or event types do not match, an exception is thrown at statement creation time.
The following sample inserts into an event stream by name CombinedEvent:
insert into CombinedEvent select A.customerId as custId, A.timestamp - B.timestamp as latency from EventA.win:time(30 min) A, EventB.win:time(30 min) B where A.txnId = B.txnId
Each event in the CombinedEvent event stream has two event properties named "custId" and "latency". The events generated by the above statement can be used in further statements, such as shown in the next statement:
select custId, sum(latency) from CombinedEvent.win:time(30 min) group by custId
The example statement below shows the alternative form of the insert into clause that explicitly defines the property names to use.
insert into CombinedEvent (custId, latency) select A.customerId, A.timestamp - B.timestamp ...
The rstream keyword can be useful to indicate to the engine to generate only remove stream events. This can be useful if we want to trigger actions when events leave a window rather then when events enter a window. The statement below generates CombinedEvent events when EventA and EventB leave the window after 30 minutes.
insert rstream into CombinedEvent select A.customerId as custId, A.timestamp - B.timestamp as latency from EventA.win:time(30 min) A, EventB.win:time(30 min) B where A.txnId = B.txnId
The insert into clause can be used in connection with patterns to provide pattern results to further statements for analysis:
insert into ReUpEvent select linkUp.ip as ip from pattern [every linkDown=LinkDownEvent -> linkUp=LinkUpEvent(ip=linkDown.ip)]
Sometimes your events may carry properties that are themselves event objects. Therefore EPL offers a special syntax to insert the value of a property itself as an event into a stream:
insert into stream_name select property_name.* from ...
This feature is only supported for JavaBean events and is not supported for Map or XML events. Nested property names are also not supported.
In this example, the class Summary with properties bid and ask that are of type Quote is:
public class Summary {
private Quote bid;
private Quote ask;
...The statement to populate a stream of Quote events is thus:
insert into MyBidStream select bid.* from Summary
The insert into clause allows to merge multiple event streams into a event single stream. The clause names an event stream to insert into by specifing an event_stream_name. The first statement that inserts into the named stream defines the stream's event types. Further statements that insert into the same event stream must match the type of events inserted into the stream as declared by the first statement.
One approach to merging event streams specifies individual colum names either in the select clause or in the insert into clause of the statement. This approach has been shown in earlier examples.
Another approach to merging event streams specifies the wildcard (*) in the select clause (or the stream wildcard) to select the underlying event. The events in the event stream must then have the same event type as generated by the from clause.
Assume a statement creates an event stream named MergedStream by selecting OrderEvent events:
insert into MergedStream select * from OrderEvent
A statement can use the stream wildcard selector to select only OrderEvent events in a join:
insert into MergedStream select ord.* from ItemScanEvent, OrderEvent as ord
And a statement may also use an application-supplied user-defined function to convert events to OrderEvent instances:
insert into MergedStream select MyLib.convert(item) from ItemScanEvent as item
Esper specifically recognizes a conversion function: A conversion function must be the only selected column, and it must return either a Java object or java.util.Map.
A variant stream is a predefined stream into which events of multiple disparate event types can be inserted.
A variant stream name may appear anywhere in a pattern or from clause. In a pattern, a filter against a variant stream matches any events of any of the event types inserted into the variant stream. In a from clause including for named windows, views declared onto a variant stream may hold events of any of the event types inserted into the variant stream.
A variant stream is thus useful in problems that require different types of event to be treated the same.
Variant streams are predefined via runtime or initialization-time configuration as described in Section 10.4.20, “Variant Stream”. Your application may predefine variant streams to carry events of a limited set of event types, or you may choose the variant stream to carry any and all types of events. This choice affects what event properties are available for consuming statements or patterns of the variant stream.
Assume that an application predefined a variant stream named OrderStream to carry only ServiceOrder and ProductOrder events. An insert into clause inserts events into the variant stream:
insert into OrderStream select * from ServiceOrder insert into OrderStream select * from ProductOrder
Here is a sample statement that consumes the variant stream and outputs a total price per customer id for the last 30 seconds of ServiceOrder and ProductOrder events:
select customerId, sum(price) from OrderStream.win:time(30 sec) group by customerId
If your application predefines the variant stream to hold specific type of events, as the sample above did, then all event properties that are common to all specified types are visible on the variant stream, including nested, indexed and mapped properties. For access to properties that are only available on one of the types, the dynamic property syntax must be used. In the example above, the customerId and price were properties common to both ServiceOrder and ProductOrder events.
For example, here is a consuming statement that selects a service duraction property that only ServiceOrder events have, and that must therefore be casted to double and null values removed in order to aggregate:
select customerId, sum(coalesce(cast(serviceDuraction?, double), 0)) from OrderStream.win:time(30 sec) group by customerId
If your application predefines a variant stream to hold any type of events (the any type variance), then all event properties of the variant stream are effectively dynamic properties.
For example, an application may define an OutgoingEvents variant stream to hold any type of event. The next statement is a sample consumer of the OutgoingEvents variant stream that looks for the destination property and fires for each event in which the property exists with a value of 'email':
select * from OutgoingEvents(destination = 'email')
Your select clause may use the '*' wildcard together with further expressions to populate a stream of events. A sample statement is:
insert into OrderStream select *, price*units as linePrice from PurchaseOrder
When using wildcard and selecting additional expression results, the engine produces what is called decorating events for the resulting stream. Decorating events add additional property values to an underlying event.
In the above example the resulting OrderStream consists of underlying PurchaseOrder events decorated by a linePrice property that is a result of the price*units expression.
In order to use insert into to insert into an existing stream of decorated events, your underlying event type must match, and all additional decorating property names and types of the select clause must also match.
Your select clause may use the stream name to populate a stream of events in which each event has properties that are itself an event. A sample statement is:
insert into CompositeStream select order, service, order.price+service.price as totalPrice from PurchaseOrder.std:lastevent() as order, ServiceEvent:std:lastevent() as service
When using the stream name (or tag in patterns) in the select clause, the engine produces composite events: One or more of the properties of the composite event are events themselves.
In the above example the resulting CompositeStream consists of 3 columns: the PurchaseOrder event, the ServiceEvent event and the totalPrice property that is a result of the order.price+service.price expression.
In order to use insert into to insert into an existing stream of events in which properties are themselves events, each event column's event type must match, and all additional property names and types of the select clause must also match.
Two or more event streams can be part of the from clause and thus both (all) streams determine the resulting events. The where clause lists the join conditions that Esper uses to relate events in the two or more streams. Reference and historical data such as stored in your relational database, and data returned by a method invocation, can also be included in joins. Please see Section 4.15, “Accessing Relational Data via SQL” and Section 4.16, “Accessing Non-Relational Data via Method Invocation” for details.
Each point in time that an event arrives to one of the event streams, the two event streams are joined and output events are produced according to the where clause.
This example joins 2 event streams. The first event stream consists of fraud warning events for which we keep the last 30 minutes. The second stream is withdrawal events for which we consider the last 30 seconds. The streams are joined on account number.
select fraud.accountNumber as accntNum, fraud.warning as warn, withdraw.amount as amount,
max(fraud.timestamp, withdraw.timestamp) as timestamp, 'withdrawlFraud' as desc
from com.espertech.esper.example.atm.FraudWarningEvent.win:time(30 min) as fraud,
com.espertech.esper.example.atm.WithdrawalEvent.win:time(30 sec) as withdraw
where fraud.accountNumber = withdraw.accountNumberJoins can also include one or more pattern statements as the next example shows:
select * from FraudWarningEvent.win:time(30 min) as fraud,
pattern [every w=WithdrawalEvent -> PINChangeEvent(acct=w.acct)].std:lastevent() as withdraw
where fraud.accountNumber = withdraw.w.accountNumberThe statement above joins the last 30 minutes of fraud warnings with a pattern. The pattern consists of every withdrawal event that is followed by a PIN change event for the same account number. It joins the two event streams on account number. The last-event view instucts the join to only consider the last pattern match.
In a join and outer join, your statement must declare a data window view or other view onto each stream. Streams that are marked as unidirectional and named windows as well as database or methods in a join are an exception and do not require a view to be specified. If you are joining an event to itself via contained-event selection, views also do not need to be specified.
The next example joins all FraudWarningEvent events that arrived since the statement was started, with the last 20 seconds of PINChangeEvent events:
select * from FraudWarningEvent.win:keepall() as fraud, PINChangeEvent.win:time(20 sec) as pin where fraud.accountNumber = pin.accountNumber
The above example employed the special keep-all view that retains all events.
Esper supports left outer joins, right outer joins, full outer joins and inner joins in any combination between an unlimited number of event streams. Outer and inner joins can also join reference and historical data as explained in Section 4.15, “Accessing Relational Data via SQL”, as well as join data returned by a method invocation as outlined in Section 4.16, “Accessing Non-Relational Data via Method Invocation”.
The keywords left, right, full and inner control the type of the join between two streams. The on clause specifies one or more properties that join each stream. The synopsis is as follows:
...from stream_def [as name] ((left|right|full outer) | inner) join stream_def on property = property [and property = property ...] [ ((left|right|full outer) | inner) join stream_def on ...]...
If the outer join is a left outer join, there will be an output event for each event of the stream on the left-hand side of the clause. For example, in the left outer join shown below we will get output for each event in the stream RfidEvent, even if the event does not match any event in the event stream OrderList.
select * from RfidEvent.win:time(30 sec) as rfid
left outer join
OrderList.win:length(10000) as orderlist
on rfid.itemId = orderList.itemIdSimilarly, if the join is a Right Outer Join, then there will be an output event for each event of the stream on the right-hand side of the clause. For example, in the right outer join shown below we will get output for each event in the stream OrderList, even if the event does not match any event in the event stream RfidEvent.
select * from RfidEvent.win:time(30 sec) as rfid
right outer join
OrderList.win:length(10000) as orderlist
on rfid.itemId = orderList.itemIdFor all types of outer joins, if the join condition is not met, the select list is computed with the event properties of the arrived event while all other event properties are considered to be null.
The next type of outer join is a full outer join. In a full outer join, each point in time that an event arrives to one of the event streams, one or more output events are produced. In the example below, when either an RfidEvent or an OrderList event arrive, one or more output event is produced. The next example shows a full outer join that joins on multiple properties:
select * from RfidEvent.win:time(30 sec) as rfid
full outer join
OrderList.win:length(10000) as orderlist
on rfid.itemId = orderList.itemId and rfid.assetId = orderList.assetIdThe last type of join is an inner join. In an inner join, the engine produces an output event for each event of the stream on the left-hand side that matches at least one event on the right hand side considering the join properties. For example, in the inner join shown below we will get output for each event in the RfidEvent stream that matches one or more events in the OrderList data window:
select * from RfidEvent.win:time(30 sec) as rfid
inner join
OrderList.win:length(10000) as orderlist
on rfid.itemId = orderList.itemId and rfid.assetId = orderList.assetIdPatterns as streams in a join follow this rule: If no data window view is declared for the pattern then the pattern stream retains the last match. Thus a pattern must have matched at least once for the last row to become available in a join. Multiple rows from a pattern stream may be retained by declaring a data window view onto a pattern using the pattern [...].view_specification syntax.
Finally, this example outer joins multiple streams. Here the RfidEvent stream is outer joined to both ProductName and LocationDescription via left outer join:
select * from RfidEvent.win:time(30 sec) as rfid
left outer join ProductName.win:keepall() as refprod
on rfid.productId = refprod.prodId
left outer join LocationDescription.win:keepall() as refdesc
on rfid.location = refdesc.locIdIn a join or outer join your statement lists multiple event streams, views and/or patterns in the from clause. As events arrive into the engine, each of the streams (views, patterns) provides insert and remove stream events. The engine evaluates each insert and remove stream event provided by each stream, and joins or outer joins each event against data window contents of each stream, and thus generates insert and remove stream join results.
The direction of the join execution depends on which stream or streams are currently providing an insert or remove stream event for executing the join. A join is thus multidirectional, or bidirectional when only two streams are joined. A join can be made unidirectional if your application does not want new results when events arrive on a given stream or streams.
The unidirectional keyword can be used in the from clause to identify a single stream that provides the events to execute the join. If the keyword is present for a stream, all other streams in the from clause become passive streams. When events arrive or leave a data window of a passive stream then the join does not generate join results.
For example, consider a use case that requires us to join stock tick events (TickEvent) and news events (NewsEvent). The unidirectional keyword allows to generate results only when TickEvent events arrive, and not when NewsEvent arrive or leave the 10-second time window:
select * from TickEvent unidirectional, NewsEvent.win:time(10 sec) where tick.symbol = news.symbol
Aggregation functions in a unidirectional join aggregate within the context of each unidirectional event evaluation and are not cumulative.
The count function in the next query returns, for each TickEvent, the number of matching NewEvent events:
select count(*) from TickEvent unidirectional, NewsEvent.win:time(10 sec) where tick.symbol = news.symbol
The following restrictions apply to unidirectional joins:
The unidirectional keyword can only be specified for a single stream in the from clause.
Receiving data from a unidirectional join via the pull API (iterator method) is not allowed. This is because the engine holds no state for the single stream that provides the events to execute the join.
The stream that declares the unidirectional keyword cannot declare a data window view or other view for that stream, since remove stream events are not processed for the single stream.
A subquery is a select within another statement. Esper supports subqueries in the select clause and in the where clause of EPL statements. Subqueries provide an alternative way to perform operations that would otherwise require complex joins. Subqueries can also make statements more readable then complex joins.
Esper supports both simple subqueries as well as correlated subqueries. In a simple subquery, the inner query is not correlated to the outer query. Here is an example simple subquery within a select clause:
select assetId, (select zone from ZoneClosed.std:lastevent) as lastClosed from RFIDEvent
If the inner query is dependent on the outer query, we will have a correlated subquery. An example of a correlated subquery is shown below. Notice the where clause in the inner query, where the condition involves a stream from the outer query:
select * from RfidEvent as RFID where 'Dock 1' = (select name from Zones.std:unique(zoneId) where zoneId = RFID.zoneId)
The example above shows a subquery in the where clause. The statement selects RFID events in which the zone name matches a string constant based on zone id. The statement uses the view std:unique to guarantee that only the last event per zone id is held from processing by the subquery.
The next example is a correlated subquery within a select clause. In this statement the select clause retrieves the zone name by means of a subquery against the Zones set of events correlated by zone id:
select zoneId, (select name from Zones.std:unique(zoneId) where zoneId = RFID.zoneId) as name from RFIDEvent
Note that when a simple or correlated subquery returns multiple rows, the engine returns a null value as the subquery result. To limit the number of events returned by a subquery consider using one of the views std:lastevent, std:unique and std:groupby.
The select clause of a subquery also allows wildcard selects, which return as an event property the underlying event object of the event type as defined in the from clause. An example:
select (select * from MarketData.std:lastevent()) as md from pattern [every timer:interval(10 sec)]
The output events to the statement above contain the underlying MarketData event in a property named "md". The statement populates the last MarketData event into a property named "md" every 10 seconds following the pattern definition, or populates a null value if no MarketData event has been encountered so far.
Aggregation functions may be used in the select clause of the subselect as this example outlines:
select * from MarketData where price > (select max(price) from MarketData(symbol='GOOG').std:lastevent())
As the sub-select expression is evaluated first, the query above actually never fires for the GOOG symbol, only for other symbols that have a price higher then the current maximum for GOOG. As a sidenote, the insert into clause can also be handy to compute aggregation results for use in multiple subqueries.
The following restrictions apply to subqueries:
The subquery stream definition must define a data window or other view to limit subquery results, reducing the number of events held for subquery execution
Subqueries can only consist of a select clause, a from clause and a where clause. The group by and having clauses, as well as joins, outer-joins and output rate limiting are not permitted within subqueries.
If using aggregation functions in a subquery, note these limitations:
None of the properties of the correlated stream(s) can be used within aggregation functions.
The properties of the subselect stream must all be within aggregation functions.
The where clause cannot be used to correlate between the subselect stream and the enclosing stream, since the engine would otherwise be forced to re-evaluate the aggregation considering all events in the subselect-stream data window, which would likely be a very expensive operation.
The order of evaluation of subqueries relative to the containing statement is guaranteed: If the containing statement and its subqueries are reacting to the same type of event, the subquery will receive the event first before the containing statement's clauses are evaluated. This behavior can be changed via configuration. The order of evaluation of subqueries is not guaranteed between subqueries.
Performance of your statement containing one or more subqueries principally depends on two parameters. First, if your subquery correlates one or more columns in the subquery stream with the enclosing statement's streams via equals '=', the engine automatically builds the appropriate indexes for fast row retrieval based on the key values correlated (joined). The second parameter is the number of rows found in the subquery stream and the complexity of the filter criteria (where clause), as each row in the subquery stream must evaluate against the where clause filter.
The exists condition is considered "to be met" if the subquery returns at least one row. The not exists condition is considered true if the subquery returns no rows.
The synopsis for the exists keyword is as follows:
exists (subquery)Let's take a look at a simple example. The following is an EPL statement that uses the exists condition:
select assetId from RFIDEvent as RFID where exists (select * from Asset.std:unique(assetId) where assetId = RFID.assetId)
This select statement will return all RFID events where there is at least one event in Assets unique by asset id with the same asset id.
The in subquery condition is true if the value of an expression matches one or more of the values returned by the subquery. Consequently, the not in condition is true if the value of an expression matches none of the values returned by the subquery.
The synopsis for the in keyword is as follows:
expression in (subquery)
The right-hand side subquery must return exactly one column.
The next statement demonstrates the use of the in subquery condition:
select assetId from RFIDEvent where zone in (select zone from ZoneUpdate(status = 'closed').win:time(10 min))
The above statement demonstrated the in subquery to select RFID events for which the zone status is in a closed state.
Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one right-hand row yields null, the result of the in construct will be null, not false (or true for not-in). This is in accordance with SQL's normal rules for Boolean combinations of null values.
The any subquery condition is true if the expression returns true for one or more of the values returned by the subquery.
The synopsis for the any keyword is as follows:
expression operator any (subquery) expression operator some (subquery)
The left-hand expression is evaluated and compared to each row of the subquery result using the given operator, which must yield a Boolean result. The result of any is "true" if any true result is obtained. The result is "false" if no true result is found (including the special case where the subquery returns no rows).
The operator can be any of the following values: =, !=, <>, <, <=, >, >=.
The some keyword is a synonym for any. The in construct is equivalent to = any.
The right-hand side subquery must return exactly one column.
The next statement demonstrates the use of the any subquery condition:
select * from ProductOrder as ord
where quantity < any
(select minimumQuantity from MinimumQuantity.win:keepall())The above query compares ProductOrder event's quantity value with all rows from the MinimumQuantity stream of events and returns only those ProductOrder events that have a quantity that is less then any of the minimum quantity values of the MinimumQuantity events.
Note that if there are no successes and at least one right-hand row yields null for the operator's result, the result of the any construct will be null, not false. This is in accordance with SQL's normal rules for Boolean combinations of null values.
The all subquery condition is true if the expression returns true for all of the values returned by the subquery.
The synopsis for the all keyword is as follows:
expression operator all (subquery)
The left-hand expression is evaluated and compared to each row of the subquery result using the given operator, which must yield a Boolean result. The result of all is "true" if all rows yield true (including the special case where the subquery returns no rows). The result is "false" if any false result is found. The result is null if the comparison does not return false for any row, and it returns null for at least one row.
The operator can be any of the following values: =, !=, <>, <, <=, >, >=.
The not in construct is equivalent to != all.
The right-hand side subquery must return exactly one column.
The next statement demonstrates the use of the all subquery condition:
select * from ProductOrder as ord
where quantity < all
(select minimumQuantity from MinimumQuantity.win:keepall())The above query compares ProductOrder event's quantity value with all rows from the MinimumQuantity stream of events and returns only those ProductOrder events that have a quantity that is less then all of the minimum quantity values of the MinimumQuantity events.
This chapter outlines how reference data and historical data that are stored in a relational database can be queried via SQL within EPL statements.
Esper can access via join and outer join as well as via iterator (poll) API all types of event streams to stored data. In order for such data sources to become accessible to Esper, some configuration is required. The Section 10.4.8, “Relational Database Access” explains the required configuration for database access in greater detail, and includes information on configuring a query result cache.
Esper does not parse of otherwise inspect your SQL query. Therefore your SQL can make use of any database-specific SQL language extensions or features that your database provides.
If you have enabled query result caching in your Esper database configuration, Esper retains SQL query results in cache following the configured cache eviction policy.
Also if you have enabled query result caching in your Esper database configuration and provide EPL where clause and/or on clause (outer join) expressions, then Esper builds indexes on the SQL query results to enable fast lookup. This is especially useful if your queries return a large number of rows. For building the proper indexes, Esper inspects the expression found in your EPL query where clause, if present. For outer joins, Esper also inspects your EPL query on clause. Esper analyzes the EPL on clause and where clause expressions, if present, looking for property comparison with or without logical AND-relationships between properties. When a SQL query returns rows for caching, Esper builds the appropriate index and lookup strategies for fast row matching against indexes.
Joins or outer joins in which only SQL statements or method invocations are listed in the from clause and no other event streams are termed passive joins. A passive join does not produce an insert or remove stream and therefore does not invoke statement listeners with results. A passive join can be iterated on (polled) using a statement's safeIterator and iterator methods.
There are no restrictions to the number of SQL statements or types of streams joined. The following restrictions currently apply:
Sub-views on an SQL query are not allowed; That is, one cannot create a time or length window on an SQL query. However one can use the insert into syntax to make join results available to a further statement.
Your database software must support JDBC prepared statements that provide statement meta data at compilation time. Most major databases provide this function. A workaround is available for databases that do not provide this function.
JDBC drivers must support the getMetadata feature. A workaround is available as below for JDBC drivers that don't support getting metadata.
The next sections assume basic knowledge of SQL (Structured Query Language).
To join an event stream against stored data, specify the sql keyword followed by the name of the database and a parameterized SQL query. The syntax to use in the from clause of an EPL statement is:
sql:database_name [" parameterized_sql_query "]
The engine uses the database_name identifier to obtain configuration information in order to establish a database connection, as well as settings that control connection creation and removal. Please see Section 10.4.8, “Relational Database Access” to configure an engine for database access.
Following the database name is the SQL query to execute. The SQL query can contain one or more substitution parameters. The SQL query string is placed in single brackets [ and ]. The SQL query can be placed in either single quotes (') or double quotes ("). The SQL query grammer is passed to your database software unchanged, allowing you to write any SQL query syntax that your database understands, including stored procedure calls.
Substitution parameters in the SQL query string take the form ${event_property_name}. The engine resolves event_property_name at statement execution time to the actual event property value supplied by the events in the joined event stream.
The engine determines the type of the SQL query output columns by means of the result set metadata that your database software returns for the statement. The actual query results are obtained via the getObject on java.sql.ResultSet.
The sample EPL statement below joins an event stream consisting of CustomerCallEvent events with the results of an SQL query against the database named MyCustomerDB and table Customer:
select custId, cust_name from CustomerCallEvent,
sql:MyCustomerDB [' select cust_name from Customer where cust_id = ${custId} ']The example above assumes that CustomerCallEvent supplies an event property named custId. The SQL query selects the customer name from the Customer table. The where clause in the SQL matches the Customer table column cust_id with the value of custId in each CustomerCallEvent event. The engine executes the SQL query for each new CustomerCallEvent encountered.
If the SQL query returns no rows for a given customer id, the engine generates no output event. Else the engine generates one output event for each row returned by the SQL query. An outer join as described in the next section can be used to control whether the engine should generate output events even when the SQL query returns no rows.
The next example adds a time window of 30 seconds to the event stream CustomerCallEvent. It also renames the selected properties to customerName and customerId to demonstrate how the naming of columns in an SQL query can be used in the select clause in the EPL query. And the example uses explicit stream names via the as keyword.
select customerId, customerName from
CustomerCallEvent.win:time(30 sec) as cce,
sql:MyCustomerDB ["select cust_id as customerId, cust_name as customerName from Customer
where cust_id = ${cce.custId}"] as cqAny window, such as the time window, generates insert stream (istream) events as events enter the window, and remove stream (rstream) events as events leave the window. The engine executes the given SQL query for each CustomerCallEvent in both the insert stream and the remove stream. As a performance optimization, the istream or rstream keywords in the select clause can be used to instruct the engine to only join insert stream or remove stream events, reducing the number of SQL query executions.
Consider using the EPL where clause to join the SQL query result to your event stream. Similar to EPL joins and outer-joins that join event streams or patterns, the EPL where clause provides join criteria between the SQL query results and the event stream (as a side note, an SQL where clause is a filter of rows executed by your database on your database server before returning SQL query results).
Esper analyzes the expression in the EPL where clause and outer-join on clause, if present, and builds the appropriate indexes from that information at runtime, to ensure fast matching of event stream events to SQL query results, even if your SQL query returns a large number of rows. Your applications must ensure to configure a cache for your database using Esper configuration, as such indexes are held with regular data in a cache. If you application does not enable caching of SQL query results, the engine does not build indexes on cached data.
The sample EPL statement below joins an event stream consisting of OrderEvent events with the results of an SQL query against the database named MyRefDB and table SymbolReference:
select symbol, symbolDesc from OrderEvent as orders, sql:MyRefDB ['select symbolDesc from SymbolReference'] as reference where reference.symbol = orders.symbol
Notice how the EPL where clause joins the OrderEvent stream to the SymbolReference table. In this example, the SQL query itself does not have a SQL where clause and therefore returns all rows from table SymbolReference.
If your application enables caching, the SQL query fires only at the arrival of the first OrderEvent event. When the second OrderEvent arrives, the join execution uses the cached query result. If the caching policy that you specified in the Esper database configuration evicts the SQL query result from cache, then the engine fires the SQL query again to obtain a new result and places the result in cache.
If SQL result caching is enabled and your EPL where clause, as show in the above example, provides the properties to join, then the engine indexes the SQL query results in cache and retains the index together with the query result in cache. Thus your application can benefit from high performance index-based lookups as long as the SQL query results are found in cache.
The SQL result caches operate on the level of all result rows for a given parameter set. For example, if your query returns 10 rows for a certain set of parameter values then the cache treats all 10 rows as a single entry keyed by the parameter values, and the expiry policy applies to all 10 rows and not to each individual row.
It is also possible to join multiple autonomous database systems in a single query, for example:
select symbol, symbolDesc from OrderEvent as orders, sql:My_Oracle_DB ['select symbolDesc from SymbolReference'] as reference, sql:My_MySQL_DB ['select orderList from orderHistory'] as history where reference.symbol = orders.symbol and history.symbol = orders.symbol
You can use outer joins to join data obtained from an SQL query and control when an event is produced. Use a left outer join, such as in the next statement, if you need an output event for each event regardless of whether or not the SQL query returns rows. If the SQL query returns no rows, the join result populates null values into the selected properties.
select custId, custName from
CustomerCallEvent as cce
left outer join
sql:MyCustomerDB ["select cust_id, cust_name as custName
from Customer where cust_id = ${cce.custId}"] as cq
on cce.custId = cq.cust_idThe statement above always generates at least one output event for each CustomerCallEvent, containing all columns selected by the SQL query, even if the SQL query does not return any rows. Note the on expression that is required for outer joins. The on acts as an additional filter to rows returned by the SQL query.
Pattern statements and SQL queries can also be applied together in useful ways. One such use is to poll or request data from a database at regular intervals or following the schedule of the crontab-like timer:at.
The next statement is an example that shows a pattern that fires every 5 seconds to query the NewOrder table for new orders:
insert into NewOrders select orderId, orderAmount from pattern [every timer:interval(5 sec)], sql:MyCustomerDB ['select orderId, orderAmount from NewOrders']
Usually your SQL query will take part in a join and thus be triggered by an event or pattern occurrence. Instead, your application may need to poll a SQL query and thus use Esper query execution and caching facilities and obtain event data and metadata.
Your EPL statement can specify an SQL statement without a join. Such a stand-alone SQL statement does not post new events, and may only be queried via the iterator poll API. Your EPL and SQL statement may still use variables.
The next statement assumes that a price_var variable has been declared. It selects from the relational database table named NewOrder all rows in which the price column is greater then the current value of the price_var EPL variable:
select * from sql:MyCustomerDB ['select * from NewOrder where ${price_var} > price']Use the iterator and safeIterator methods on EPStatement to obtain results. The statement does not post events to listeners, it is strictly passive in that sense.
The engine translates SQL queries into JDBC java.sql.PreparedStatement statements by replacing ${name} parameters with '?' placeholders. It obtains name and type of result columns from the compiled PreparedStatement meta data when the EPL statement is created.
The engine supplies parameters to the compiled statement via the setObject method on PreparedStatement. The engine uses the getObject method on the compiled statement PreparedStatement to obtain column values.
Certain JDBC database drivers are known to not return metadata for precompiled prepared SQL statements. This can be a problem as metadata is required by Esper. Esper obtains SQL result set metadata to validate an EPL statement and to provide column types for output events. JDBC drivers that do not provide metadata for precompiled SQL statements require a workaround. Such drivers do generally provide metadata for executed SQL statements, however do not provide the metadata for precompiled SQL statements.
Please consult the Chapter 10, Configuration for the configuration options available in relation to metadata retrieval.
To obtain metadata for an SQL statement, Esper can alternatively fire a SQL statement which returns the same column names and types as the actual SQL statement but without returning any rows. This kind of SQL statement is referred to as a sample statement in below workaround description. The engine can then use the sample SQL statement to retrieve metadata for the column names and types returned by the actual SQL statement.
Applications can provide a sample SQL statement to retrieve metadata via the metadatasql keyword:
sql:database_name ["parameterized_sql_query" metadatasql "sql_meta_query"]
The sql_meta_query must be an SQL statement that returns the same number of columns, the same type of columns and the same column names as the parameterized_sql_query, and does not return any rows.
Alternatively, applications can choose not to provide an explicit sample SQL statement. If the EPL statement does not use the metadatasql syntax, the engine applies lexical analysis to the SQL statement. From the lexical analysis Esper generates a sample SQL statement adding a restrictive clause "where 1=0" to the SQL statement.
Alternatively, you can add the following tag to the SQL statement: ${$ESPER-SAMPLE-WHERE}. If the tag exists in the SQL statement, the engine does not perform lexical analysis and simply replaces the tag with the SQL where clause "where 1=0". Therefore this workaround is applicable to SQL statements that cannot be correctly lexically analyzed. The SQL text after the placeholder is not part of the sample query. For example:
select mycol from sql:myDB [
'select mycol from mytesttable ${$ESPER-SAMPLE-WHERE} where ....'], ...If your parameterized_sql_query SQL query contains vendor-specific SQL syntax, generation of the metadata query may fail to produce a valid SQL statement. If you experience an SQL error while fetching metadata, use any of the above workarounds with the Oracle JDBC driver.
Your application may need to join data that originates from a web service, a distributed cache, an object-oriented database or simply data held in memory by your application. Esper accommodates this need by allowing a method invocation (or procedure call or function) in the from clause of a statement.
The results of such a method invocation in the from clause plays the same role as a relational database table in an inner and outer join in SQL. The role is thus dissimilar to the role of a user-defined function, which may occur in any expression such as in the select clause or the where clause. Both are backed by one or more static methods provided by your class library.
Esper can join and outer join an unlimited number and all types of event streams to the data returned by your method invocation. In addition, Esper can be configured to cache the data returned by your method invocations.
Joins or outer joins in which only SQL statements or method invocations are listed in the from clause and no other event streams are termed passive joins. A passive join does not produce an insert or remove stream and therefore does not invoke statement listeners with results. A passive join can be iterated on (polled) using a statement's safeIterator and iterator methods.
The following restrictions currently apply:
Sub-views on a method invocations are not allowed; That is, one cannot create a time or length window on a method invocation. However one can use the insert into syntax to make join results available to a further statement.
The syntax for a method invocation in the from clause of an EPL statement is:
method:class_name.method_name[(parameter_expressions)]
The method keyword denotes a method invocation. It is followed by a class name and a method name separated by a dot (.) character. If you have parameters to your method invocation, these are placed in round brackets after the method name. Any expression is allowed as a parameter, and individual parameter expressions are separated by a comma. Expressions may also use event properties of the joined stream.
In the sample join statement shown next, the method 'lookupAsset' provided by class 'MyLookupLib' returns one or more rows based on the asset id (a property of the AssetMoveEvent) that is passed to the method:
select * from AssetMoveEvent, method:MyLookupLib.lookupAsset(assetId)
The following statement demonstrates the use of the where clause to join events to the rows returned by a method invocation, which in this example does not take parameters:
select assetId, assetDesc from AssetMoveEvent as asset,
method:MyLookupLib.getAssetDescriptions() as desc
where asset.assetid = desc.assetidYour method invocation may return zero, one or many rows for each method invocation. If you have caching enabled through configuration, then Esper can avoid the method invocation and instead use cached results. Similar to SQL joins, Esper also indexes cached result rows such that join operations based on the where clause or outer-join on clause can be very efficient, especially if your method invocation returns a large number of rows.
If the time taken by method invocations is critical to your application, you may configure local caches as Section 10.4.6, “Cache Settings for From-Clause Method Invocations” describes.
Esper analyzes the expression in the EPL where clause and outer-join on clause, if present, and builds the appropriate indexes from that information at runtime, to ensure fast matching of event stream events to method invocation results, even if your method invocation returns a large number of rows. Your applications must ensure to configure a cache for your method invocation using Esper configuration, as such indexes are held with regular data in a cache. If you application does not enable caching of method invocation results, the engine does not build indexes on cached data.
Usually your method invocation will take part in a join and thus be triggered by an event or pattern occurrence. Instead, your application may need to poll a method invocation and thus use Esper query execution and caching facilities and obtain event data and metadata.
Your EPL statement can specify a method invocation in the from clause without a join. Such a stand-alone method invocation does not post new events, and may only be queried via the iterator poll API. Your EPL statement may still use variables.
The next statement assumes that a category_var variable has been declared. It polls the getAssetDescriptions method passing the current value of the category_var EPL variable:
select * from method:MyLookupLib.getAssetDescriptions(category_var)]
Use the iterator and safeIterator methods on EPStatement to obtain results. The statement does not post events to listeners, it is strictly passive in that sense.
Your application must provide a Java class that exposes a public static method. The method must accept the same number and type of parameters as listed in the parameter expression list.
If your method invocation returns either no row or only one row, then the return type of the method can be a Java class or a java.util.Map. If your method invocation can return more then one row, then the return type of the method must be an array of Java class or an array of Map.
If you are using a Java class or an array of Java class as the return type, then the class must adhere to JavaBean conventions: it must expose properties through getter methods.
If you are using java.util.Map as the return type or an array of Map, then the map should have String-type keys and object values (Map<String, Object>). When using Map as the return type, your application must provide a second method that returns property metadata, as the next section outlines.
Your application method must return either of the following:
A null value or an empty array to indicate an empty result (no rows).
A Java object or Map to indicate a one-row result, or an array that consists of a single Java object or Map.
An array of Java objects or Map instances to return multiple result rows.
As an example, consider the method 'getAssetDescriptions' provided by class 'MyLookupLib' as discussed earlier:
select assetId, assetDesc from AssetMoveEvent as asset,
method:com.mypackage.MyLookupLib.getAssetDescriptions() as desc
where asset.assetid = desc.assetidThe 'getAssetDescriptions' method may return multiple rows and is therefore declared to return an array of the class 'AssetDesc'. The class AssetDesc is a POJO class (not shown here):
public class MyLookupLib {
...
public static AssetDesc[] getAssetDescriptions() {
...
return new AssetDesc[] {...};
}The example above specifies the full Java class name of the class 'MyLookupLib' class in the EPL statement. The package name does not need to be part of the EPL if your application imports the package using the auto-import configuration through the API or XML, as outlined in Section 10.4.5, “Class and package imports”.
Your application may return java.util.Map or an array of Map from method invocations. If doing so, your application must provide metadata about each row: it must declare the property name and property type of each Map entry of a row. This information allows the engine to perform type checking of expressions used within the statement.
You declare the property names and types of each row by providing a method that returns property metadata. The metadata method must follow these conventions:
The method name providing the property metadata must have same method name appended by the literal Metadata.
The method must have an empty parameter list and must be declared public and static.
The method providing the metadata must return a Map of String property name keys and java.lang.Class property name types (Map<String, Class>).
In the following example, a class 'MyLookupLib' provides a method to return historical data based on asset id and asset code:
select assetId, location, x_coord, y_coord from AssetMoveEvent as asset,
method:com.mypackage.MyLookupLib.getAssetHistory(assetId, assetCode) as historyA sample implementation of the class 'MyLookupLib' is shown below.
public class MyLookupLib {
...
// For each column in a row, provide the property name and type
//
public static Map<String, Class> getAssetHistoryMetadata() {
Map<String, Class> propertyNames = new HashMap<String, Class>();
propertyNames.put("location", String.class);
propertyNames.put("x_coord", Integer.class);
propertyNames.put("y_coord", Integer.class);
return propertyNames;
}
...
// Lookup rows based on assetId and assetCode
//
public static Map<String, Object>[] getAssetHistory(String assetId, String assetCode) {
Map rows = new Map[2]; // this sample returns 2 rows
for (int i = 0; i < 2; i++) {
rows[i] = new HashMap();
rows[i].put("location", "somevalue");
rows[i].put("x_coord", 100);
// ... set more values for each row
}
return rows;
}In the example above, the 'getAssetHistoryMetadata' method provides the property metadata: the names and types of properties in each row. The engine calls this method once per statement to determine event typing information.
The 'getAssetHistory' method returns an array of Map objects that are two rows. The implementation shown above is a simple example. The parameters to the method are the assetId and assetCode properties of the AssetMoveEvent joined to the method. The engine calls this method for each insert and remove stream event in AssetMoveEvent.
To indicate that no rows are found in a join, your application method may return either a null value or an array of size zero.
A named window is a global data window that can take part in many statement queries, and that can be inserted-into and deleted-from by multiple statements. A named window holds events of the same type or supertype, unless used with a variant stream.
The create window clause declares a new named window. The named window starts up empty unless populated from an existing named window at time of creation. Events must be inserted into the named window using the insert into clause. Events can also be deleted from a named window via the on delete clause.
Events enter the named window by means of insert into clause of a select statement. Events leave a named window either because the expiry policy of the declared data window removes events from the named window, or through statements that use the on delete clause to explicitly delete from a named window.
To query a named window, simply use the window name in the from clause of your statement, including statements that contain subqueries, joins and outer-joins.
A named window may also decorate an event to preserve original events as described in Section 4.10.4, “Decorated Events” and Section 4.17.3.1, “Named Windows Holding Decorated Events”. In addition, columns of a named window are allowed to hold events themselves, as further explained in Section 4.10.5, “Event as a Property” and Section 4.17.3.2, “Named Windows Holding Events As Property”.
The create window statement creates a named window by specifying a window name and one or more data window views, as well as the type of event to hold in the named window.
There are two syntaxes for creating a named window: The first syntax allows to model a named window after an existing event type or an existing named window. The second syntax is similar to the SQL create-table syntax and provides a list of column names and column types.
A new named window starts up empty. It must be explicitly inserted into by one or more statements, as discussed below. A named window can also be populated at time of creation from an existing named window.
If your application stops or destroys the statement that creates the named window, any consuming statements no longer receive insert or remove stream events. The named window can also not be deleted from after it was stopped or destroyed.
The create window statement posts to listeners any events that are inserted into the named window as new data. The statement posts all deleted events or events that expire out of the data window to listeners as the remove stream (old data). The named window contents can also be iterated on via the pull API to obtain the current contents of a named window.
The benefit of modelling a named window after an existing event type is that event properties can be nested, indexed, mapped or other types that your event objects may provide as properties, including the type of the underlying event itself. Also, using the wildcard (*) operator means your EPL does not need to list each individual property explicitly.
The syntax for creating a named window by modelling the named window after an existing event type, is as follows:
create window window_name.view_specifications [as] [select list_of_properties from] event_type_or_windowname [insert [where filter_expression]]
The window_name you assign to the named window can be any identifier. The name should not already be in use as an event type or stream name.
The view_specifications are one or more data window views that define the expiry policy for removing events from the data window. Named windows must explicitly declare a data window view. This is required to ensure that the policy for retaining events in the data window is well defined. To keep all events, use the keep-all view: It indicates that the named window should keep all events and only remove events from the named window that are deleted via the on delete clause. The view specification can only list data window views, derived-value views are not allowed since these don't represent an expiry policy. Data window views are listed in Chapter 8, EPL Reference: Views. View parameterization and staggering are described in Section 4.4.3, “Specifying Views”.
The select clause and list_of_properties are optional. If present, they specify the column names and, implicitly by definition of the event type, the column types of events held by the named window. Expressions other then column names are not allowed in the select list of properties. Wildcards (*) and wildcards with additional properties can also be used.
The event_type_or_windowname is required if using the model-after syntax. It provides the name of the event type of events held in the data window, unless column names and types have been explicitly selected via select. The name of an (existing) other named window is also allowed here. Please find more details in Section 4.17.7, “Populating a Named Window from an Existing Named Window”.
Finally, the insert clause and optional filter_expression are used if the new named windows is modelled after an existing named window, and the data of the existing named window is to be populated, upon time of creation of the new window, from the existing named window. The optional filter_expression can be used to exclude events.
The next statement creates a named window 'AllOrdersNamedWindow' for which the expiry policy is simply to keep all events. Assume that the event type 'OrderMapEventType' has been configured. The named window is to hold events of type 'OrderMapEventType':
create window AllOrdersNamedWindow.win:keepall() as OrderMapEventType
The below sample statement demonstrates the select syntax. It defines a named window in which each row has the three properties 'symbol', 'volume' and 'price'. This named window actively removes events from the window that are older then 30 seconds.
create window OrdersTimeWindow.win:time(30 sec) as select symbol, volume, price from OrderEvent
In an alternate form, the as keyword can be used to rename columns, and constants may occur in the select clause as well:
create window OrdersTimeWindow.win:time(30 sec) as select symbol as sym, volume as vol, price, 1 as alertId from OrderEvent
The second syntax for creating a named window is by supplying column names and types:
create window window_name.view_specifications [as] (column_name column_type [,column_name column_type [,...])
The column_name is an identifier providing the event property name. The column_type is also required for each column. Valid column types are listed in Section 4.18.1, “Creating Variables: the Create Variable clause” and are the same as for variable types.
The next statement creates a named window:
create window SecurityEvent.win:time(30 sec)
(ipAddress string, userId String, numAttempts int)There is no syntax to drop or remove a named window.
The destroy method on the EPStatement that created the named window removes the named window. However the implicit event type associated with the named window remains active since further statements may continue to use that type. Therefore a named window of the same name can only be created again if the type information matches the prior declaration for a named window.
An on delete clause removes events from a named window. The clause can be used to remove all events, or only events that match certain criteria, or events that correlate with an arriving event or a pattern of arriving events.
The syntax for the on delete clause is as follows:
on event_type[(filter_criteria)] [as stream_name] delete from window_name [as stream_name] [where criteria_expression]
The event_type is the name of the type of events that trigger removal from the named window. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign an name for use in the where clause. Patterns can also be specified in the on clause as described in the next section.
The window_name is the name of the named window to delete events from. The as keyword is also available to assign a name to the named window.
The optional where clause contains a criteria_expression that correlates the arriving (triggering) event to the events to be removed from the named window. The criteria_expression may also simply filter for events in the named window to be removed from the named window.
The iterator of the EPStatement object representing the on delete clause can also be helpful: It returns the last batch of deleted events in response to the last triggering event, in any order, or null if the last triggering event did not remove any rows.
Let's look at a couple of examples. In the simplest form, this statement deletes all events from the named window 'AllOrdersNamedWindow' when any 'FlushOrderEvent' arrives:
on FlushOrderEvent delete from AllOrdersNamedWindow
This example adds a where clause to the example above. Upon arrival of a triggering 'ZeroVolumeEvent', the statement removes from the named window any orders that have a volume of zero or less:
on ZeroVolumeEvent delete from AllOrdersNamedWindow where volume <= 0
The next example shows a more complete use of the syntax, and correlates the triggering event with events held by the named window:
on NewOrderEvent(volume>0) as myNewOrders delete from AllOrdersNamedWindow as myNamedWindow where myNamedWindow.symbol = myNewOrders.symbol
In the above sample statement, only if a 'NewOrderEvent' event with a volume greater then zero arrives does the statement trigger. Upon triggering, all events in the named window that have the same value for the symbol property as the triggering 'NewOrderEvent' event are then removed from the named window. The statement also showcases the as keyword to assign a name for use in the where expression.
For correlated queries (as above) that correlate triggering events with events held by a named window, Esper internally creates efficient indexes to enable high performance removal of events especially from named windows that hold large numbers of events.
Your application can subscribe a listener to your on delete statements to determine removed events. The statement post any events that are deleted from a named window to all listeners attached to the statement as new data. Upon iteration, the statement provides the last deleted event, if any.
By means of patterns the on delete clause and on select clause (described below) can look for more complex conditions to occur, possibly involving multiple events or the passing of time. The syntax for on delete with a pattern expression is show next:
on pattern [pattern_expression] [as stream_name] delete from window_name [as stream_name] [where criteria_expression]
The pattern_expression is any pattern that matches zero or more arriving events. Tags can be used to name events in the pattern and can occur in the optional where clause to correlate to events to be removed from a named window.
In the next example the triggering pattern fires every 10 seconds. The effect is that every 10 seconds the statement removes from 'MyNamedWindow' all rows:
on pattern [every timer:interval(10 sec)] delete from MyNamedWindow
The following example shows the use of tags in a pattern:
on pattern [every ord=OrderEvent(volume>0) or every flush=FlushOrderEvent] delete from OrderWindow as win where ord.id = win.id or flush.id = win.id
The pattern above looks for OrderEvent events with a volume value greater then zero and tags such events as 'ord'. The pattern also looks for FlushOrderEvent events and tags such events as 'flush'. The where clause deletes from the 'OrderWindow' named window any events that match in the value of the 'id' property either of the arriving events.
The insert into clause inserts events into named windows. Your application must ensure that the column names and types match the declared column names and types of the named window to be inserted into.
In this example we first create a named window using some of the columns of an OrderEvent event type:
create window OrdersWindow.win:keepall() as select symbol, volume, price from OrderEvent
The insert into the named window selects individual columns to be inserted:
insert into OrdersWindow(symbol, volume, price) select name, count, price from FXOrderEvent // .. alternative form... insert into OrdersWindow select name as symbol, vol as volume, price from FXOrderEvent
Following above statement, the engine enters every FXOrderEvent arriving into the engine into the named window 'OrdersWindow'.
The following EPL creates a named window for an event type backed by a Java class, and inserts into the window any 'OrderEvent' where the symbol value is IBM:
create window OrdersWindow as com.mycompany.OrderEvent insert into OrdersWindow select * from com.mycompany.OrderEvent(symbol='IBM')
The last example adds one column named 'derivedPrice' to the 'OrderEvent' type by specifying a wildcard, and uses a user-defined function to populate the column:
create window OrdersWindow as select *, price as derivedPrice from OrderEvent insert into OrdersWindow select *, MyFunc.func(price, percent) as derivedPrice from OrderEvent
Event representations based on Java base classes or interfaces, and subclasses or implementing classes, are compatible as these statements show:
// create a named window for the base class create window OrdersWindow as select * from ProductBaseEvent // The ServiceProductEvent class subclasses the ProductBaseEvent class insert into OrdersWindow select * from ServiceProductEvent // The MerchandiseProductEvent class subclasses the ProductBaseEvent class insert into OrdersWindow select * from MerchandiseProductEvent
To avoid duplicate events stored in a named window, use a subquery to test whether an event already exists in the named window:
insert into OrdersWindow select * from ServiceProductEvent as spe where not exists (select * from OrdersWindow as win where win.id = spe.id)
A statement that removes events from a named window via the on delete clause and a statement that inserts events into a named window via the insert into can be combined to replace events in the named window, by creating the two statements in the order as indicated by the sample:
// create in this order on ServiceProductEvent as spe delete from OrdersWindow as win where win.id = spe.id insert into OrdersWindow select * from ServiceProductEvent
Decorated events hold an underlying event and add additional properties to the underlying event, as described further in Section 4.10.4, “Decorated Events”.
Here we create a named window that decorates OrderEvent events by adding an additional property named priceTotal to each OrderEvent. A matching insert into statement is also part of the sample:
create window OrdersWindow as select *, price as priceTotal from OrderEvent insert into OrdersWindow select *, price * unit as priceTotal from ServiceOrderEvent
The property type of the additional priceTotal column is the property type of the existing price property of OrderEvent.
Columns in a named window may also hold an event itself. More information on the insert into clause providing event columns is in Section 4.10.5, “Event as a Property”.
The next sample creates a named window that specifies two columns: A column that holds an OrderEvent, and a column by name priceTotal. A matching insert into statement is also part of the sample:
create window OrdersWindow as select this, price as priceTotal from OrderEvent insert into OrdersWindow select order, price * unit as priceTotal from ServiceOrderEvent as order
Note that the this proprerty must exist on the event and must return the event class itself (JavaBean events only). The property type of the additional priceTotal column is the property type of the existing price property.
A named window can be referred to by any statement in the from clause of the statement. Filter criteria can also be specified. Additional views may be used onto named windows however such views cannot include data window views.
A statement selecting all events from a named window 'AllOrdersNamedWindow' is shown next. The named window must first be created via the create window clause before use.
select * from AllOrdersNamedWindow
The statement as above simply receives the unfiltered insert and remove stream of the named window and reports that stream to its listeners. An iterator on such statement returns all events in the named window, if any.
The next statement derives an average price per symbol from all events posted by a named window:
select symbol, avg(price) from AllOrdersNamedWindow group by symbol
Your application may create a consuming statement such as above on an empty named window, or your application may create the above statement on an already filled named window. The engine provides correct results in either case: At the time of statement creation the Esper engine internally initializes the consuming statement from the current named window, also taking your declared filters into consideration. Thus, your statement deriving data from a named window does not start empty if the named window already holds one or more events. A consuming statement also sees the remove stream of an already populated named window, if any.
If you require a subset of the data in the named window, you can specify one or more filter expressions onto the named window as shown here:
select symbol, avg(price) from AllOrdersNamedWindow(sector='energy') group by symbol
By adding a filter to the named window, the aggregation and grouping as well as any views that may be declared onto to the named window receive a filtered insert and remove stream. The above statement thus outputs, continuously, the average price per symbol for all orders in the named window that belong to a certain sector.
A side note on variables in filters filtering events from named windows: The engine initializes consuming statements at statement creation time and changes aggregation state continuously as events arrive. If the filter criteria contain variables and variable values changes, then the engine does not re-evaluate or re-build aggregation state. In such a case you may want to place variables in the having clause which evaluates on already-built aggregation state.
The following example further declares a view into the named window. Such a view can be a plug-in view or one of the built-in views, but cannot be a data window view (with the exception of the group-by view which is allowed).
select * from AllOrdersNamedWindow(volume>0, price>0).mycompany:mypluginview()
Data window views cannot be used onto named windows since named windows post insert and remove streams for the events entering and leaving the named window, thus the expiry policy and batch behavior are well defined by the data window declared for the named window. For example, the following is not allowed and fails at time of statement creation:
// not a valid statement select * from AllOrdersNamedWindow.win:time(30 sec)
The on select clause performs a one-time, non-continuous query on a named window every time a triggering event arrives or a triggering pattern matches. The query can consider all events in the named window, or only events that match certain criteria, or events that correlate with an arriving event or a pattern of arriving events.
The syntax for the on select clause is as follows:
on event_type[(filter_criteria)] [as stream_name] [insert into insert_into_def] select select_list from window_name [as stream_name] [where criteria_expression] [group by grouping_expression_list] [having grouping_search_conditions] [order by order_by_expression_list]
The event_type is the name of the type of events that trigger the query against the named window. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign an stream name. Patterns can also be specified in the on clause, see the samples in Section 4.17.2.1, “Using Patterns in the On Delete Clause”.
The insert into clause works as described in Section 4.10, “Merging Streams and Continuous Insertion: the Insert Into Clause”. The select clause is described in Section 4.3, “Choosing Event Properties And Events: the Select Clause”. For all clauses the semantics are equivalent to a join operation: The properties of the triggering event or events are available in the select clause and all other clauses.
The window_name in the from clause is the name of the named window to select events from. The as keyword is also available to assign a stream name to the named window. The as keyword is helpful in conjunction with wildcard in the select clause to select named window events via the syntax select streamname.* .
The optional where clause contains a criteria_expression that correlates the arriving (triggering) event to the events to be considered from the named window. The criteria_expression may also simply filter for events in the named window to be considered by the query.
The group by clause, the having clause and the order by clause are all optional and work as described in earlier chapters.
The similarities and differences between an on select clause and a regular or outer join are as follows:
A join is evaluated when any of the streams participating in the join have new events (insert stream) or events leaving data windows (remove stream). A join is therefore bi-directional or multi-directional. However, the on select statement has one triggering event or pattern that causes the query to be evaluated and is thus uni-directional.
The query within the on select statement is not continuous: It executes only when a triggering event or pattern occurs. Aggregation and groups are computed anew considering the contents of the named window at the time the triggering event arrives.
The iterator of the EPStatement object representing the on select clause returns the last batch of selected events in response to the last triggering event, or null if the last triggering event did not select any rows.
For correlated queries that correlate triggering events with events held by a named window, Esper internally creates efficient indexes to enable high performance querying of events. It analyzes the where clause to build one or more indexes for fast lookup in the named window based on the properties of the triggering event.
The next statement demonstrates the concept. Upon arrival of a QueryEvent event the statement selects all events in the 'OrdersNamedWindow' named window:
on QueryEvent select win.* from OrdersNamedWindow as win
The engine executes the query on arrival of a triggering event, in this case a QueryEvent. It posts the query results to any listeners to the statement, in a single invocation, as the new data array. By prefixing the wildcard (*) selector with the stream name, the select clause returns only events of the named window and does not also return triggering events.
The where clause filters and correlates events in the named window with the triggering event, as shown next:
on QueryEvent(volume>0) as query select query.symbol, query.volume, win.symbol from OrdersNamedWindow as win where win.symbol = query.symbol
Upon arrival of a QueryEvent, if that event has a value for the volume property that is greater then zero, the engine executes the query. The query considers all events currently held by the 'OrdersNamedWindow' that match the symbol property value of the triggering QueryEvent event. The engine then posts query results to the statement's listeners.
Aggregation, grouping and ordering of results are possible as this example shows:
on QueryEvent as queryEvent select symbol, sum(volume) from OrdersNamedWindow as win group by symbol having volume > 0 order by symbol
The above statement outputs the total volume per symbol for those groups where the sum of the volume is greater then zero, ordered by symbol ascending. The engine computes and posts the output based on the current contents of the 'OrdersNamedWindow' named window considering all events in the named window, since the query does not have a where clause.
When using wildcard (*) to select from streams in an on-select clause, each stream, that is the the triggering stream and the selected-upon named window, are selected, similar to a join. Therefore your wildcard select returns two columns: the triggering event and the selection result event, for each row.
on QueryEvent as queryEvent select * from OrdersNamedWindow as win
The query above returns a queryEvent column and a win column for each event. If only a single stream's event is desired in the result, use select win.* instead.
The on insert clause is an on select clause as described in the prior chapter with the addition of an insert into clause.
Similar to the on select clause, the engine executes the query when a triggering event arrives. It then provides the query results as an event stream to further statements. It populates the event stream that is named in the insert into clause.
The statement below provides the query results to any consumers of the MyOrderStream, upon arrival of a QueryEvent event:
on QueryEvent as query insert into MyOrderStream select win.* from OrdersNamedWindow as win
Here is a sample consuming statement of the MyOrderStream. The statement further filters the events provided by the on insert statement by user id and reports a total of volume per symbol:
select symbol, sum(volume) from MyOrderStream(userId='user1') group by symbol
Your EPL statement may specify the name of an existing named window when creating a new named window, and may use the insert keyword to indicate that the new named window is to be populated from the events currently held by the existing named window.
For example, and assuming the named window OrdersNamedWindow already exists, this statement creates a new named window ScratchOrders and populates all orders in OrdersNamedWindow into the new named window:
create window ScratchOrders as OrdersNamedWindow insert
The where keyword is also available to perform filtering, for example:
create window ScratchBuyOrders as OrdersNamedWindow insert where side = 'buy'
As outlined in Section 2.9, “Updating and Versioning Events”, revision event types process updates or new versions of events held by a named window.
A revision event type is simply one or more existing event types whose events are related by event properties that provide same key values. The purpose of key values is to indicate that arriving events are related: An event amends, updates or adds properties to an earlier event that shares the same key values.
Revision event types can be useful in these situations:
Some of your events carry only partial information that is related to a prior event.
Events arrive that add additional properties or change existing properties of prior events.
Events may carry properties that have null values or properties that do no exist (for example events backed by Map or XML), and for such properties the earlier value must be used instead.
To better illustrate, consider a revision event type that represents events for creation and updates to user profiles. Lets assume the user profile creation events carry the user id and a full profile. The profile update events indicate only the user id and the individual properties that actually changed. The user id property shall serve as a key value relating profile creation events and update events.
A revision event type must be configured to instruct the engine which event types participate and what their key properties are. Configuration is described in Section 10.4.19, “Revision Event Type” and is not shown here.
Assume that an event type UserProfileRevisions has been configured to hold profile events, i.e. creation and update events related by user id. This statement creates a named window to hold the last 1 hour of current profiles per user id:
create window UserProfileWindow.win:time(1 hour) select * from UserProfileRevisions insert into UserProfileWindow select * from UserProfileCreation insert into UserProfileWindow select * from UserProfileUpdate
In revision event types, the term base event is used to describe events that are subject to update. Events that update, amend or add additional properties to base events are termed delta events. In the example, base events are profile creation events and delta events are profile update events.
Base events are expected to arrive before delta events. In the case where a delta event arrives and is not related by key value to a base event or a revision of the base event currently held by the named window the engine ignores the delta event. Thus, considering the example, profile update events for a user id that does not have an existing profile in the named window are not applied.
When a base or delta event arrives, the insert and remove stream output by the named window are the current and the prior version of the event. Let's come back to the example. As creation events arrive that are followed by update events or more creation events for the same user id, the engine posts the current version of the profile as insert stream (new data) and the prior version of the profile as remove stream (old data).
Base events are also implicitly delta events. That is, if multiple base events of the same key property values arrive, then each base event provides a new version. In the example, if multiple profile creation events arrive for the same user id then new versions of the current profile for that user id are output by the engine for each base event, as it does for delta events.
The expiry policy as specified by view definitions applies to each distinct key value, or multiple distinct key values for composite keys. An expiry policy re-evaluates when new versions arrive. In the example, user profile events expire from the time window when no creation or update event for a given user id has been received for 1 hour.
Several strategies are available for merging or overlaying events as the configuration chapter describes in greater detail.
Any of the Map, XML and JavaBean event representations as well as plug-in event representations may participate in a revision event type. For example, profile creation events could be JavaBean events, while profile update events could be java.util.Map events.
Delta events may also add properties to the revision event type. For example, one could add a new event type with security information to the revision event type and such security-related properties become available on the resulting revision event type.
The following restrictions apply to revision event types:
Nested properties are only supported for the JavaBean event representation. Nested properties are not individually versioned; they are instead versioned by the containing property.
Dynamic, indexed and mapped properties are only supported for nested properties and not as properties of the revision event type itself.
A variable is a scalar value that is available for use in all statements including patterns. Variables can be used in an expression anywhere in a statement as well as in the output clause for output rate limiting.
Variables must first be declared or configured before use, by defining each variable's type and name. Variables can be created via the create variable syntax or declared by configuration. Variables can be assigned new values by using the on set syntax or via the setVariableValue methods on EPRuntime. The EPRuntime also provides method to read variable values.
The engine guarantees consistency and atomicity of variable reads and writes on a statement-level (this is a soft guarantee, see below). Variables are optimized for fast read access and are also multithread-safe.
The create variable syntax creates a new variable by defining the variable type and name. In alternative to the syntax, variables can also be declared in the runtime and engine configuration options.
The synopsis for creating a variable is as follows:
create variable variable_type variable_name [ = assignment_expression ]
The variable_type can be any of the following:
variable_type : string | char | character | bool | boolean | byte | short | int | integer | long | double | float
The variable_name is an identifier that names the variable. The variable name should not already be in use by another variable.
The assignment_expression is optional. Without an assignment expression the initial value for the variable is null. If present, it supplies the initial value for the variable.
The EPStatement object of the create variable statement provides access to variable values. The pull API methods iterator and safeIterator return the current variable value. Listeners to the create variable statement subscribe to changes in variable value: the engine posts new and old value of the variable to all listeners when the variable value is updated by an on set statement.
The example below creates a variable that provides a threshold value. The name of the variable is var_threshold and its type is long. The variable's initial value is null as no other value has been assigned:
create variable long var_threshold
This statement creates an integer-type variable named var_output_rate and initializes it to the value ten (10):
create variable integer var_output_rate = 10
In addition to creating a variable via the create variable syntax, the runtime and engine configuration API also allows adding variables. The next code snippet illustrates the use of the runtime configuration API to create a string-typed variable:
epService.getEPAdministrator().getConfiguration()
.addVariable("myVar", String.class, "init value");The on set statement assigns a new value to one or more variables when a triggering event arrives or a triggering pattern occurs. Use the setVariableValue methods on EPRuntime to assign variable values programmatically.
The synopsis for setting variable values is:
on event_type[(filter_criteria)] [as stream_name] set variable_name = expression [, variable_name = expression [,...]]
The event_type is the name of the type of events that trigger the variable assignments. It is optionally followed by filter_criteria which are filter expressions to apply to arriving events. The optional as keyword can be used to assign an stream name. Patterns can also be specified in the on clause.
The comma-separated list of variable names and expressions set the value of one or more variables. All new variable values are applied atomically: the changes to variable values by the on set statement become visible to other statements all at the same time. No changes are visible to other processing threads until the on set statement completed processing, and at that time all changes become visible at once.
The EPStatement object provides access to variable values. The pull API methods iterator and safeIterator return the current variable values for each of the variables set by the statement. Listeners to the statement subscribe to changes in variable values: the engine posts new variable values of all variables to any listeners.
In the following example, a variable by name var_output_rate has been declared previously. When a NewOutputRateEvent event arrives, the variable is updated to a new value supplied by the event property 'rate':
on NewOutputRateEvent set var_output_rate = rate
The next example shows two variables that are updated when a ThresholdUpdateEvent arrives:
on ThresholdUpdateEvent as t
set var_threshold_lower = t.lower,
var_threshold_higher = t.higherThe sample statement shown next counts the number of pattern matches using a variable. The pattern looks for OrderEvent events that are followed by CancelEvent events for the same order id within 10 seconds of the OrderEvent:
on pattern[every a=OrderEvent -> (CancelEvent(orderId=a.orderId) where timer:within(10 sec))] set var_counter = var_counter + 1
A variable name can be used in any expression and can also occur in an output rate limiting clause. This section presents examples and discusses performance, consistency and atomicity attributes of variables.
The next statement assumes that a variable named 'var_threshold' was created to hold a total price threshold value. The statement outputs an event when the total price for a symbol is greater then the current threshold value:
select symbol, sum(price) from TickEvent group by symbol having sum(price) > var_threshold
In this example we use a variable to dynamically change the output rate on-the-fly. The variable 'var_output_rate' holds the current rate at which the statement posts a current count to listeners:
select count(*) from TickEvent output every var_output_rate seconds
Variables are optimized towards high read frequency and lower write frequency. Variable reads do not incur locking overhead (99% of the time) while variable writes do incur locking overhead.
The engine softly guarantees consistency and atomicity of variables when your statement executes in response to an event or timer invocation. Variables acquire a stable value (implemented by versioning) when your statement starts executing in response to an event or timer invocation, and variables do not change value during execution. When one or more variable values are updated via on set statements, the changes to all updated variables become visible to statements as one unit and only when the on set statement completes successfully.
The atomicity and consistency guarantee is a soft guarantee. If any of your application statements, in response to an event or timer invocation, execute for a time interval longer then 15 seconds (default interval length), then the engine may use current variable values after 15 seconds passed, rather then then-current variable values at the time the statement started executing in response to an event or timer invocation.
The length of the time interval that variable values are held stable for the duration of execution of a given statement is by default 15 seconds, but can be configured via engine default settings.
Contained-event selection is for use when an event contains properties that are themselves events. For example when application events are coarse-grained structures and you need to perform bulk operations on the rows of the property graph in an event.
Use the contained-event selection syntax in a filter expression such as in a pattern, from clause, subselect, on-select and on-delete. This section provides the synopsis and examples.
To review, in the from clause a contained_selection may appear after the event stream name and filter criteria, and before any view specifications.
The synopsis for contained_selection is as follows:
[select select_expressions from] property_expression [as property_alias] [where filter_expression]
The select clause and select_expressions are optional and may be used to select specific properties.
The property_expression is required and must be a valid property name that returns an event fragment, i.e. a property that can itself be represented as an event by the underlying event representation. Simple values such as integer or string are not fragments.
The property_alias can be provided to assign a name to the property expression.
The where clause and filter_expression is optional and may be used to filter out properties.
As an example event, consider a media order. A media order consists of order items as well as product descriptions. A media order event can be represented as an object graph (POJO event representation), or a structure of nested Maps (Map event representation) or a XML document (XML DOM or Axiom event representation) or other custom plug-in event representation.
To illustrate, a sample media order event in XML event representation is shown below. Also, a XML event type can optionally be strongly-typed with an explicit XML XSD schema that we don't show here. Note that Map and POJO representation can be considered equivalent for the purpose of this example.
Let us now assume that we have declared the event type MediaOrder as being represented by the root node <mediaorder> of such XML snip:
<mediaorder>
<orderId>PO200901</orderId>
<items>
<item>
<itemId>100001</itemId>
<productId>B001</productId>
<amount>10</amount>
<price>11.95</price>
</item>
</items>
<books>
<book>
<bookId>B001</bookId>
<author>Heinlein</author>
<review>
<reviewId>1</reviewId>
<comment>best book ever</comment>
</review>
</book>
<book>
<bookId>B002</bookId>
<author>Isaac Asimov</author>
</book>
</books>
</mediaorder>The next query utilizes the contained-event selection syntax to return each book:
select * from MediaOrder[books.book]
The result of the above query is one event per book. Output events contain only the book properties and not any of the mediaorder-level properties.
Note that, when using listeners, the engine delivers multiple results in one invocation of each listener. Therefore listeners to the above statement can expect a single invocation passing all book events within one media order event as an array.
To better illustrate the position of the contained-event selection syntax in a statement, consider the next two queries:
select * from MediaOrder(orderId='PO200901')[books.book]
The above query the returns each book only for media orders with a given order id. This query illustrates a contained-event selection and a view:
select count(*) from MediaOrder[books.book].std:unique(bookId)
The sample above counts each book unique by book id.
Contained-event selection can be staggered. When staggering multiple contained-event selections the staggered contained-event selection is relative to its parent.
This example demonstrates staggering contained-event selections by selecting each review of each book:
select * from MediaOrder[books.book][review]
Listeners to the query above receive a row for each review of each book. Output events contain only the review properties and not the book or media order properties.
The following is not valid:
// not valid select * from MediaOrder[books.book.review]
The book property in an indexed property (an array or collection) and thereby requires an index in order to determine which book to use. The expression books.book[1].review is valid and means all reviews of the second (index 1) book.
The contained-event selection syntax is part of the filter expression and may therefore occur in patterns and anywhere a filter expression is valid.
A pattern example is below. The example assumes that a Cancel event type has been defined that also has an orderId property:
select * from pattern [c=Cancel -> books=MediaOrder(orderId = c.orderId)[books.book] ]
When used in a pattern, a filter with a contained-event selection returns an array of events, similar to the match-until clause in patterns. The above statement returns, in the books property, an array of book events.
The optional select clause provides control over which fields are available in output events. The expressions in the select clause apply only to the properties available underneath the property in the from clause, and the properties of the enclosing event.
When no select is specified, only the properties underneath the selected property are available in output events.
In summary, the select clause may contain:
Any expressions, wherein properties are resolved relative to the property in the from clause.
Use the wildcard (*) to provide all properties that exist under the property in the from clause.
Use the property_alias.* syntax to provide all properties that exist under a property in the from clause.
The next query's select clause selects each review for each book, and the order id as well as the book id of each book:
select * from MediaOrder[select orderId, bookId from books.book][select * from review] // ... equivalent to ... select * from MediaOrder[select orderId, bookId from books.book][review]]
Listeners to the statement above receive an event for each review of each book. Each output event has all properties of the review row, and in addition the bookId of each book and the orderId of the order. Thus bookId and orderId are found in each result event, duplicated when there are multiple reviews per book and order.
The above query uses wildcard (*) to select all properties from reviews. As has been discussed as part of the select clause, the wildcard (*) and property_alias.* do not copy properties for performance reasons. The wildcard syntax instead specifies the underlying type, and additional properties are added onto that underlying type if required. Only one wildcard (*) and property_alias.* (unless used with a column rename) may therefore occur in the select clause list of expressions.
All the following queries produce an output event for each review of each book. The next sample queries illustrate the options available to control the fields of output events.
The output events produced by the next query have all properties of each review and no other properties available:
select * from MediaOrder[books.book][review]
The following query is not a valid query, since the order id and book id are not part of the contained-event selection:
// Invalid select clause: orderId and bookId not produced. select orderId, bookId from MediaOrder[books.book][review]
This query is valid. Note that output events carry only the orderId and bookId properties and no other data:
select orderId, bookId from MediaOrder[books.book][select orderId, bookId from review] //... equivalent to ... select * from MediaOrder[select orderId, bookId from books.book][review]
This variation produces output events that have all properties of each book and only reviewId and comment for each review:
select * from MediaOrder[select * from books.book][select reviewId, comment from review] // ... equivalent to ... select * from MediaOrder[books.book as book][select book.*, reviewId, comment from review]
The output events of the next EPL have all properties of the order and only bookId and reviewId for each review:
select * from MediaOrder[books.book as book]
[select mediaOrder.*, bookId, reviewId from review] as mediaOrderThis EPL produces output events with 3 columns: a column named mediaOrder that is the order itself, a column named book for each book and a column named review that holds each review:
insert into ReviewStream select * from MediaOrder[books.book as book] [select mo.* as mediaOrder, book.* as book, review.* as review from review as review] as mo // .. and a sample consumer of ReviewStream... select mediaOrder.orderId, book.bookId, review.reviewId from ReviewStream
Please note these limitations:
Sub-selects, aggregation functions and the prev and prior operators are not available in contained-event selection.
Expressions in the select and where clause of a contained-event selection can only reference properties relative to the current event and property.
The optional where clause may be used to filter out properties at the same level that the where-clause occurs.
The properties in the filter expression must be relative to the property in the from clause or the enclosing event.
This query outputs all books with a given author:
select * from MediaOrder[books.book where author = 'Heinlein']
This query outputs each review of each book where a review comment contains the word 'good':
select * from MediaOrder[books.book][review where comment like 'good']
This section discusses contained-event selection in joins.
When joining within the same event it is not required that views are specified. Recall, in a join or outer join there must be views specified that hold the data to be joined. For self-joins, no views are required and the join executes against the data returned by the same event.
This query inner-joins items to books where book id matches the product id:
select book.bookId, item.itemId
from MediaOrder[books.book] as book,
MediaOrder[items.item] as item
where productId = bookIdQuery results for the above query when sending the media order event as shown earlier are:
| book.bookId | item.itemId |
|---|---|
| B001 | 100001 |
The next example query is a left outer join. It returns all books and their items, and for books without item it returns the book and a null value:
select book.bookId, item.itemId
from MediaOrder[books.book] as book
left outer join
MediaOrder[items.item] as item
on productId = bookIdQuery results for the above query when sending the media order event as shown earlier are:
| book.bookId | item.itemId |
|---|---|
| B001 | 100001 |
| B002 | null |
A full outer join combines the results of both left and right outer joins. The joined table will contain all records from both tables, and fill in null values for missing matches on either side.
This example query is a full outer join, returning all books as well as all items, and filling in null values for book id or item id if no match is found:
select orderId, book.bookId,item.itemId
from MediaOrder[books.book] as book
full outer join
MediaOrder[select orderId, * from items.item] as item
on productId = bookId
order by bookId, item.itemId asc
As in all other continuous queries, aggregation results are cumulative from the time the statement was created.
The following query counts the cumulative number of items in which the product id matches a book id:
select count(*)
from MediaOrder[books.book] as book,
MediaOrder[items.item] as item
where productId = bookIdThe unidirectional keyword in a join indicates to the query engine that aggregation state is not cumulative. The next query counts the number of items in which the product id matches a book id for each event:
select count(*)
from MediaOrder[books.book] as book unidirectional,
MediaOrder[items.item] as item
where productId = bookIdEvent patterns match when an event or multiple events occur that match the pattern's definition. Patterns can also be time-based.
Pattern expressions consist of pattern atoms and pattern operators:
Pattern atoms are the basic building blocks of patterns. Atoms are filter expressions, observers for time-based events and plug-in custom observers that observe external events not under the control of the engine.
Pattern operators control expression lifecycle and combine atoms logically or temporally.
The below table outlines the different pattern atoms available:
Table 5.1. Pattern Atoms
| Pattern Atom | Example |
|---|---|
| Filter expressions specify an event to look for. | StockTick(symbol='ABC', price > 100) |
| Time-based event observers specify time intervals or time schedules. | timer:interval(10 seconds) timer:at(*, 16, *, *, *) |
| Custom plug-in observers can add pattern language syntax for observing application-specific events. | myapplication:myobserver("http://someResource") |
There are 4 types of pattern operators:
Operators that control pattern subexpression repetition: every, [num] and until
Logical operators: and, or, not
Temporal operators that operate on event order: -> (followed-by)
Guards are where-conditions that control the lifecycle of subexpressions. Examples are timer:within. Custom plug-in guards may also be used.
Pattern expressions can be nested arbitrarily deep by including the nested expression(s) in () round parenthesis.
Underlying the pattern matching is a state machine that transitions between states based on arriving events and based on time advancing. A single event or advancing time may cause a reaction in multiple parts of your active pattern state.
This is an example pattern expression that matches on every ServiceMeasurement events in which the value of the latency event property is over 20 seconds, and on every ServiceMeasurement event in which the success property is false. Either one or the other condition must be true for this pattern to match.
every (spike=ServiceMeasurement(latency>20000) or error=ServiceMeasurement(success=false))
In the example above, the pattern expression starts with an every operator to indicate that the pattern should fire for every matching events and not just the first matching event. Within the every operator in round brackets is a nested pattern expression using the or operator. The left hand of the or operator is a filter expression that filters for events with a high latency value. The right hand of the operator contains a filter expression that filters for events with error status. Filter expressions are explained in Section 5.4, “Filter Expressions In Patterns”.
The example above assigned the tags spike and error to the events in the pattern. The tags are important since the engine only places tagged events into the output event(s) that a pattern generates, and that the engine supplies to listeners of the pattern statement. The tags can further be selected in the select-clause of an EPL statement as discussed in Section 4.4.2, “Pattern-based Event Streams”.
Patterns can also contain comments within the pattern as outlined in Section 4.2.2, “Using Comments”.
Pattern statements are created via the EPAdministrator interface. The EPAdministrator interface allows to create pattern statements in two ways: Pattern statements that want to make use of the EPL select clause or any other EPL constructs use the createEPL method to create a statement that specifies one or more pattern expressions. EPL statements that use patterns are described in more detail in Section 4.4.2, “Pattern-based Event Streams”. Use the syntax as shown in below example.
EPAdministrator admin = EPServiceProviderManager.getDefaultProvider().getEPAdministrator();
String eventName = ServiceMeasurement.class.getName();
EPStatement myTrigger = admin.createEPL("select * from pattern [" +
"every (spike=" + eventName + "(latency>20000) or error=" + eventName + "(success=false))]");
Pattern statements that do not need to make use of the EPL select clause or any other EPL constructs can use the createPattern method, as in below example.
EPStatement myTrigger = admin.createPattern( "every (spike=" + eventName + "(latency>20000) or error=" + eventName + "(success=false))");
A pattern may appear anywhere in the from clause of an EPL statement including joins and subqueries. Patterns may therefore be used in combination with the where clause, group by clause, having clause as well as output rate limiting and insert into. In addition, a data window view can be declared onto a pattern.
This example statement demonstrates the idea by selecting a total price per customer over pairs of events (ServiceOrder followed by a ProductOrder event for the same customer id within 1 minute), occuring in the last 2 hours, in which the sum of price is greater then 100, and using a where clause to filter on name:
select a.custId, sum(a.price + b.price)
from pattern [every a=ServiceOrder ->
b=ProductOrder(custId = a.custId) where timer:within(1 min)].win:time(2 hour)
where a.name in ('Repair', b.name)
group by a.custId
having sum(a.price + b.price) > 100When a pattern fires it publishes one or more events to any listeners to the pattern statement. The listener interface is the com.espertech.esper.client.UpdateListener interface.
The example below shows an anonymous implementation of the com.espertech.esper.client.UpdateListener interface. We add the anonymous listener implementation to the myPattern statement created earlier. The listener code simply extracts the underlying event class.
myPattern.addListener(new UpdateListener() {
public void update(EventBean[] newEvents, EventBean[] oldEvents) {
ServiceMeasurement spike = (ServiceMeasurement) newEvents[0].get("spike");
ServiceMeasurement error = (ServiceMeasurement) newEvents[0].get("error");
... // either spike or error can be null, depending on which occurred
... // add more logic here
}
});Listeners receive an array of EventBean instances in the newEvents parameter. There is one EventBean instance passed to the listener for each combination of events that matches the pattern expression. At least one EventBean instance is always passed to the listener.
The properties of each EventBean instance contain the underlying events that caused the pattern to fire, if events have been named in the filter expression via the name=eventType syntax. The property name is thus the name supplied in the pattern expression, while the property type is the type of the underlying class, in this example ServiceMeasurement.
Data can also be obtained from pattern statements via the safeIterator() and iterator() methods on EPStatement (the pull API). If the pattern had fired at least once, then the iterator returns the last event for which it fired. The hasNext() method can be used to determine if the pattern had fired.
if (myPattern.iterator().hasNext()) {
ServiceMeasurement event = (ServiceMeasurement) view.iterator().next().get("alert");
... // some more code here to process the event
}
else {
... // no matching events at this time
}Further, if a data window is defined onto a pattern, the iterator returns the pattern matches according to the data window expiry policy.
This pattern specifies a length window of 10 elements that retains the last 10 matches of A and B events, for use via iterator or for use in a join or subquery:
select * from pattern [every (A or B).win:length(10)
The operators at the top of this table take precedence over operators lower on the table.
Table 5.2. Pattern Operator Precedence
| Precedence | Operator | Description | Example |
|---|---|---|---|
| 1 | unary | every, not | every MyEvent timer:interval(5 min) and not MyEvent |
| 2 | guard postfix | where timer:within (or plug-in pattern guard) | MyEvent where timer:within(1 sec) |
| 3 | repeat | [num], until | [5] MyEvent [1..3] MyEvent until MyOtherEvent |
| 4 | and | and | every (MyEvent and MyOtherEvent) |
| 5 | or | or | every (MyEvent or MyOtherEvent) |
| 6 | followed-by | -> | every (MyEvent -> MyOtherEvent) |
If you are not sure about the precedence, please consider placing parenthesis () around your subexpressions. Parenthesis can also help make expressions easier to read and understand.
Note that we are also providing the EPL grammar as a HTML file as part of the documentation set on the project website.
The following table outlines sample equivalent expressions, with and without the use of parenthesis for subexpressions.
Table 5.3. Equivalent Pattern Expressions
| Expression | Equivalent | Reason |
|---|---|---|
| every A or B | (every A) or B | The every operator has higher precedence then the or operator. |
| every A -> B or C | (every A) -> (B or C) | The or operator has higher precedence then the followed-by operator. |
| A and B or C | (A and B) or C | The and operator has higher precedence then the or operator. |
| every A where timer:within(5) | every (A where timer:within(5)) | The every operator has higher precedence then the timer:within guard postfix. |
| A -> B until C -> D | A -> (B until C) -> D | The until operator has higher precedence then the followed-by operator. |
| [5] A or B | ([5] A) or B | The [num] repeat operator has higher precedence then the or operator. |
The simplest form of filter is a filter for events of a given type without any conditions on the event property values. This filter matches any event of that type regardless of the event's properties. The example below is such a filter. Note that this event pattern would stop firing as soon as the first RfidEvent is encountered.
com.mypackage.myevents.RfidEvent
To make the event pattern fire for every RfidEvent and not just the first event, use the every keyword.
every com.mypackage.myevents.RfidEvent
The example above specifies the fully-qualified Java class name as the event type. Via configuration, the event pattern above can be simplified by using the name that has been defined for the event type.
every RfidEvent
Interfaces and superclasses are also supported as event types. In the below example IRfidReadable is an interface class, and the statement matches any event that implements this interface:
every org.myorg.rfid.IRfidReadable
The filtering criteria to filter for events with certain event property values are placed within parenthesis after the event type name:
RfidEvent(category="Perishable")
All expressions can be used in filters, including static method invocations that return a boolean value:
RfidEvent(com.mycompany.MyRFIDLib.isInRange(x, y) or (x<0 and y < 0))
Filter expressions can be separated via a single comma ','. The comma represents a logical AND between expressions:
RfidEvent(zone=1, category=10) ...is equivalent to... RfidEvent(zone=1 and category=10)
The following set of operators are highly optimized through indexing and are the preferred means of filtering high-volume event streams:
equals =
not equals !=
comparison operators < , > , >=, <=
ranges
use the between keyword for a closed range where both endpoints are included
use the in keyword and round () or square brackets [] to control how endpoints are included
for inverted ranges use the not keyword and the between or in keywords
list-of-values checks using the in keyword or the not in keywords followed by a comma-separated list of values
At compile time as well as at run time, the engine scans new filter expressions for subexpressions that can be indexed. Indexing filter values to match event properties of incoming events enables the engine to match incoming events faster. The above list of operators represents the set of operators that the engine can best convert into indexes. The use of comma or logical and in filter expressions does not impact optimizations by the engine.
For more information on filters please see Section 4.4.1, “Filter-based Event Streams”. Contained-event selection on filters in patterns is further described in Section 4.19, “Contained-Event Selection”.
Filter criteria can also refer to events matching prior named events in the same expression. Below pattern is an example in which the pattern matches once for every RfidEvent that is preceded by an RfidEvent with the same asset id.
every e1=RfidEvent -> e2=RfidEvent(assetId=e1.assetId)
The syntax shown above allows filter criteria to reference prior results by specifying the event name tag of the prior event, and the event property name. The tag names in the above example were e1 and e2. This syntax can be used in all filter operators or expressions including ranges and the in set-of-values check:
every e1=RfidEvent -> e2=RfidEvent(MyLib.isInRadius(e1.x, e1.y, x, y) and zone in (1, e1.zone))
An arriving event changes the truth value of all expressions that look for the event. Consider the pattern as follows:
every (RfidEvent(zone > 1) and RfidEvent(zone < 10))
The pattern above is satisfied as soon as only one event with zone in the interval [2, 9] is received.
The every operator indicates that the pattern subexpression should restart when the subexpression qualified by the every keyword evaluates to true or false. Without the every operator the pattern subexpression stops when the pattern subexpression evaluates to true or false.
As a side note, please be aware that a single invocation to the UpdateListener interface may deliver multiple events in one invocation, since the interface accepts an array of values.
Thus the every operator works like a factory for the pattern subexpression contained within. When the pattern subexpression within it fires and thus quits checking for events, the every causes the start of a new pattern subexpression listening for more occurrences of the same event or set of events.
Every time a pattern subexpression within an every operator turns true the engine starts a new active subexpression looking for more event(s) or timing conditions that match the pattern subexpression. If the every operator is not specified for a subexpression, the subexpression stops after the first match was found.
This pattern fires when encountering an A event and then stops looking.
A
This pattern keeps firing when encountering A events, and doesn't stop looking.
every A
When using every operator with the -> followed-by operator, each time the every operator restarts it also starts a new subexpression instance looking for events in the followed-by subexpression.
Let's consider an example event sequence as follows.
A1 B1 C1 B2 A2 D1 A3 B3 E1 A4 F1 B4
Table 5.4. 'Every' operator examples
| Example | Description |
|---|---|
every ( A -> B ) | Detect an A event followed by a B event. At the time when B occurs the pattern matches, then the pattern matcher restarts and looks for the next A event.
|
every A -> B | The pattern fires for every A event followed by a B event.
|
A -> every B | The pattern fires for an A event followed by every B event.
|
every A -> every B | The pattern fires for every A event followed by every B event.
|
The examples show that it is possible that a pattern fires for multiple combinations of events that match a pattern expression. Each combination is posted as an EventBean instance to the update method in the UpdateListener implementation.
Let's consider the every operator in conjunction with a subexpression that matches 3 events that follow each other:
every (A -> B -> C)
The pattern first looks for A events. When an A event arrives, it looks for a B event. After the B event arrives, the pattern looks for a C event. Finally, when the C event arrives the pattern fires. The engine then starts looking for an A event again.
Assume that between the B event and the C event a second A2 event arrives. The pattern would ignore the A2 event entirely since it's then looking for a C event. As observed in the prior example, the every operator restarts the subexpression A -> B -> C only when the subexpression fires.
In the next statement the every operator applies only to the A event, not the whole subexpression:
every A -> B -> C
This pattern now matches for each A event that is followed by a B event and then a C event, regardless of when the A event arrives. Note that for each A event that arrives the pattern engine starts a new subexpression looking for a B event and then a C event, outputting each combination of matching events.
As the introduction of the every operator states, the operator starts new subexpression instances and can cause multiple matches to occur for a single arriving event.
New subexpressions also take a very small amount of system resources and thereby your application should carefully consider when subexpressions must end when designing patterns. Use the timer:within construct and the and not constructs to end active subexpressions. The data window onto a pattern stream does not serve to limit pattern subexpression lifetime.
Lets look at a concrete example. Consider the following sequence of events arriving:
A1 A2 B1
This pattern matches on arrival of B1 and outputs two events (an array of length 2 if using a listener). The two events are the combinations {A1, B1} and {A2, B1}:
every a=A -> b=B
The and not operators are used to end an active subexpression.
The next pattern matches on arrival of B1 and outputs only the last A event which is the combination {A2, B1}:
every a=A -> (b=B and not A)
The and not operators cause the subexpression looking for {A1, B?} to end when A2 arrives.
Similarly, in the pattern below the engine starts a new subexpression looking for a B event every 1 second. After 5 seconds there are 5 subexpressions active looking for a B event and 5 matches occur at once if a B event arrives after 5 seconds.
every timer:interval(1 sec) -> b=B
Again the and not operators can end subexpressions that are not intended to match any longer:
every timer:interval(1 sec) -> (b=B and not timer:interval(1 sec)) // equivalent to every timer:interval(1 sec) -> (b=B where timer:within(1 sec))
In this example we consider a generic pattern in which the pattern must match for each A event followed by a B event and followed by a C event, in which both the B event and the C event must arrive within 1 hour of the A event. The first approach to the pattern is as follows:
every A -> (B -> C) where timer:within(1 hour)
Consider the following sequence of events arriving:
A1 A2 B1 C1 B2 C2
First, the pattern as above never stops looking for A events since the every operator instructs the pattern to keep looking for A events.
When A1 arrives, the pattern starts a new subexpression that keeps A1 in memory and looks for any B event. At the same time, it also keeps looking for more A events.
When A2 arrives, the pattern starts a new subexpression that keeps A2 in memory and looks for any B event. At the same time, it also keeps looking for more A events.
After the arrival of A2, there are 3 subexpressions active:
The first active subexpression with A1 in memory, looking for any B event.
The second active subexpression with A2 in memory, looking for any B event.
A third active subexpression, looking for the next A event.
In the pattern above, we have specified a 1-hour lifetime for subexpressions looking for B and C events. Thus, if no B and no C event arrive within 1 hour after A1, the first subexpression goes away. If no B and no C event arrive within 1 hour after A2, the second subexpression goes away. The third subexpression however stays around looking for more A events.
The pattern as shown above thus matches on arrival of C1 for combination {A1, B1, C1} and for combination {A2, B1, C1}, provided that B1 and C1 arrive within an hour of A1 and A2.
You may now ask how to match on {A1, B1, C1} and {A2, B2, C2} instead, since you may need to correlate on a given property.
The pattern as discussed above matches every A event followed by the first B event followed by the next C event, and doesn't specifically qualify the B or C events to look for based on the A event. To look for specific B and C events in relation to a given A event, the correlation must use one or more of the properties of the A event, such as the "id" property:
every a=A -> (B(id=a.id -> C(id=a.id)) where timer:within(1 hour)
The pattern as shown above thus matches on arrival of C1 for combination {A1, B1, C1} and on arrival of C2 for combination {A2, B2, C2}.
This example looks at temperature sensor events named Sample. The pattern detects when 3 sensor events indicate a temperature of more then 50 degrees uninterrupted within 90 seconds of the first event, considering events for the same sensor only.
every sample=Sample(temp > 50) -> ( (Sample(sensor=sample.sensor, temp > 50) and not Sample(sensor=sample.sensor, temp <= 50)) -> (Sample(sensor=sample.sensor, temp > 50) and not Sample(sensor=sample.sensor, temp <= 50)) ) where timer:within(90 seconds))
The pattern starts a new subexpression in the round braces after the first followed-by operator for each time a sensor indicated more then 50 degrees. Each subexpression then lives a maximum of 90 seconds. Each subexpression ends if a temperature of 50 degress or less is encountered for the same sensor. Only if 3 temperature events in a row indicate more then 50 degrees, and within 90 seconds of the first event, and for the same sensor, does this pattern fire.
The repeat operator fires when a pattern subexpression evaluates to true a given number of times. The synopsis is as follows:
[match_count] repeating_subexpr
The repeat operator is very similar to the every operator in that it restarts the repeating_subexpr pattern subexpression up to a given number of times.
match_count is a positive number that specifies how often the repeating_subexpr pattern subexpression must evaluate to true before the repeat expression itself evaluates to true.
For example, this pattern fires when the last of five A events arrives:
[5] A
Parenthesis must be used for nested pattern subexpressions. This pattern fires when the last of a total of any five A or B events arrives:
[5] (A or B)
Without parenthesis the pattern semantics change, according to the operator precedence described earlier. This pattern fires when the last of a total of five A events arrives or a single B event arrives, whichever happens first:
[5] A or B
Tags can be used to name events in filter expression of pattern subexpressions. The next pattern looks for an A event followed by a B event, and a second A event followed by a second B event. The output event provides indexed and array properties of the same name:
[2] (a=A -> b=B)
Using tags with repeat is further described in Section 5.5.3.6, “Tags and the Repeat Operator”.
Consider the following pattern that demonstrates the behavior when a pattern subexpression becomes permanently false:
[2] (A and not C)
In the case where a C event arrives before 2 A events arrive, the pattern above becomes permanently false.
Lets add an every operator to restart the pattern and thus keep matching for all pairs of A events that arrive without a C event in between each pair:
every [2] (A and not C)
The repeat until operator provides additional control over repeated matching.
The repeat until operator takes an optional range, a pattern subexpression to repeat, the until keyword and a second pattern subexpression that ends the repetition. The synopsis is as follows:
[range] repeated_pattern_expr until end_pattern_expr
Without a range, the engine matches the repeated_pattern_expr pattern subexpression until the end_pattern_expr evaluates to true, at which time the expression turns true.
An optional range can be used to indicate the minimum number of times that the repeated_pattern_expr pattern subexpression must become true.
The optional range can also specify a maximum number of times that repeated_pattern_expr pattern subexpression evaluates to true and retains tagged events. When this number is reached, the engine stops the repeated_pattern_expr pattern subexpression.
In the unbound repeat, without a range, the engine matches the repeated_pattern_expr pattern subexpression until the end_pattern_expr evaluates to true, at which time the expression turns true. The synopsis is:
repeated_pattern_expr until end_pattern_expr
This is a pattern that keeps looking for A events until a B event arrives:
A until B
Nested pattern subexpressions must be placed in parenthesis since the until operator has precedence over most operators. This example collects all A or B events for 10 seconds and places events received in indexed properties 'a' and 'b':
(a=A or b=B) until timer:interval(10 sec)
The synopsis for the optional range qualifier is:
[ [low_endpoint] .. [high_endpoint] ]
low_endpoint is an optional number that appears on the left of two dots (..), after which follows an optional high_endpoint number.
A range thus consists of a low_endpoint and a high_endpoint in square brackets and separated by dot (.) characters. Both endpoint values are optional but either one or both must be supplied. The low_endpoint can be omitted to denote a range that starts at zero. The high_endpoint can be omitted to denote an open-ended range.
Some examples for valid ranges might be:
[3..10] [..3] // range starts at zero [2..] // open-ended range
The low_endpoint, if specified, defines the minimum number of times that the repeated_pattern_expr pattern subexpression must become true in order for the expression to become true.
The high_endpoint, if specified, is the maximum number of times that the repeated_pattern_expr pattern subexpression becomes true. If the number is reached, the engine stops the repeated_pattern_expr pattern subexpression.
In all cases, only at the time that the end_pattern_expr pattern subexpression evaluates to true does the expression become true. If end_pattern_expr pattern subexpression evaluates to false, then the expression evaluates to false.
An open-ended range specifies only a low endpoint and not a high endpoint.
Consider the following pattern which requires at least three A events to match:
[3..] A until B
In the pattern above, if a B event arrives before 3 A events occurred, the expression ends and evaluates to false.
A high-endpoint range specifies only a high endpoint and not a low endpoint.
In this sample pattern the engine will be looking for a maximum of 3 A events. The expression turns true as soon as a single B event arrives regardless of the number of A events received:
[..3] A until B
The next pattern matches when a C or D event arrives, regardless of the number of A or B events that occurred:
[..3] (a=A or b=B) until (c=C or d=D)
In the pattern above, if more then 3 A or B events arrive, the pattern stops looking for additional A or B events. The 'a' and 'b' tags retain only the first 3 (combined) matches among A and B events. The output event contains these tagged events as indexed properties.
A bounded range specifies a low endpoint and a high endpoint.
The next pattern matches after at least one A event arrives upon the arrival of a single B event:
[1..3] a=A until B
If a B event arrives before the first A event, then the pattern does not match. Only the first 3 A events are returned by the pattern.
The tags assigned to events in filter subexpressions within a repeat operator are available for use in filter expressions and also in any EPL clause.
This sample pattern matches 2 A events followed by a B event. Note the filter on B events: only a B event that has a value for the "beta" property that equals any of the "id" property values of the two A events is considered:
[2] A -> B(beta in (a[0].id, a[1].id))
The next EPL statement returns pairs of A events:
select a, a[0], a[0].id, a[1], a[1].id from pattern [ every [2] a=A ]
The select clause of the statement above showcases different ways of accessing tagged events:
The tag itself can be used to select an array of underlying events. For example, the 'a' expression above returns an array of underlying events of event type A.
The tag as an indexed property returns the underlying event at that index. For instance, the 'a[0]' expression returns the first underlying A event, or null if no such A event was matched by the repeat operator.
The tag as a nested, indexed property returns a property of the underlying event at that index. For example, the 'a[1].id' expression returns the 'id' property value of the second A event, or null if no such second A event was matched by the repeat operator.
Similar to the Java && operator the and operator requires both nested pattern expressions to turn true before the whole expression turns true (a join pattern).
This pattern matches when both an A event and a B event arrive, at the time the last of the two events arrive:
A and B
This pattern matches on any sequence of an A event followed by a B event and then a C event followed by a D event, or a C event followed by a D and an A event followed by a B event:
(A -> B) and (C -> D)
Note that in an and pattern expression it is not possible to correlate events based on event property values. For example, this is an invalid pattern:
// This is NOT valid a=A and B(id = a.id)
The above expression is invalid as it relies on the order of arrival of events, however in an and expression the order of events is not specified and events fulfill an and condition in any order. The above expression can be changed to use the followed-by operator:
// This is valid a=A -> B(id = a.id) // another example using 'and'... a=A -> (B(id = a.id) and C(id = a.id))
Consider a pattern that looks for the same event:
A and A
The pattern above fires when a single A event arrives. The first arriving A event triggers a state transition in both the left and the right hand side expression.
In order to match after two A events arrive in any order, there are two options to express this pattern. The followed-by operator is one option and the repeat operator is the second option, as the next two patterns show:
A -> A // ... or ... [2] A
Similar to the Java “||” operator the or operator requires either one of the expressions to turn true before the whole expression turns true.
Look for either an A event or a B event. As always, A and B can itself be nested expressions as well.
A or B
Detect all stock ticks that are either above or below a threshold.
every (StockTick(symbol='IBM', price < 100) or StockTick(symbol='IBM', price > 105)
The not operator negates the truth value of an expression. Pattern expressions prefixed with not are automatically defaulted to true upon start, and turn permanently false when the expression within turns true.
The not operator is generally used in conjunction with the and operator or subexpressions as the below examples show.
This pattern matches only when an A event is encountered followed by a B event but only if no C event was encountered before either an A event and a B event, counting from the time the pattern is started:
(A -> B) and not C
Assume we'd like to detect when an A event is followed by a D event, without any B or C events between the A and D events:
A -> (D and not (B or C))
It may help your understanding to discuss a pattern that uses the or operator and the not operator together:
a=A -> (b=B or not C)
In the pattern above, when an A event arrives then the engine starts the subexpression B or not C. As soon as the subexpression starts, the not operator turns to true. The or expression turns true and thus your listener receives an invocation providing the A event in the property 'a'. The subexpression does not end and continues listening for B and C events. Upon arrival of a B event your listener receives a second invocation. If instead a C event arrives, the not turns permanently false however that does not affect the or operator (but would end an and operator).
The followed by -> operator specifies that first the left hand expression must turn true and only then is the right hand expression evaluated for matching events.
Look for an A event and if encountered, look for a B event. As always, A and B can itself be nested event pattern expressions.
A -> B
This is a pattern that fires when 2 status events indicating an error occur one after the other.
StatusEvent(status='ERROR') -> StatusEvent(status='ERROR')
Guards are where-conditions that control the lifecycle of subexpressions. Custom guard functions can also be used. The section Chapter 11, Extension and Plug-in outlines guard plug-in development in greater detail.
The pattern guard where-condition has no relationship to the EPL where clause that filters sets of events.
Take as an example the following pattern expression:
MyEvent where timer.within(10 sec)
In this pattern the timer:within guard controls the subexpression that is looking for MyEvent events. The guard terminates the subexpression looking for MyEvent events after 10 seconds after start of the pattern. Thus the pattern alerts only once when the first MyEvent event arrives within 10 seconds after start of the pattern.
The every keyword requires additional discussion since it also controls subexpression lifecycle. Let's add the every keyword to the example pattern:
every MyEvent where timer.within(10 sec)
The difference to the pattern without every is that each MyEvent event that arrives now starts a new subexpression, including a new guard, looking for a further MyEvent event. The result is that, when a MyEvent arrives within 10 seconds after pattern start, the pattern execution will look for the next MyEvent event to arrive within 10 seconds after the previous one.
By placing parentheses around the every keyword and its subexpression, we can have the every under the control of the guard:
(every MyEvent) where timer.within(10 sec)
In the pattern above, the guard terminates the subexpression looking for all MyEvent events after 10 seconds after start of the pattern. This pattern alerts for all MyEvent events arriving within 10 seconds after pattern start, and then stops.
Guards do not change the truth value of the subexpression of which the guard controls the lifecycle, and therefore do not cause a restart of the subexpression when used with the every operator. For example, the next pattern stops returning matches after 10 seconds unless a match occurred within 10 seconds after pattern start:
every ( (A and B) where timer.within(10 sec) )
The timer:within guard acts like a stopwatch. If the associated pattern expression does not turn true within the specified time period it is stopped and permanently false.
The timer:within guard takes a time period (see Section 4.2.1, “Specifying Time Periods”) or an expression providing a number of seconds as a parameter. The seconds interval expression may contain references to properties of prior events in the same pattern as well as variables and substitution parameters.
This pattern fires if an A event arrives within 5 seconds after statement creation.
A where timer:within (5 seconds)
This pattern fires for all A events that arrive within 5 seconds. After 5 seconds, this pattern stops matching even if more A events arrive.
(every A) where timer:within (5 seconds)
This pattern is similar to the first pattern but here every time A arrives within 5 seconds, the pattern begins looking for A for another 5 seconds. As long as A events arrive within 5 seconds after the last A, the pattern does not stop matching.
every (A where timer:within (5 sec))
This pattern matches for any one A or B event in the next 5 seconds.
( A or B ) where timer:within (5 sec)
This pattern matches for any 2 errors that happen 10 seconds within each other.
every (StatusEvent(status='ERROR') -> StatusEvent(status='ERROR') where timer:within (10 sec))
The following guards are equivalent:
timer:within(2 minutes 5 seconds) timer:within(125 sec) timer:within(125)
The timer:within guard may be parameterized by an expression that contains one or more references to properties of prior events in the same pattern.
As a simple example, this pattern matches every A event followed by a B event that arrives within delta seconds after the A event:
every a=A -> b=B where timer:within (a.delta seconds)
Herein A event is assumed to have a delta property that provides the number of seconds to wait for B events. Each arriving A event may have a different value for delta and the guard is therefore parameterized dynamically based on the prior A event received.
When multiple events accumulate, for example when using the match-until or repeat pattern elements, an index must be provided:
[2] a=A -> b=B where timer:within (a[0].delta + a[1].delta)
The above pattern matches after 2 A events arrive followed by a B event within a time interval after the A event that is defined by the sum of the delta properties of both A events.
Filter atoms have been described in section Section 5.4, “Filter Expressions In Patterns”.
Observers observe time-based events for which the thread-of-control originates by the engine timer or external timer event. Custom observers can also be developed that observe timer events or other engine-external application events such as a file-exists check. The section Chapter 11, Extension and Plug-in outlines observer plug-in development in greater detail.
The timer:interval observer waits for the defined time before the truth value of the observer turns true. The observer takes a time period (see Section 4.2.1, “Specifying Time Periods”) as a parameter, or an expression that returns the number of seconds.
The observer may be parameterized by an expression that contains one or more references to properties of prior events in the same pattern, or may also reference variables, substitution parameters or any other expression returning a numeric value.
After an A event arrived wait 10 seconds then indicate that the pattern matches.
A -> timer:interval(10 seconds)
The pattern below fires every 20 seconds.
every timer:interval(20 sec)
The next example pattern fires for every A event that is not followed by a B event within 60 seconds after the A event arrived. The B event must have the same "id" property value as the A event.
every a=A -> (timer:interval(60 sec) and not B(id=a.id))
Consider the next example, which assumes that the A event has a property waittime:
every a=A -> (timer:interval(a.waittime + 2) and not B(id=a.id))
In the above pattern the logic waits for 2 seconds plus the number of seconds provided by the value of the waittime property of the A event.
The timer:at observer is similar in function to the Unix “crontab” command. At a specified time the expression turns true. The at operator can also be made to pattern match at regular intervals by using an every operator in front of the timer:at operator.
The syntax is: timer:at (minutes, hours, days of month, months, days of week [, seconds]).
The value for seconds is optional. Each element allows wildcard * values. Ranges can be specified by means of lower bounds then a colon ‘:’ then the upper bound. The division operator */x can be used to specify that every xth value is valid. Combinations of these operators can be used by placing these into square brackets([]).
The timer:at observer may also be parameterized by an expression that contains one or more references to properties of prior events in the same pattern, or may also reference variables, substitution parameters or any other expression returning a numeric value. The frequency division operator */x and parameters lists within brackets([]) are an exception: they may only contain variables, substitution parameters or numeric values.
This expression pattern matches every 5 minutes past the hour.
every timer:at(5, *, *, *, *)
The below timer:at pattern matches every 15 minutes from 8am to 5:45pm (hours 8 to 17 at 0, 15, 30 and 45 minutes past the hour) on even numbered days of the month as well as on the first day of the month.
timer:at (*/15, 8:17, [*/2, 1], *, *)
The below table outlines the fields, valid values and keywords available for each field:
Table 5.5. Properties offered by sample statement aggregating price
| Field Name | Mandatory? | Allowed Values | Additional Keywords |
|---|---|---|---|
| Minutes | yes | 0 - 59 | |
| Hours | yes | 0 - 23 | |
| Days Of Month | yes | 1 - 31 | last, weekday, lastweekday |
| Months | yes | 1 - 12 | |
| Days Of Week | yes | 0 (Sunday) - 6 (Saturday) | last |
| Seconds | no | 0 - 59 |
The keyword last used in the days-of-month field means the last day of the month (current month). To specify the last day of another month, a value for the month field has to be provided. For example: timer:at(*, *, last,2,*) is the last day of February.
The last keyword in the day-of-week field by itself simply means Saturday. If used in the day-of-week field after another value, it means "the last xxx day of the month" - for example "5 last" means "the last friday of the month". So the last Friday of the current month will be: timer:at(*, *, *, *, 5 last). And the last Friday of June: timer:at(*, *, *, 6, 5 last).
The keyword weekday is used to specify the weekday (Monday-Friday) nearest the given day. Variant could include month like in: timer:at(*, *, 30 weekday, 9, *) which is Friday September 28th (no jump over month).
The keyword lastweekday is a combination of two parameters, the last and the weekday keywords. A typical example could be: timer:at(*, *, *, lastweekday, 9, *) which will define Friday September 28th (example year is 2007).
When using timer:at with the every operator the crontab-like timer computes the next time at which the timer should fire based on the specification and the current time. When using every, the current time is the time the timer fired or the statement start time if the timer has not fired once.
For example, this pattern fires every 1 minute starting at 1:00pm and ending at 1:59pm, every day:
every timer:at(*, 13, *, *, *)
Assume the above statement gets started at 1:05pm and 20 seconds. In such case the above pattern fires every 1 minute starting at 1:06pm and ending at 1:59pm for that day and 1:00pm to 1:59pm every following day.
To get the pattern to fire only once at 1pm every day, explicitly specify the minute to start. The pattern below fires every day at 1:00pm:
every timer:at(0, 13, *, *, *)
By specifying a second resolution the timer can be made to fire every second, for instance:
every timer:at(*, *, *, *, *, *)
Esper arithmetic and logical operator precedence follows Java standard arithmetic and logical operator precedence.
The below table outlines the logical and comparison operators available.
Table 6.2. Syntax and results of logical and comparison operators
| Operator | Description |
|---|---|
| NOT | Returns true if the following condition is false, returns false if it is true. |
| OR | Returns true if either component condition is true, returns false if both are false. |
| AND | Returns true if both component conditions are true, returns false if either is false. |
| =, !=, <, > <=, >=, | Comparison. |
The { and } curly braces are array definition operators following the Java array initialization syntax. Arrays can be useful to pass to user-defined functions or to select array data in a select clause.
Array definitions consist of zero or more expressions within curly braces. Any type of expression is allowed within array definitions including constants, arithmetic expressions or event properties. This is the syntax of an array definition:
{ [expression [,expression...]] }
Consider the next statement that returns an event property named actions. The engine populates the actions property as an array of java.lang.String values with a length of 2 elements. The first element of the array contains the observation property value and the second element the command property value of RFIDEvent events.
select {observation, command} as actions from RFIDEventThe engine determines the array type based on the types returned by the expressions in the array definiton. For example, if all expressions in the array definition return integer values then the type of the array is java.lang.Integer[]. If the types returned by all expressions are compatible number types, such as integer and double values, the engine coerces the array element values and returns a suitable type, java.lang.Double[] in this example. The type of the array returned is Object[] if the types of expressions cannot be coerced or return object values. Null values can also be used in an array definition.
Arrays can come in handy for use as parameters to user-defined functions:
select * from RFIDEvent where Filter.myFilter(zone, {1,2,3})The in keyword determines if a given value matches any value in a list. The syntax of the keyword is:
test_expression [not] in (expression [,expression...] )
The test_expression is any valid expression. The keyword is followed by a list of expressions to test for a match. The optional not keyword specifies that the result of the predicate be negated.
The result of an in expression is of type Boolean. If the value of test_expression is equal to any expression from the comma-separated list, the result value is true. Otherwise, the result value is false.
The next example shows how the in keyword can be applied to select certain command types of RFID events:
select * from RFIDEvent where command in ('OBSERVATION', 'SIGNAL')The statement is equivalent to:
select * from RFIDEvent where command = 'OBSERVATION' or command = 'SIGNAL'
Expression may also return an array, a java.util.Collection or a java.util.Map. Thus event properties that are lists, sets or maps may provide values to compare against test_expression.
All expressions must be of the same type or a compatible type to test_expression. The in keyword may coerce number values to compatible types. If expression returns an array, then the component type of the array must be compatible, unless the component type of the array is Object.
If expression returns an array of component type Object, the operation compares each element of the array, applying equals semantics.
If expression returns a Collection, the operation determines if the collection contains the value returned by test_expression, applying contains semantics.
If expression returns a Map, the operation determines if the map contains the key value returned by test_expression, applying containsKey semantics.
Constants, arrays, Collection and Map expressions or event properties can be used combined.
For example, and assuming a property named 'mySpecialCmdList' exists that contains a list of command strings:
select * from RFIDEvent where command in ( 'OBSERVATION', 'SIGNAL', mySpecialCmdList)
When using prepared statements and substitution parameters with the in keyword, make sure to retain the parenthesis. Substitution values may also be arrays, Collection and Map values:
test_expression [not] in (? [,?...] )Note that if there are no successes and at least one right-hand row yields null for the operator's result, the result of the any construct will be null, not false. This is in accordance with SQL's normal rules for Boolean combinations of null values.
The between keyword specifies a range to test. The syntax of the keyword is:
test_expression [not] between begin_expression and end_expression
The test_expression is any valid expression and is the expression to test for in the range defined by begin_expression and end_expression. The not keyword specifies that the result of the predicate be negated.
The result of a between expression is of type Boolean. If the value of test_expression is greater then or equal to the value of begin_expression and less than or equal to the value of end_expression, the result is true.
The next example shows how the between keyword can be used to select events with a price between 55 and 60 (inclusive).
select * from StockTickEvent where price between 55 and 60
The equivalent expression without between is:
select * from StockTickEvent where price >= 55 and price <= 60
And also equivalent to:
select * from StockTickEvent where price between 60 and 55
The like keyword provides standard SQL pattern matching. SQL pattern matching allows you to use '_' to match any single character and '%' to match an arbitrary number of characters (including zero characters). In Esper, SQL patterns are case-sensitive by default. The syntax of like is:
test_expression [not] like pattern_expression [escape string_literal]
The test_expression is any valid expression yielding a String-type or a numeric result. The optional not keyword specifies that the result of the predicate be negated. The like keyword is followed by any valid standard SQL pattern_expression yielding a String-typed result. The optional escape keyword signals the escape character to escape '_' and '%' values in the pattern.
The result of a like expression is of type Boolean. If the value of test_expression matches the pattern_expression, the result value is true. Otherwise, the result value is false.
An example for the like keyword is below.
select * from PersonLocationEvent where name like '%Jack%'
The escape character can be defined as follows. In this example the where-clause matches events where the suffix property is a single '_' character.
select * from PersonLocationEvent where suffix like '!_' escape '!'
The regexp keyword is a form of pattern matching based on regular expressions implemented through the Java java.util.regex package. The syntax of regexp is:
test_expression [not] regexp pattern_expression
The test_expression is any valid expression yielding a String-type or a numeric result. The optional not keyword specifies that the result of the predicate be negated. The regexp keyword is followed by any valid regular expression pattern_expression yielding a String-typed result.
The result of a regexp expression is of type Boolean. If the value of test_expression matches the regular expression pattern_expression, the result value is true. Otherwise, the result value is false.
An example for the regexp keyword is below.
select * from PersonLocationEvent where name regexp '*Jack*'
The any operator is true if the expression returns true for one or more of the values returned by a list of expressions including array, Collection and Map values.
The synopsis for the any keyword is as follows:
expression operator any (expression [,expression...] )
The left-hand expression is evaluated and compared to each expression result using the given operator, which must yield a Boolean result. The result of any is "true" if any true result is obtained. The result is "false" if no true result is found (including the special case where the expressions are collections that return no rows).
The operator can be any of the following values: =, !=, <>, <, <=, >, >=.
The some keyword is a synonym for any. The in construct is equivalent to = any.
Expression may also return an array, a java.util.Collection or a java.util.Map. Thus event properties that are lists, sets or maps may provide values to compare against.
All expressions must be of the same type or a compatible type. The any keyword coerces number values to compatible types. If expression returns an array, then the component type of the array must be compatible, unless the component type of the array is Object.
If expression returns an array, the operation compares each element of the array.
If expression returns a Collection, the operation determines if the collection contains the value returned by the left-hand expression, applying contains semantics. When using relationship operators <, <=, >, >= the operator applies to each element in the collection, and non-numeric elements are ignored.
If expression returns a Map, the operation determines if the map contains the key value returned by the left-hand expression, applying containsKey semantics. When using relationship operators <, <=, >, >= the operator applies to each key in the map, and non-numeric map keys are ignored.
Constants, arrays, Collection and Map expressions or event properties can be used combined.
The next statement demonstrates the use of the any operator:
select * from ProductOrder where category != any (categoryArray)
The above query selects ProductOrder event that have a category field and a category array, and returns only those events in which the category value is not in the array.
Note that if there are no successes and at least one right-hand row yields null for the operator's result, the result of the any construct will be null, not false. This is in accordance with SQL's normal rules for Boolean combinations of null values.
The all operator is true if the expression returns true for all of the values returned by a list of expressions including array, Collection and Map values.
The synopsis for the all keyword is as follows:
expression operator all (expression [,expression...] )
The left-hand expression is evaluated and compared to each expression result using the given operator, which must yield a Boolean result. The result of all is "true" if all rows yield true (including the special case where the expressions are collections that returns no rows). The result is "false" if any false result is found. The result is null if the comparison does not return false for any row, and it returns null for at least one row.
The operator can be any of the following values: =, !=, <>, <, <=, >, >=.
The not in construct is equivalent to != all.
Expression may also return an array, a java.util.Collection or a java.util.Map. Thus event properties that are lists, sets or maps may provide values to compare against.
All expressions must be of the same type or a compatible type. The all keyword coerces number values to compatible types. If expression returns an array, then the component type of the array must be compatible, unless the component type of the array is Object.
If expression returns an array, the operation compares each element of the array.
If expression returns a Collection, the operation determines if the collection contains the value returned by the left-hand expression, applying contains semantics. When using relationship operators <, <=, >, >= the operator applies to each element in the collection, and non-numeric elements are ignored.
If expression returns a Map, the operation determines if the map contains the key value returned by the left-hand expression, applying containsKey semantics. When using relationship operators <, <=, >, >= the operator applies to each key in the map, and non-numeric map keys are ignored.
Constants, arrays, Collection and Map expressions or event properties can be used combined.
The next statement demonstrates the use of the all operator:
select * from ProductOrder where category = all (categoryArray)
The above query selects ProductOrder event that have a category field and a category array, and returns only those events in which the category value matches all values in the array.
Single-row functions return a single value for every single result row generated by your statement. These functions can appear anywhere where expressions are allowed.
Esper allows static Java library methods as single-row functions, and also features built-in single-row functions. In addition, Esper allows instance method invocations on named streams.
Esper auto-imports the following Java library packages:
java.lang.*
java.math.*
java.text.*
java.util.*
Thus Java static library methods can be used in all expressions as shown in below example:
select symbol, Math.round(volume/1000) from StockTickEvent.win:time(30 sec)
In general, arbitrary Java class names have to be fully qualified (e.g. java.lang.Math) but Esper provides a mechanism for user-controlled imports of classes and packages as outlined in Section 10.4.5, “Class and package imports”.
The below table outlines the built-in single-row functions available.
Table 7.1. Syntax and results of single-row functions
| Single-row Function | Result |
|---|---|
case value when compare_value then result [when compare_value then result ...] [else result] end | Returns result where the first value equals compare_value. |
case when condition then result [when condition then result ...] [else result] end | Returns the result for the first condition that is true. |
cast(expression, type_name) | Casts the result of an expression to the given type. |
coalesce(expression, expression [, expression ...]) | Returns the first non-null value in the list, or null if there are no non-null values. |
current_timestamp[()] | Returns the current engine time as a long millisecond value. Reserved keyword with optional parenthesis. |
exists(dynamic_property_name) | Returns true if the dynamic property exists for the event, or false if the property does not exist. |
instanceof(expression, type_name [, type_name ...]) | Returns true if the expression returns an object whose type is one of the types listed. |
max(expression, expression [, expression ...]) | Returns the highest numeric value among the 2 or more comma-separated expressions. |
min(expression, expression [, expression ...]) | Returns the lowest numeric value among the 2 or more comma-separated expressions. |
prev(expression, event_property) | Returns a property value of a previous event, relative to the event order within a data window |
prior(integer, event_property) | Returns a property value of a prior event, relative to the natural order of arrival of events |
The case control flow function has two versions. The first version takes a value and a list of compare values to compare against, and returns the result where the first value equals the compare value. The second version takes a list of conditions and returns the result for the first condition that is true.
The return type of a case expression is the compatible aggregated type of all return values.
The example below shows the first version of a case statement. It has a String return type and returns the value 'one'.
select case 1 when 1 then 'one' when 2 then 'two' else 'more' end from ...
The second version of the case function takes a list of conditions. The next example has a Boolean return type and returns the boolean value true.
select case when 1>0 then true else false end from ...
The cast function casts the return type of an expression to a designated type. The function accepts two parameters: The first parameter is the property name or expression that returns the value to be casted. The second parameter is the type to cast to.
Valid parameters for the second (type) parameter are:
Any of the Java built-in types: int, long, byte, short, char, double, float, string, BigInteger, BigDecimal, where string is a short notation for java.lang.String and BigInteger as well as BigDecimal are the classes in java.math. The type name is not case-sensitive. For example:
cast(price, double)
The fully-qualified class name of the class to cast to, for example:
cast(product, org.myproducer.Product)
The cast function is often used to provide a type for dynamic (unchecked) properties. Dynamic properties are properties whose type is not known at compile type. These properties are always of type java.lang.Object.
The cast function as shown in the next statement casts the dynamic "price" property of an "item" in the OrderEvent to a double value.
select cast(item.price?, double) from OrderEvent
The cast function returns a null value if the expression result cannot be casted to the desired type, or if the expression result itself is null.
The cast function adheres to the following type conversion rules:
For all numeric types, the cast function utilitzes java.lang.Number to convert numeric types, if required.
For casts to string or java.lang.String, the function calls toString on the expression result.
For casts to other objects including application objects, the cast function considers a Java class's superclasses as well as all directly or indirectly-implemented interfaces by superclasses .
The result of the coalesce function is the first expression in a list of expressions that returns a non-null value. The return type is the compatible aggregated type of all return values.
This example returns a String-typed result of value 'foo':
select coalesce(null, 'foo') from ...
The current_timestamp function is a reserved keyword and requires no parameters. The result of the current_timestamp function is the long-type millisecond value of the current engine system time.
The function returns the current engine timestamp at the time of expression evaluation. When using external-timer events, the function provides the last value of the externally-supplied time at the time of expression evaluation.
This example selects the current engine time:
select current_timestamp from MyEvent // equivalent to select current_timestamp() from MyEvent
The exists function returns a boolean value indicating whether the dynamic property, provided as a parameter to the function, exists on the event. The exists function accepts a single dynamic property name as its only parameter.
The exists function is for use with dynamic (unchecked) properties. Dynamic properties are properties whose type is not known at compile type. Dynamic properties return a null value if the dynamic property does not exists on an event, or if the dynamic property exists but the value of the dynamic property is null.
The exists function as shown next returns true if the "item" property contains an object that has a "serviceName" property. It returns false if the "item" property is null, or if the "item" property does not contain an object that has a property named "serviceName" :
select exists(item.serviceName?) from OrderEvent
The instanceof function returns a boolean value indicating whether the type of value returned by the expression is one of the given types. The first parameter to the instanceof function is an expression to evaluate. The second and subsequent parameters are Java type names.
The function determines the return type of the expression at runtime by evaluating the expression, and compares the type of object returned by the expression to the defined types. If the type of object returned by the expression matches any of the given types, the function returns true. If the expression returned null or a type that does not match any of the given types, the function returns false.
The instanceof function is often used in conjunction with dynamic (unchecked) properties. Dynamic properties are properties whose type is not known at compile type.
This example uses the instanceof function to select different properties based on the type:
select case when instanceof(item, com.mycompany.Service) then serviceName? when instanceof(item, com.mycompany.Product) then productName? end from OrderEvent
The instanceof function returns false if the expression tested by instanceof returned null.
Valid parameters for the type parameter list are:
Any of the Java built-in types: int, long, byte, short, char, double, float, string, where string is a short notation for java.lang.String. The type name is not case-sensitive. For example, the next function tests if the dynamic "price" property is either of type float or type double:
instanceof(price?, double, float)
The fully-qualified class name of the class to cast to, for example:
instanceof(product, org.myproducer.Product)
The function considers an event class's superclasses as well as all the directly or indirectly-implemented interfaces by superclasses.
The min and max function take two or more parameters that itself can be expressions. The min function returns the lowest numeric value among the 2 or more comma-separated expressions, while the max function returns the highest numeric value. The return type is the compatible aggregated type of all return values.
The next example shows the max function that has a Double return type and returns the value 1.1.
select max(1, 1.1, 2 * 0.5) from ...
The min function returns the lowest value. The statement below uses the function to determine the smaller of two timestamp values.
select symbol, min(ticks.timestamp, news.timestamp) as minT from StockTickEvent.win:time(30 sec) as ticks, NewsEvent.win:time(30 sec) as news where ticks.symbol = news.symbol
The prev function returns the property value of a previous event. The first parameter denotes the i-th previous event in the order established by the data window. The second parameter is a property name for which the function returns the value for the previous event.
This example selects the value of the price property of the 2nd-previous event from the current Trade event.
select prev(2, price) from Trade.win:length(10)
Since the prev function takes the order established by the data window into account, the function works well with sorted windows. In the following example the statement selects the symbol of the 3 Trade events that had the largest, second-largest and third-largest volume.
select prev(0, symbol), prev(1, symbol), prev(2, symbol) from Trade.ext:sort(volume, true, 10)
The i-th previous event parameter can also be an expression returning an Integer-type value. The next statement joins the Trade data window with an RankSelectionEvent event that provides a rank property used to look up a certain position in the sorted Trade data window:
select prev(rank, symbol) from Trade.ext:sort(volume, true, 10), RankSelectionEvent
And the expression count(*) - 1 allows us to select the oldest event in the length window:
select prev(count(*) - 1, price) from Trade.win:length(100)
The prev function returns a null value if the data window does not currently hold the i-th previous event. The example below illustrates this using a time batch window. Here the prev function returns a null value for any events in which the previous event is not in the same batch of events. Note that the prior function as discussed below can be used if a null value is not the desired result.
select prev(1, symbol) from Trade.win:time_batch(1 min)
The combination of prev function and group-by view returns the property value for a previous event in the given group.
Let's look at an example. Assume we want to obtain the price of the previous event of the same symbol as the current event.
The statement that follows solves this problem. It declares a group-by view grouping on the symbol property and a time window of 1 minute. As a result, when the engine encounters a new symbol value that it hasn't seen before, it creates a new time window specifically to hold events for that symbol. Consequently, the previous function returns the previous event within the respective time window for that event's symbol value.
select prev(1, price) as prevPrice from Trade.std:groupby(symbol).win:time(1 min)
In a second example, assume we need to return, for each event, the current top price per symbol. We can use the prev to obtain the highest price from a sorted data window, and use the group-by view to group by symbol:
select prev(0, price) as topPricePerSymbol from Trade.std:groupby(symbol).ext:sort(price, false, 1)