How to trigger TimeSeriesEngine without waiting for next window data

Summary

The problem lies in the delayed aggregation of 5-minute K-line data into 1-hour K-lines using DolphinDB’s TimeSeriesEngine. The current implementation waits for the next window’s data to arrive before triggering the aggregation, resulting in a significant delay. This delay is unacceptable for live trading.

Root Cause

The root cause of the issue is the configuration of the TimeSeriesEngine. Specifically, the useSystemTime parameter is set to false, which causes the engine to wait for the next window’s data to arrive before aggregating the current window’s data. The alternative approach using useSystemTime=true also has problems.

Why This Happens in Real Systems

This issue occurs in real systems due to the following reasons:

Inconsistent data arrival times: The 5-minute K-line data arrives at inconsistent times, making it difficult to predict when the next window’s data will arrive.
Rigid window sizes: The TimeSeriesEngine uses fixed window sizes, which can lead to delays when the data arrival times are not synchronized with the window boundaries.
Aggregation triggers: The aggregation triggers are based on the arrival of new data, which can cause delays when the data arrival times are not predictable.

Real-World Impact

The delayed aggregation has significant real-world impacts, including:

Inaccurate trading decisions: The delayed data can lead to inaccurate trading decisions, resulting in financial losses.
Inefficient risk management: The delayed data can also lead to inefficient risk management, as the trading system may not be able to respond quickly to changing market conditions.
Reduced competitiveness: The delayed data can reduce the competitiveness of the trading system, as other systems may be able to respond more quickly to changing market conditions.

Example or Code

share streamTable(1000:0, `Symbol`Timestamp`Open`High`Low`Close`Volume, [SYMBOL, TIMESTAMP, DOUBLE, DOUBLE, DOUBLE, DOUBLE, LONG] ) as kline5min
share streamTable(1000:0, `Timestamp`Symbol`Open`High`Low`Close`Volume, [TIMESTAMP, SYMBOL, DOUBLE, DOUBLE, DOUBLE, DOUBLE, LONG] ) as kline1hour

engine1 = createDailyTimeSeriesEngine(
  name = "hourlyEngine",
  windowSize = 3600000,
  step = 3600000,
  metrics = ,
  dummyTable = kline5min,
  outputTable = kline1hour,
  timeColumn = `Timestamp,
  keyColumn = `Symbol,
  useSystemTime = false,
  useWindowStartTime = false
)

subscribeTable(tableName="kline5min", actionName="agg1h", handler=append!{engine1}, msgAsTable=true)

How Senior Engineers Fix It

Senior engineers can fix this issue by:

Implementing a custom aggregation trigger: Instead of relying on the TimeSeriesEngine’s built-in aggregation trigger, senior engineers can implement a custom trigger that aggregates the data as soon as it is available.
Using a more advanced data processing framework: Senior engineers can use a more advanced data processing framework, such as Apache Kafka or Apache Flink, which provides more flexible and customizable data processing capabilities.
Optimizing the TimeSeriesEngine configuration: Senior engineers can optimize the TimeSeriesEngine configuration to reduce the delay, such as by reducing the window size or increasing the frequency of the aggregation trigger.

Why Juniors Miss It

Junior engineers may miss this issue due to:

Lack of experience with real-time data processing: Junior engineers may not have experience with real-time data processing and may not be aware of the potential delays and issues that can arise.
Insufficient understanding of the TimeSeriesEngine: Junior engineers may not fully understand the TimeSeriesEngine and its configuration options, which can lead to suboptimal configuration and delayed aggregation.
Inadequate testing and validation: Junior engineers may not thoroughly test and validate their implementation, which can lead to delays and issues going undetected.