TEST

Sliding Window

A strategy for processing (stream) data by specific limited frames, usually time periods. A sliding window moves through the data stream in a fixed-size, overlapping manner. Each window collects and processes a fixed number of data items or a fixed duration of data, after which the window is moved forward by a fixed amount. Sliding windows differs from fixed windows by allowing data overlap. This means a single event can belong to multiple sliding windows.

What's the difference between a sliding window and a tumbling window?

The difference between a

and a sliding window is whether the intervals overlap. Tumbling windows are intervals that do not overlap. Sliding windows are intervals that do overlap.

For realtime monitoring you would usually prefer a sliding window over tumbling ones as the latter cut the data in non-overlapping parts: a wrong cut could prevent it from detecting the pattern you are looking for.

How can I perform a sliding window in Python?

You can use Pathway to perform sliding window operations on your data:

>>> import pathway as pw
>>> t = pw.debug.table_from_markdown(
... '''
...        | shard | t
...    1   | 0     |  12
...    2   | 0     |  13
...    3   | 0     |  14
...    4   | 0     |  15
...    5   | 0     |  16
...    6   | 0     |  17
...    7   | 1     |  10
...    8   | 1     |  11
... ''')
>>> result = t.windowby(
...     t.t, window=pw.window.sliding(duration=10, hop=3), shard=t.shard
... ).reduce(
...   pw.this.window,
...   min_t=pw.reducers.min(pw.this.t),
...   max_t=pw.reducers.max(pw.this.t),
...   count=pw.reducers.count(pw.this.t),
... )
>>> pw.debug.compute_and_print(result, include_id=False)
window      | min_t | max_t | count
(0, 3, 13)  | 12    | 12    | 1
(0, 6, 16)  | 12    | 15    | 4
(0, 9, 19)  | 12    | 17    | 6
(0, 12, 22) | 12    | 17    | 6
(0, 15, 25) | 15    | 17    | 3
(1, 3, 13)  | 10    | 11    | 2
(1, 6, 16)  | 10    | 11    | 2
(1, 9, 19)  | 10    | 11    | 2