Table Operations: An Overview
This section describes the basic transformations you can do with Pathway.
Assignment and renaming
You can create a column in a table using the select
and assignment (=
) operators:
t.select(new_col=t.colA + t.colB)
t.select(new_col="default value")
To rename a column, you can do the same (use select
) or use rename
:
t.select(new_col=t.old_col)
t.rename(new_col=t.old_col)
Selection and indexing
Description | Operators | Example |
---|---|---|
Select a column | select and dot/bracket notations |
|
Select all columns | select and star notation |
|
Removing columns | without and dot/bracket notations |
|
Referring to the current table | pw.this |
|
Referring to a table in a join/window | pw.left and pw.right notations |
|
Reference indexing | ix_ref |
|
Reindexing | with_id_from |
|
Arithmetic operators
Description | Operators | Example |
---|---|---|
Addition |
|
|
Subtraction |
|
|
Multiplication |
|
|
Division |
|
|
Floor division |
|
|
Modulus |
|
|
Exponentiation |
|
|
Comparisons operators
Description | Operators | Example |
---|---|---|
Equal |
|
|
Not equal |
|
|
Greater than |
|
|
Less than |
|
|
Greater than or equal to |
|
|
Less than or equal to |
|
|
Boolean operators
Description | Operators | Example |
---|---|---|
And |
|
|
Or |
|
|
Not |
|
|
Exclusive or (XOR) |
|
|
Filtering
You can filter rows using the filter
operator:
t.filter(~pw.this.column)
t.filter(pw.this.column > value)
Missing data
The pw.coalesce
operator returns the first not-None value from the given columns:
t.select(new_col=pw.coalesce(t.colA, t.colB))
t.select(new_col=pw.coalesce(t.colA, 10)
Aggregation
You can aggregate data across the rows of the table using the groupby and reduce operators:
t.groupby(pw.this.column).reduce(sum=pw.reducers.sum(pw.this.value))
You can read our dedicated tutorial to learn more about it.
Reducers
Pathway provides several reducers to use on the aggregated values:
Reducer | Example |
---|---|
any |
|
argmax |
|
argmin |
|
avg |
|
earliest |
|
latest |
|
max |
|
min |
|
ndarray |
|
sorted_tuple |
|
sum |
|
tuple |
|
unique |
|
You can also create your own stateful reducers.
Joins
You can use a join to combine columns from two different tables by associating rows from both tables which are matching on some given values:
t1.join(t2, pw.left.column == pw.right.column).select(...)
Read our tutorial about joins to learn more about how to do joins in Pathway.
Union and Concatenation
Description | Operators | Example |
---|---|---|
Union | + or += |
|
Concatenation | concat_reindex |
|
Updating cell values
You can update the cells of a table using the content of another table using the update_cells
operator (<<
):
t.update_cells(t_new)
t << t_new
Flattening a column
You can transform a column containing iterables or JSON arrays into multiple rows using the flatten
operator:
t.flatten(t.col_to_flatten)
Column operations
Description | Operators | Example |
---|---|---|
Applying a function to each cell of a column. | pw.apply in a select |
|
Folding columns into a single one. |
|
|
User-defined functions (UDF)
Pathway allows you to define your own User-defined functions. See our tutorial to learn more about it.