Table Operations: An Overview
This section describes the basic transformations you can do with Pathway.
Assignment and renaming
You can create a column in a table using the select and assignment (=) operators:
t.select(new_col=t.colA + t.colB)t.select(new_col="default value")
To rename a column, you can do the same (use select) or use rename:
t.select(new_col=t.old_col)t.rename(new_col=t.old_col)
Selection and indexing
| Description | Operators | Example |
|---|---|---|
| Select a column | select and dot/bracket notations |
t.select(t.colA, t['colB'], pw.this.colC)
|
| Select all columns | select and star notation |
t.select(*pw.this)
|
| Removing columns | without and dot/bracket notations |
t.without(t.colA, t.colB)
|
| Referring to the current table | pw.this |
t1.select(new_col=pw.this.colA + pw.this.colB)
|
| Referring to a table in a join/window | pw.left and pw.right notations |
t1.join(t2, pw.left.colA == pw.right.colB).reduce(*pw.left, pw.right.colC)
|
| Reference indexing | ix_ref |
t_selected_ids.select(selected=t.ix_ref(column).name)
|
| Reindexing | with_id_from |
t.with_id_from(t_new_ids.new_id_source)
|
Arithmetic operators
| Description | Operators | Example |
|---|---|---|
| Addition |
+
|
t.select(new_col=t.colA + t.colB)
|
| Subtraction |
-
|
t.select(new_col=t.colA - t.colB)
|
| Multiplication |
*
|
t.select(new_col=t.colA * t.colB)
|
| Division |
/
|
t.select(new_col=t.colA / t.colB)
|
| Floor division |
//
|
t.select(new_col=t.colA // t.colB)
|
| Modulus |
%
|
t.select(new_col=t.colA % t.colB)
|
| Exponentiation |
**
|
t.select(new_col=t.colA ** t.colB)
|
Comparisons operators
| Description | Operators | Example |
|---|---|---|
| Equal |
==
|
t.select(new_col=t.colA == t.colB)
|
| Not equal |
!=
|
t.select(new_col=t.colA != t.colB)
|
| Greater than |
>
|
t.select(new_col=t.colA > t.colB)
|
| Less than |
<
|
t.select(new_col=t.colA < t.colB)
|
| Greater than or equal to |
>=
|
t.select(new_col=t.colA >= t.colB)
|
| Less than or equal to |
<=
|
t.select(new_col=t.colA <= t.colB)
|
Boolean operators
| Description | Operators | Example |
|---|---|---|
| And |
&
|
t.select(new_col=t.colA & t.colB)
|
| Or |
|
|
t.select(new_col=t.colA | t.colB)
|
| Not |
~
|
t.select(new_col=~t.colA)
|
| Exclusive or (XOR) |
^
|
t.select(new_col=t.colA ^ t.colB)
|
Filtering
You can filter rows using the filter operator:
t.filter(~pw.this.column)t.filter(pw.this.column > value)
Missing data
The pw.coalesce operator returns the first not-None value from the given columns:
t.select(new_col=pw.coalesce(t.colA, t.colB))t.select(new_col=pw.coalesce(t.colA, 10)
Aggregation
You can aggregate data across the rows of the table using the groupby and reduce operators:
t.groupby(pw.this.column).reduce(sum=pw.reducers.sum(pw.this.value))
You can read our dedicated tutorial to learn more about it.
Reducers
Pathway provides several reducers to use on the aggregated values:
| Reducer | Example |
|---|---|
| any |
t.groupby(t.colA).reduce(col_any=pw.reducers.any(t.colB))
|
| argmax |
t.groupby(t.colA).reduce(col_argmax=pw.reducers.argmax(t.colB))
|
| argmin |
t.groupby(t.colA).reduce(col_argmin=pw.reducers.argmin(t.colB))
|
| avg |
t.groupby(t.colA).reduce(col_avg=pw.reducers.avg(t.colB))
|
| earliest |
t.groupby(t.colA).reduce(col_min=pw.reducers.earliest(t.colB))
|
| latest |
t.groupby(t.colA).reduce(col_max=pw.reducers.latest(t.colB))
|
| max |
t.groupby(t.colA).reduce(col_max=pw.reducers.max(t.colB))
|
| min |
t.groupby(t.colA).reduce(col_min=pw.reducers.min(t.colB))
|
| ndarray |
t.groupby(t.colA).reduce(col_array=pw.reducers.ndarray(t.colB))
|
| sorted_tuple |
t.groupby(t.colA).reduce(col_tuple=pw.reducers.sorted_tuple(t.colB))
|
| sum |
t.groupby(t.colA).reduce(col_sum=pw.reducers.sum(t.colB))
|
| tuple |
t.groupby(t.colA).reduce(col_tuple=pw.reducers.tuple(t.colB))
|
| unique |
t.groupby(t.colA).reduce(col_unique=pw.reducers.unique(t.colB))
|
You can also create your own stateful reducers.
Joins
You can use a join to combine columns from two different tables by associating rows from both tables which are matching on some given values:
t1.join(t2, pw.left.column == pw.right.column).select(...)
Read our tutorial about joins to learn more about how to do joins in Pathway.
Union and Concatenation
| Description | Operators | Example |
|---|---|---|
| Union | + or += |
t1 + t2
t1 += t2 modifies t1 |
| Concatenation | concat_reindex |
pw.Table.concat_reindex(t1, t2)
|
Updating cell values
You can update the cells of a table using the content of another table using the update_cells operator (<<):
t.update_cells(t_new)t << t_new
Flattening a column
You can transform a column containing iterables or JSON arrays into multiple rows using the flatten operator:
t.flatten(t.col_to_flatten)
Column operations
| Description | Operators | Example |
|---|---|---|
| Applying a function to each cell of a column. | pw.apply in a select |
t.select(new_col=pw.apply(func, pw.this.col))
|
| Folding columns into a single one. |
pw.make_tuple
|
t.select(new_col=pw.make_tuple(t.a, t.b, t.c))
|
User-defined functions (UDF)
Pathway allows you to define your own User-defined functions. See our tutorial to learn more about it.