Join API
Contains reference for helper classes related to joins.
class pw.JoinMode(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
class pw.FilteredJoinResult(join_result, filtering)
filter(filter_expression)
Filters rows, keeping the ones satisfying the predicate.Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age owner pet
1 10 Alice 1
2 9 Bob 1
3 8 Alice 2
''')
t2 = pw.debug.parse_to_table('''
age owner pet size
11 10 Alice 3 M
12 9 Bob 1 L
13 8 Tom 1 XL
''')
result = t1.join(t2).filter(t1.owner == t2.owner).select(t1.age, t2.size) # noqa: E501
pw.debug.compute_and_print(result, include_id=False)
groupby(*args, id=None)
Groups join result by columns from args.NOTE: Usually followed by .reduce() that aggregates the result and returns a table.
- Parameters
- args (
ColumnReference
) – columns to group by. - id (
Optional
[ColumnReference
]) – if provided, is the column used to set id’s of the rows of the result
- args (
- Returns
GroupedJoinResult – Groupby object.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
cost owner pet
1 100 Alice 1
2 90 Bob 1
3 80 Alice 2
''')
t2 = pw.debug.parse_to_table('''
cost owner pet size
11 100 Alice 3 M
12 90 Bob 1 L
13 80 Tom 1 XL
''')
result = (t1.join(t2, t1.owner==t2.owner).groupby(pw.this.owner)
.reduce(pw.this.owner, pairs = pw.reducers.count()))
pw.debug.compute_and_print(result, include_id=False)
join(other, *on, id=None, how=JoinMode.INNER)
Join self with other using the given join expression.- Parameters
- other (
Joinable
) – the right side of the join. - on (
ColumnExpression
) – a list of column expressions. Each must have == as the top level operation and be of the form LHS: ColumnReference == RHS: ColumnReference. - id (
Optional
[ColumnReference
]) – optional argument for id of result, can be only self.id or other.id - how (
JoinMode
) – by default, inner join is performed. Possible values are JoinMode.{INNER,LEFT,RIGHT,OUTER} correspond to inner, left, right and outer join respectively.
- other (
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age | owner | pet
10 | Alice | 1
9 | Bob | 1
8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
age | owner | pet | size
10 | Alice | 3 | M
9 | Bob | 1 | L
8 | Tom | 1 | XL
''')
t3 = t1.join(
t2, t1.pet == t2.pet, t1.owner == t2.owner, how=pw.JoinMode.INNER
).select(age=t1.age, owner_name=t2.owner, size=t2.size)
pw.debug.compute_and_print(t3, include_id = False)
join_inner(other, *on, id=None)
Inner-joins two tables or join results.- Parameters
- other (
Joinable
) – the right side of the join. - on (
ColumnExpression
) – a list of column expressions. Each must have == as the top level operation and be of the form LHS: ColumnReference == RHS: ColumnReference. - id (
Optional
[ColumnReference
]) – optional argument for id of result, can be only self.id or other.id
- other (
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age | owner | pet
10 | Alice | 1
9 | Bob | 1
8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
age | owner | pet | size
10 | Alice | 3 | M
9 | Bob | 1 | L
8 | Tom | 1 | XL
''')
t3 = t1.join(t2, t1.pet == t2.pet, t1.owner == t2.owner, how=pw.JoinMode.INNER).select(age=t1.age, owner_name=t2.owner, size=t2.size) # noqa: E501
pw.debug.compute_and_print(t3, include_id = False)
join_left(other, *on, id=None)
Left-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- for rows from the left side that were not matched with the right side, missing values on the right are replaced with None
- rows from the right side that were not matched with the left side are skipped
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_left(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(t1.B + t2.D,t2.id)),
include_id=False)
join_outer(other, *on, id=None)
Outer-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- for rows from the left side that were not matched with the right side, missing values on the right are replaced with None
- for rows from the right side that were not matched with the left side, missing values on the left are replaced with None
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_outer(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(t1.B + t2.D,t1.id,t2.id)),
include_id=False)
join_right(other, *on, id=None)
Outer-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- rows from the left side that were not matched with the right side are skipped
- for rows from the right side that were not matched with the left side, missing values on the left are replaced with None
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_right(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(pw.coalesce(t1.B,0) + t2.D,t1.id)),
include_id=False)
- Returns
OuterJoinResult object
promise_universe_is_equal_to(other)
Asserts to Pathway that an universe of self is a subset of universe of each of the others.Semantics: Used in situations where Pathway cannot deduce one universe being a subset of another.
- Returns
None
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| pet
1 | Dog
7 | Cat
''')
t2 = pw.debug.parse_to_table('''
| age
1 | 10
7 | 3
''')
t1 = t1.promise_universe_is_equal_to(t2)
t3 = t1 + t2
pw.debug.compute_and_print(t3, include_id=False)
promise_universe_is_subset_of(other)
Asserts to Pathway that an universe of self is a subset of universe of each of the other.Semantics: Used in situations where Pathway cannot deduce one universe being a subset of another.
- Returns
self
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 1
2 | 9 | Bob | 1
3 | 8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 30
''').promise_universe_is_subset_of(t1)
t3 = t1 << t2
pw.debug.compute_and_print(t3, include_id=False)
promise_universes_are_disjoint(other)
Asserts to Pathway that an universe of self is disjoint from universe of other.Semantics: Used in situations where Pathway cannot deduce universes are disjoint.
- Returns
self
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 1
2 | 9 | Bob | 1
3 | 8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
| age | owner | pet
11 | 11 | Alice | 30
12 | 12 | Tom | 40
''').promise_universes_are_disjoint(t1)
t3 = t1.concat(t2)
pw.debug.compute_and_print(t3, include_id=False)
reduce(*args, **kwargs)
Reduce a join result to a single row.Equivalent to self.groupby().reduce(*args, **kwargs).
- Parameters
- args (
ColumnReference
) – reducer to reduce the table with - kwargs (
ColumnExpression
) – reducer to reduce the table with. Its key is the new name of a column.
- args (
- Returns
Table – Reduced table.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
cost owner pet
1 100 Alice 1
2 90 Bob 1
3 80 Alice 2
''')
t2 = pw.debug.parse_to_table('''
cost owner pet size
11 100 Alice 3 M
12 90 Bob 1 L
13 80 Tom 1 XL
''')
result = t1.join(t2, t1.owner==t2.owner).reduce(total_pairs = pw.reducers.count())
pw.debug.compute_and_print(result, include_id=False)
select(*args, **kwargs)
Computes result of a join.- Parameters
- args (
ColumnReference
) – Column references. - kwargs (
Any
) – Column expressions with their new assigned names.
- args (
- Returns
Table – Created table.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age | owner | pet
10 | Alice | 1
9 | Bob | 1
8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
age | owner | pet | size
10 | Alice | 3 | M
9 | Bob | 1 | L
8 | Tom | 1 | XL
''')
t3 = t1.join(t2, t1.pet == t2.pet, t1.owner == t2.owner).select(age=t1.age, owner_name=t2.owner, size=t2.size) # noqa: E501
pw.debug.compute_and_print(t3, include_id=False)
class pw.JoinResult(_universe, _context, _left_table, _right_table, _original_left, _original_right, _substitution, _chained_join_desugaring, _joined_on_names, _join_mode)
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age owner pet
1 10 Alice 1
2 9 Bob 1
3 8 Alice 2
''')
t2 = pw.debug.parse_to_table('''
age owner pet size
11 10 Alice 3 M
12 9 Bob 1 L
13 8 Tom 1 XL
''')
joinresult= t1.join(t2, t1.pet == t2.pet, t1.owner == t2.owner) # noqa: E501
isinstance(joinresult, pw.JoinResult)
pw.debug.compute_and_print(joinresult.select(t1.age, t2.size), include_id=False)
filter(filter_expression)
Filters rows, keeping the ones satisfying the predicate.Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age owner pet
1 10 Alice 1
2 9 Bob 1
3 8 Alice 2
''')
t2 = pw.debug.parse_to_table('''
age owner pet size
11 10 Alice 3 M
12 9 Bob 1 L
13 8 Tom 1 XL
''')
result = t1.join(t2).filter(t1.owner == t2.owner).select(t1.age, t2.size) # noqa: E501
pw.debug.compute_and_print(result, include_id=False)
groupby(*args, id=None)
Groups join result by columns from args.NOTE: Usually followed by .reduce() that aggregates the result and returns a table.
- Parameters
- args (
ColumnReference
) – columns to group by. - id (
Optional
[ColumnReference
]) – if provided, is the column used to set id’s of the rows of the result
- args (
- Returns
GroupedJoinResult – Groupby object.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
cost owner pet
1 100 Alice 1
2 90 Bob 1
3 80 Alice 2
''')
t2 = pw.debug.parse_to_table('''
cost owner pet size
11 100 Alice 3 M
12 90 Bob 1 L
13 80 Tom 1 XL
''')
result = (t1.join(t2, t1.owner==t2.owner).groupby(pw.this.owner)
.reduce(pw.this.owner, pairs = pw.reducers.count()))
pw.debug.compute_and_print(result, include_id=False)
join(other, *on, id=None, how=JoinMode.INNER)
Join self with other using the given join expression.- Parameters
- other (
Joinable
) – the right side of the join. - on (
ColumnExpression
) – a list of column expressions. Each must have == as the top level operation and be of the form LHS: ColumnReference == RHS: ColumnReference. - id (
Optional
[ColumnReference
]) – optional argument for id of result, can be only self.id or other.id - how (
JoinMode
) – by default, inner join is performed. Possible values are JoinMode.{INNER,LEFT,RIGHT,OUTER} correspond to inner, left, right and outer join respectively.
- other (
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age | owner | pet
10 | Alice | 1
9 | Bob | 1
8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
age | owner | pet | size
10 | Alice | 3 | M
9 | Bob | 1 | L
8 | Tom | 1 | XL
''')
t3 = t1.join(
t2, t1.pet == t2.pet, t1.owner == t2.owner, how=pw.JoinMode.INNER
).select(age=t1.age, owner_name=t2.owner, size=t2.size)
pw.debug.compute_and_print(t3, include_id = False)
join_inner(other, *on, id=None)
Inner-joins two tables or join results.- Parameters
- other (
Joinable
) – the right side of the join. - on (
ColumnExpression
) – a list of column expressions. Each must have == as the top level operation and be of the form LHS: ColumnReference == RHS: ColumnReference. - id (
Optional
[ColumnReference
]) – optional argument for id of result, can be only self.id or other.id
- other (
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age | owner | pet
10 | Alice | 1
9 | Bob | 1
8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
age | owner | pet | size
10 | Alice | 3 | M
9 | Bob | 1 | L
8 | Tom | 1 | XL
''')
t3 = t1.join(t2, t1.pet == t2.pet, t1.owner == t2.owner, how=pw.JoinMode.INNER).select(age=t1.age, owner_name=t2.owner, size=t2.size) # noqa: E501
pw.debug.compute_and_print(t3, include_id = False)
join_left(other, *on, id=None)
Left-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- for rows from the left side that were not matched with the right side, missing values on the right are replaced with None
- rows from the right side that were not matched with the left side are skipped
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_left(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(t1.B + t2.D,t2.id)),
include_id=False)
join_outer(other, *on, id=None)
Outer-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- for rows from the left side that were not matched with the right side, missing values on the right are replaced with None
- for rows from the right side that were not matched with the left side, missing values on the left are replaced with None
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_outer(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(t1.B + t2.D,t1.id,t2.id)),
include_id=False)
join_right(other, *on, id=None)
Outer-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- rows from the left side that were not matched with the right side are skipped
- for rows from the right side that were not matched with the left side, missing values on the left are replaced with None
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_right(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(pw.coalesce(t1.B,0) + t2.D,t1.id)),
include_id=False)
- Returns
OuterJoinResult object
promise_universe_is_equal_to(other)
Asserts to Pathway that an universe of self is a subset of universe of each of the others.Semantics: Used in situations where Pathway cannot deduce one universe being a subset of another.
- Returns
None
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| pet
1 | Dog
7 | Cat
''')
t2 = pw.debug.parse_to_table('''
| age
1 | 10
7 | 3
''')
t1 = t1.promise_universe_is_equal_to(t2)
t3 = t1 + t2
pw.debug.compute_and_print(t3, include_id=False)
promise_universe_is_subset_of(other)
Asserts to Pathway that an universe of self is a subset of universe of each of the other.Semantics: Used in situations where Pathway cannot deduce one universe being a subset of another.
- Returns
self
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 1
2 | 9 | Bob | 1
3 | 8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 30
''').promise_universe_is_subset_of(t1)
t3 = t1 << t2
pw.debug.compute_and_print(t3, include_id=False)
promise_universes_are_disjoint(other)
Asserts to Pathway that an universe of self is disjoint from universe of other.Semantics: Used in situations where Pathway cannot deduce universes are disjoint.
- Returns
self
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 1
2 | 9 | Bob | 1
3 | 8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
| age | owner | pet
11 | 11 | Alice | 30
12 | 12 | Tom | 40
''').promise_universes_are_disjoint(t1)
t3 = t1.concat(t2)
pw.debug.compute_and_print(t3, include_id=False)
reduce(*args, **kwargs)
Reduce a join result to a single row.Equivalent to self.groupby().reduce(*args, **kwargs).
- Parameters
- args (
ColumnReference
) – reducer to reduce the table with - kwargs (
ColumnExpression
) – reducer to reduce the table with. Its key is the new name of a column.
- args (
- Returns
Table – Reduced table.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
cost owner pet
1 100 Alice 1
2 90 Bob 1
3 80 Alice 2
''')
t2 = pw.debug.parse_to_table('''
cost owner pet size
11 100 Alice 3 M
12 90 Bob 1 L
13 80 Tom 1 XL
''')
result = t1.join(t2, t1.owner==t2.owner).reduce(total_pairs = pw.reducers.count())
pw.debug.compute_and_print(result, include_id=False)
select(*args, **kwargs)
Computes result of a join.- Parameters
- args (
ColumnReference
) – Column references. - kwargs (
Any
) – Column expressions with their new assigned names.
- args (
- Returns
Table – Created table.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age | owner | pet
10 | Alice | 1
9 | Bob | 1
8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
age | owner | pet | size
10 | Alice | 3 | M
9 | Bob | 1 | L
8 | Tom | 1 | XL
''')
t3 = t1.join(t2, t1.pet == t2.pet, t1.owner == t2.owner).select(age=t1.age, owner_name=t2.owner, size=t2.size) # noqa: E501
pw.debug.compute_and_print(t3, include_id=False)
class pw.Joinable(universe)
join(other, *on, id=None, how=JoinMode.INNER)
Join self with other using the given join expression.- Parameters
- other (
Joinable
) – the right side of the join. - on (
ColumnExpression
) – a list of column expressions. Each must have == as the top level operation and be of the form LHS: ColumnReference == RHS: ColumnReference. - id (
Optional
[ColumnReference
]) – optional argument for id of result, can be only self.id or other.id - how (
JoinMode
) – by default, inner join is performed. Possible values are JoinMode.{INNER,LEFT,RIGHT,OUTER} correspond to inner, left, right and outer join respectively.
- other (
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age | owner | pet
10 | Alice | 1
9 | Bob | 1
8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
age | owner | pet | size
10 | Alice | 3 | M
9 | Bob | 1 | L
8 | Tom | 1 | XL
''')
t3 = t1.join(
t2, t1.pet == t2.pet, t1.owner == t2.owner, how=pw.JoinMode.INNER
).select(age=t1.age, owner_name=t2.owner, size=t2.size)
pw.debug.compute_and_print(t3, include_id = False)
join_inner(other, *on, id=None)
Inner-joins two tables or join results.- Parameters
- other (
Joinable
) – the right side of the join. - on (
ColumnExpression
) – a list of column expressions. Each must have == as the top level operation and be of the form LHS: ColumnReference == RHS: ColumnReference. - id (
Optional
[ColumnReference
]) – optional argument for id of result, can be only self.id or other.id
- other (
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
age | owner | pet
10 | Alice | 1
9 | Bob | 1
8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
age | owner | pet | size
10 | Alice | 3 | M
9 | Bob | 1 | L
8 | Tom | 1 | XL
''')
t3 = t1.join(t2, t1.pet == t2.pet, t1.owner == t2.owner, how=pw.JoinMode.INNER).select(age=t1.age, owner_name=t2.owner, size=t2.size) # noqa: E501
pw.debug.compute_and_print(t3, include_id = False)
join_left(other, *on, id=None)
Left-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- for rows from the left side that were not matched with the right side, missing values on the right are replaced with None
- rows from the right side that were not matched with the left side are skipped
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_left(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(t1.B + t2.D,t2.id)),
include_id=False)
join_outer(other, *on, id=None)
Outer-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- for rows from the left side that were not matched with the right side, missing values on the right are replaced with None
- for rows from the right side that were not matched with the left side, missing values on the left are replaced with None
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_outer(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(t1.B + t2.D,t1.id,t2.id)),
include_id=False)
join_right(other, *on, id=None)
Outer-joins two tables or join results.- Parameters
- other (
Joinable
) – Table or join result. - *on (
ColumnExpression
) – Columns to join, syntax self.col1 == other.col2 - id (
Optional
[ColumnReference
]) – optional id column of the result
- other (
Remarks: args cannot contain id column from either of tables, as the result table has id column with auto-generated ids; it can be selected by assigning it to a column with defined name (passed in kwargs)
Behavior:
- rows from the left side that were not matched with the right side are skipped
- for rows from the right side that were not matched with the left side, missing values on the left are replaced with None
- for rows that were matched the behavior is the same as that of an inner join.
- Returns
JoinResult – an object on which .select() may be called to extract relevant columns from the result of the join.
Example:
import pathway as pw
t1 = pw.debug.table_from_markdown(
'''
| A | B
1 | 11 | 111
2 | 12 | 112
3 | 13 | 113
4 | 13 | 114
'''
)
t2 = pw.debug.table_from_markdown(
'''
| C | D
1 | 11 | 211
2 | 12 | 212
3 | 14 | 213
4 | 14 | 214
'''
)
pw.debug.compute_and_print(t1.join_right(t2, t1.A == t2.C
).select(t1.A, t2_C=t2.C, S=pw.require(pw.coalesce(t1.B,0) + t2.D,t1.id)),
include_id=False)
- Returns
OuterJoinResult object
promise_universe_is_equal_to(other)
Asserts to Pathway that an universe of self is a subset of universe of each of the others.Semantics: Used in situations where Pathway cannot deduce one universe being a subset of another.
- Returns
None
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| pet
1 | Dog
7 | Cat
''')
t2 = pw.debug.parse_to_table('''
| age
1 | 10
7 | 3
''')
t1 = t1.promise_universe_is_equal_to(t2)
t3 = t1 + t2
pw.debug.compute_and_print(t3, include_id=False)
promise_universe_is_subset_of(other)
Asserts to Pathway that an universe of self is a subset of universe of each of the other.Semantics: Used in situations where Pathway cannot deduce one universe being a subset of another.
- Returns
self
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 1
2 | 9 | Bob | 1
3 | 8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 30
''').promise_universe_is_subset_of(t1)
t3 = t1 << t2
pw.debug.compute_and_print(t3, include_id=False)
promise_universes_are_disjoint(other)
Asserts to Pathway that an universe of self is disjoint from universe of other.Semantics: Used in situations where Pathway cannot deduce universes are disjoint.
- Returns
self
NOTE: The assertion works in place.
Example:
import pathway as pw
t1 = pw.debug.parse_to_table('''
| age | owner | pet
1 | 10 | Alice | 1
2 | 9 | Bob | 1
3 | 8 | Alice | 2
''')
t2 = pw.debug.parse_to_table('''
| age | owner | pet
11 | 11 | Alice | 30
12 | 12 | Tom | 40
''').promise_universes_are_disjoint(t1)
t3 = t1.concat(t2)
pw.debug.compute_and_print(t3, include_id=False)