pathway.stdlib.utils package
Submodules
pathway.stdlib.utils.bucketing module
pathway.stdlib.utils.col module
pathway.stdlib.utils.col.apply_all_rows(*cols, fun, result_col_name)
Applies a function to all the data in selected columns at once, returning a single column. This transformer is meant to be run infrequently on a relativelly small tables.
Input:
- cols: list of columns to which function will be applied
- fun: function taking lists of columns and returning a corresponding list of outputs.
- result_col_name: name of the output column
Output:
- Table indexed with original indices with a single column named by “result_col_name” argument containing results of the apply
- Return type
Table
pathway.stdlib.utils.col.flatten_column(column, origin_id=ColumnReference|<class 'pathway.internals.thisclass.this'>|.origin_id)
Flattens a column of a table.
Input:
- column: Column expression of column to be flattened
- origin_id: name of output column where to store id’s of input rows
Output:
- Table with columns: colname_to_flatten and origin_id (if not None)
- Return type
Table
pathway.stdlib.utils.col.groupby_reduce_majority(column_group, column_val)
Finds a majority in column_val for every group in column_group.
Workaround for missing majority reducer.
pathway.stdlib.utils.col.multiapply_all_rows(*cols, fun, result_col_names)
Applies a function to all the data in selected columns at once, returning multiple columns. This transformer is meant to be run infrequently on a relativelly small tables.
Input:
- cols: list of columns to which function will be applied
- fun: function taking lists of columns and returning a corresponding list of outputs.
- result_col_names: names of the output columns
Output:
- Table indexed with original indices with columns named by “result_col_names” argument containing results of the apply
- Return type
Table
pathway.stdlib.utils.col.unpack_col(column, *args)
Unpacks multiple columns from a single column.
Input:
- column: Column expression of column containing some sequences
- names: list of names of output columns
Output:
- Table with columns named by “names” argument
- Return type
Table
pathway.stdlib.utils.filtering module
pathway.stdlib.utils.pandas_transformer module
pathway.stdlib.utils.pandas_transformer.pandas_transformer(output_schema, output_universe=None)
Decorator that turns python function operating on pandas.DataFrame into pathway transformer.
Input universes are converted into input DataFrame indexes. The resulting index is treated as the output universe, so it must maintain uniqueness and be of integer type.
- Parameters
- output_schema (
Type
[Schema
]) – Schema of a resulting table. - output_universe (
Union
str
,int
,None
) – Index or name of an argument whose universe will be used - None. (in resulting table. Defaults to) –
- output_schema (
- Returns
Transformer that can be applied on pathawy tables.