pathway.stdlib.utils package

Submodules

pathway.stdlib.utils.bucketing module

pathway.stdlib.utils.col module


pathway.stdlib.utils.col.apply_all_rows(*cols, fun, result_col_name)

Applies a function to all the data in selected columns at once, returning a single column. This transformer is meant to be run infrequently on relatively small tables.

Input:

  • cols: list of columns to which the function will be applied
  • fun: function taking the columns as lists and returning a corresponding list of outputs
  • result_col_name: name of the output column

Output:

  • Table indexed with the original indices, with a single column named by the “result_col_name” argument that contains the results of applying fun
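
The shape expected of fun can be illustrated in plain Python. The sketch below is only a model of the description above, not Pathway code: add_total_sum is a hypothetical function and the column contents are made-up lists. It receives one list per selected column and returns one list with an entry for each input row.

```python
def add_total_sum(col_a, col_b):
    """Hypothetical `fun`: sees whole columns at once, returns one value per row."""
    total = sum(col_a) + sum(col_b)
    return [a + total for a in col_a]


# Made-up column contents; in Pathway these would come from table columns.
col_a = [1, 2, 3]
col_b = [10, 20, 30]

result = add_total_sum(col_a, col_b)
print(result)  # [67, 68, 69]
```

Going by the signature above, such a function would be passed as the keyword argument fun, with the columns given positionally and the output name given as result_col_name.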

pathway.stdlib.utils.col.flatten_column(column, origin_id=pathway.this.origin_id)

Flattens a column of a table.

Input:

  • column: column expression of the column to be flattened
  • origin_id: name of the output column in which to store the ids of the input rows

Output:

  • Table with columns: colname_to_flatten and origin_id (if not None)
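
As a rough illustration of what flattening means here, the following plain-Python sketch models the behaviour described above. It does not call Pathway: flatten_rows, the rows dictionary, and the "pets" column are all hypothetical names chosen for the example, and a table is modelled as a mapping from row id to row.

```python
def flatten_rows(rows, column, origin_id="origin_id"):
    """Plain-Python model: emit one output row per element of rows[id][column]."""
    flattened = []
    for row_id, row in rows.items():
        for element in row[column]:
            out = {column: element}
            if origin_id is not None:
                # Remember which input row this element came from.
                out[origin_id] = row_id
            flattened.append(out)
    return flattened


# Made-up table: row id -> row contents.
rows = {0: {"pets": ["dog", "cat"]}, 1: {"pets": ["fish"]}}

flat = flatten_rows(rows, "pets")
print(flat)
# [{'pets': 'dog', 'origin_id': 0}, {'pets': 'cat', 'origin_id': 0}, {'pets': 'fish', 'origin_id': 1}]
```

Passing origin_id=None in this model drops the extra column, mirroring the "(if not None)" note in the output description.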

pathway.stdlib.utils.col.groupby_reduce_majority(column_group, column_val)

Finds a majority in column_val for every group in column_group.

Workaround for missing majority reducer.
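
The idea of a per-group majority can be sketched in plain Python. This is a model under assumptions, not the Pathway implementation: majority_per_group is a hypothetical helper, "majority" is read as the most frequent value in the group, and ties are broken by first occurrence, which the documentation above does not specify.

```python
from collections import Counter, defaultdict


def majority_per_group(groups, values):
    """Return the most frequent value for each group (ties: first seen wins)."""
    counts = defaultdict(Counter)
    for group, value in zip(groups, values):
        counts[group][value] += 1
    return {group: counter.most_common(1)[0][0] for group, counter in counts.items()}


# Made-up data: one entry per row.
groups = ["a", "a", "a", "b", "b"]
values = [1, 1, 2, 3, 3]

majority = majority_per_group(groups, values)
print(majority)  # {'a': 1, 'b': 3}
```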


pathway.stdlib.utils.col.multiapply_all_rows(*cols, fun, result_col_names)

Applies a function to all the data in selected columns at once, returning multiple columns. This transformer is meant to be run infrequently on relatively small tables.

Input:

  • cols: list of columns to which the function will be applied
  • fun: function taking the columns as lists and returning a corresponding list of outputs
  • result_col_names: names of the output columns

Output:

  • Table indexed with the original indices, with columns named by the “result_col_names” argument that contain the results of applying fun
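
A plain-Python sketch of a multi-output fun follows. It is a model only and makes one assumption the text above does not spell out: that fun returns one list per entry in result_col_names, each with one value per input row. The function add_total_sum_multi and the names res_a and res_b are hypothetical.

```python
def add_total_sum_multi(col_a, col_b):
    """Hypothetical `fun`: returns one output list per result column (assumed shape)."""
    total = sum(col_a) + sum(col_b)
    return [a + total for a in col_a], [b + total for b in col_b]


# Made-up column contents.
col_a = [1, 2, 3]
col_b = [10, 20, 30]
result_col_names = ["res_a", "res_b"]

# Pair each returned list with the output name it would be stored under.
outputs = dict(zip(result_col_names, add_total_sum_multi(col_a, col_b)))
print(outputs)  # {'res_a': [67, 68, 69], 'res_b': [76, 86, 96]}
```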

pathway.stdlib.utils.col.unpack_col(column, *args)

Unpacks multiple columns from a single column.

Input:

  • column: Column expression of column containing some sequences
  • names: names of the output columns, passed as the remaining positional arguments (*args in the signature)

Output:

  • Table with columns named by “names” argument
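
What unpacking amounts to can be shown with a small plain-Python model. It is not Pathway code: unpack_rows and the sample data are hypothetical, and each row of the input column is assumed to hold a sequence with one element per output name.

```python
def unpack_rows(sequences, names):
    """Plain-Python model: turn each sequence into a row keyed by the given names."""
    # zip() stops at the shorter input, so extra elements or names are ignored here.
    return [dict(zip(names, sequence)) for sequence in sequences]


# Made-up column of sequences, one per row.
data = [(1, "Alice"), (2, "Bob")]

unpacked = unpack_rows(data, ["id", "name"])
print(unpacked)  # [{'id': 1, 'name': 'Alice'}, {'id': 2, 'name': 'Bob'}]
```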

pathway.stdlib.utils.filtering module

pathway.stdlib.utils.pandas_transformer module


pathway.stdlib.utils.pandas_transformer.pandas_transformer(output_schema, output_universe=None)

Decorator that turns a Python function operating on pandas.DataFrame into a Pathway transformer.

Input universes are converted into input DataFrame indexes. The resulting index is treated as the output universe, so it must maintain uniqueness and be of integer type.

  • Parameters
    • output_schema (Type[Schema]) – Schema of a resulting table.
    • output_universe (Union[str, int, None]) – Index or name of an argument whose universe will be used in the resulting table. Defaults to None.
  • Returns
    Transformer that can be applied on Pathway tables.
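
The requirement that the resulting index be unique and of integer type can be checked before returning a DataFrame from the decorated function. The sketch below models the index as a plain list of labels rather than a real pandas index, so it stays dependency-free; is_valid_output_index is a hypothetical helper, and it does not cover NumPy integer types that an actual pandas index would hold.

```python
def is_valid_output_index(index_labels):
    """Return True if every label is a Python int (bools excluded) and all are unique."""
    labels = list(index_labels)
    all_ints = all(
        isinstance(label, int) and not isinstance(label, bool) for label in labels
    )
    # Uniqueness is only checked once the labels are known to be integers.
    return all_ints and len(set(labels)) == len(labels)


print(is_valid_output_index([0, 1, 2]))    # True
print(is_valid_output_index([0, 0, 1]))    # False: duplicate label
print(is_valid_output_index(["a", "b"]))   # False: not integers
```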