Troubleshooting

This page provides a guide to common issues encountered when using a Pathway.

As with any powerful tool, there is a learning curve to using Pathway effectively. This guide will outline some of the most common problems developers encounter when working with Pathway, along with tips and best practices for avoiding these pitfalls. Whether you are new to Pathway or an experienced user, this guide will help you to optimize your workflow and avoid common mistakes. By following these guidelines, you can save yourself time and frustration and get the most out of Pathway.

So let's get started!

Package versioning

If a module is missing or you encounter an issue reproducing the examples displayed on the website, you will likely need the latest version of Pathway. The best solution is to reinstall it.

⚠️ Pathway requires Python 3.10 or higher and runs on Linux and MacOS.

Possible error messages:

  • This is not the real Pathway package.
  • ModuleNotFoundError: No module named 'pathway.stdlib'

Solution:

  • Reinstall Pathway by uninstalling it first:
pip uninstall pathway && pip install pathway
  • If the above solution does not work, try installing Pathway with --force-reinstall:
pip install --force-reinstall pathway

You can access your Pathway version using pw.__version_ or pathway --version in the CLI.

⚠️ Windows is currently not supported.

Docker on MacOS

When using docker on MacOS, make sure to have a linux/x86_64 platform:

  • FROM --platform:linux/x86_64 python:3.10

Nothing happens / missing output

You launch your Pathway application, and nothing happens: no outputs, no errors. The application terminates without error, but the expected output, whether stdout or a CSV file, is empty.

Your application likely builds the computation graph, but doesn't launch a computation using it. You need to trigger the computation:

  • streaming mode: use pw.run() (in addition to the use of output connectors),
  • static mode: print your table using pw.debug.compute_and_print.

Explanation:

  • Pathway's operators are used to build a pipeline modeled by a computation graph. The data is ingested only when the computation is started. In the streaming mode, the computation is launched using pw.run(), while in the static mode, the computation is triggered at each output connector call. See our article about streaming and static modes for more details.

Different universes

Error message:

  • ValueError: universes do not match

Explanation:

  • The error comes from the combination of two tables with different universes. The universe is the set of indexes of each table. Some operations, such as update_cells or update_rows, require the universes to be identical, or at least one should be a subset of the other. You can read our article about universes. Pathway will raise an error when it is impossible to infer whether two tables have the same universe. You can manually assert that the two universes are compatible with unsafe_promise_same_universe_as or unsafe_promise_universe_is_subset_of.

Solutions:

  • You can force the given operation by giving a manual guarantee that the universe will be the same:
T1=T1.unsafe_promise_same_universe_as(T2)
# OR
T1=T1.unsafe_promise_universe_is_subset_of(T2)

Type errors

Datetime

Possible error message:

  • TypeError: argument 'values': unsupported value type: Timestamp

Solution:

  • Cast to a regular timestamp (int) using datetime.timestamp

Explanation:

  • Pathway does not currently support datetime.datetime, and using those should result in such TypeError. You can cast the datetime.datetime to integer using a pw.apply with datetime.timestamp.

Still blocked?

We hope this guide has helped identify and avoid common mistakes when using Pathway. Our team is always happy to help you find a solution to your problem and ensure that you get the most out of Pathway. If you have any questions or encounter issues not covered in this guide, don't hesitate to get in touch with us on our Discord channel.

Olivier Ruas

Algorithm and Data Processing Magician