Pathway is now in Open Beta

Zuzanna Stamirowska · December 5, 2022

Pathway - the stream processing framework which takes care of data updates for you - announces a $4.5M funding round, and opens to all developers. You can try it out in a cloud notebook directly from your browser, or run it on a local Linux machine.

Why should you use Pathway?

Pathway is a programming framework which allows you to work with streaming data as if you were working with static data, in batch mode.

Have you ever tried to make sense of streaming data? If so, there is a high chance that you encountered at least one of these issues:

  • Data updates keep arriving, and every one of them has to be taken care of
  • The same logic has to handle both real-time and historical data
  • Debugging is a nightmare, because how can you debug something against unknown data?
  • Applying proper Machine Learning on top of streaming data is harder still, yet it is what yields the business insights needed for key decision-making

If you have been up against these problems, you are in the right place. At Pathway, we are building a programming framework that quietly takes care of data updates for you.

It gives you:

  • A native real-time approach. Every task is either real-time streaming or streaming with historical data (backfilling): no need for batch, no hacks required.
  • Reactivity. Results are revisited and updated automatically whenever new data arrives.
  • Full power of Python (to make all your ML dreams come true) with an extra SQL syntax layer coming soon (to make sure all data engineers are happy with their Pathway pipelines, too).

In the design of Pathway's streaming engine, we opted for ease-of-use and scalability.

In Machine Learning, the key to success of a programming framework is how to combine usability with scalability. This was the axis of competition between Google's TensorFlow and Facebook's PyTorch during the deep learning revolution. Today, Pathway has taken into account the lessons learned during this battle of giants, and embedded them in the compiler of its real-time data processing framework.

Lukasz Kaiser
Co-author of TensorFlow and co-inventor of Transformers, now at OpenAI, and an angel investor in Pathway.

What does all this mean in practice, for a developer?

That you can write as if you were writing a batch data processing pipeline (well, it's actually a little more than that, as we support loops and iteration!) and have it run on streaming data.
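To make the idea concrete, here is a minimal sketch in plain Python of what "taking care of data updates" involves. It does not use the Pathway API and is not how Pathway is implemented; the names `batch_totals` and `StreamingTotals`, and the `(key, value, diff)` update format, are assumptions made up for this illustration. The batch function is the logic you would naturally write; the streaming class is the hand-written bookkeeping you would otherwise need to keep the same result up to date as rows are inserted and retracted.

```python
from collections import defaultdict


def batch_totals(rows):
    """Batch logic: per-key totals computed from a complete, static dataset."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


class StreamingTotals:
    """Hand-written streaming counterpart (hypothetical, for illustration).

    Each update is (key, value, diff): diff=+1 inserts a row,
    diff=-1 retracts a row that was inserted earlier.
    """

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)

    def apply(self, key, value, diff):
        self.totals[key] += diff * value
        self.counts[key] += diff
        if self.counts[key] == 0:
            # No rows left for this key, so it disappears from the result.
            del self.totals[key]
            del self.counts[key]

    def snapshot(self):
        return dict(self.totals)


# A stream of updates: three inserts, then one row is retracted.
events = [("a", 1, +1), ("b", 5, +1), ("a", 2, +1), ("b", 5, -1)]

stream = StreamingTotals()
for key, value, diff in events:
    stream.apply(key, value, diff)

# The rows that remain once the retraction has been applied.
remaining_rows = [("a", 1), ("a", 2)]
print(stream.snapshot() == batch_totals(remaining_rows))
```

The point of the sketch is the equivalence: after any sequence of updates, the streaming state matches what the batch logic would compute on the data that is currently valid. Writing only the batch-style version and having the update handling derived for you is the promise described above.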

For a start, check out this simple example of classification of handwritten digits. All of it is captured by the code below. We approach this task with a classifier from Pathway's standard library, in this case, k-Nearest-Neighbors (read more on Wikipedia). Pathway builds up the corresponding control flow graph, and updates it in streaming mode.

Using Pathway means that all Machine Learning outcomes are updated as the models learn from new samples and improve over time. Classification decisions for previously tested elements are also revisited and updated whenever new data changes them. Such an approach is called reactive processing of streaming data. If you would like to learn more about this topic, we explain it in detail in this recent video talk.
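As a rough illustration of reactive classification, the sketch below implements a toy k-Nearest-Neighbors classifier in plain Python that revisits earlier decisions each time a new labeled sample arrives. It is not the classifier from Pathway's standard library and does not use the Pathway API; the class `ReactiveKNN` and its methods are hypothetical names chosen for this example, and it recomputes every query naively rather than incrementally.

```python
import math
from collections import Counter


class ReactiveKNN:
    """Toy kNN that re-evaluates registered queries on every new sample."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []   # list of (point, label) pairs seen so far
        self.queries = {}   # query id -> point to classify
        self.answers = {}   # query id -> current predicted label

    def _classify(self, point):
        if not self.samples:
            return None
        # Take the k labeled samples closest to the point (Euclidean distance).
        nearest = sorted(self.samples, key=lambda s: math.dist(s[0], point))
        votes = Counter(label for _, label in nearest[: self.k])
        return votes.most_common(1)[0][0]

    def add_query(self, qid, point):
        """Register a point to classify and return its current label."""
        self.queries[qid] = point
        self.answers[qid] = self._classify(point)
        return self.answers[qid]

    def add_sample(self, point, label):
        """Ingest one labeled sample; return {qid: (old, new)} for changed answers."""
        self.samples.append((point, label))
        changed = {}
        for qid, qpoint in self.queries.items():
            new_label = self._classify(qpoint)
            if new_label != self.answers[qid]:
                changed[qid] = (self.answers[qid], new_label)
                self.answers[qid] = new_label
        return changed


knn = ReactiveKNN(k=1)
knn.add_sample((0.0, 0.0), "a")
first = knn.add_query("q1", (1.0, 1.0))    # only one sample exists, so "a"
updates = knn.add_sample((1.0, 0.9), "b")  # a closer sample arrives; q1 is revisited
print(first, updates)
```

Here the answer for `q1` is first `"a"` and then changes to `"b"` once a nearer sample shows up, which is the behavior the paragraph above calls reactive: earlier outputs are corrected as the model learns, without the caller asking again.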

You will find many more examples in our Documentation, and the full examples pack is available at https://github.com/pathwaycom/pathway-examples.

All of this is now open for you to play with, test, and have fun. You can even run it in a cloud notebook from your browser, unless you prefer to pip install it directly onto your own Linux machine.

If you have feedback on Pathway, or streaming use cases that are keeping you up at night, we would love to hear about them. Join us on Discord or drop us a line!