The Easiest Solution for Python Stream Processing, Data Indexing, and Real-Time AI Analytics.
- Simplicity at all stages: Installation, integration, development, and deployment.
- Low latency: Pathway is the fastest data processing framework, thanks to its powerful Rust engine.
- Batch and stream processing alike: Pathway handles batch and streaming data for you in the same way.
It's simple. From installation to deployment.
With its full Python compatibility, Pathway is easy to use, from the installation to the maintenance.
- Python native: Pathway is a Python framework, and as such it is compatible with the whole Python ecosystem. It will integrate perfectly into your Python architecture and will allow you to use your favorite libraries.
- Installation: you can install Pathway with a simple
pip install pathway
. - Many data sources: Pathway provides a multitude of connectors to access your favorite data sources. You can also set up your own.
- Transformation and Machine Learning: you can easily design your data pipeline using Pathway transformations. You can define your own UDFs, use any Python library, and integrate Machine Learning models.
- Many destinations: Pathway provides output connectors to send the results to the destination you want. You can also create your own.
- RAG and LLM-ready: Pathway provides most of the common utilities to develop your LLM applications and RAG. This includes complete AI pipelines with structured and unstructured data ingestion, chunking, and indexing.
- Data indexing: Pathway offers real-time data indexes (vector search, full text search, and more) allowing you to effortlessly synchronize your index with data sources in real time. Don't bother installing a dedicated vector store, Pathway has got it covered for you!
It's fast, scalable, and safe.
- A powerful Rust engine: Pathway is not bound to Python limits as it relies on a powerful Rust engine. The engine ensures that the computations are fast.
- Scalable: thanks to the Rust engine, Pathway provides multi-threading, multi-processing and distributed computations. You can easily deploy your Pathway pipeline in the cloud.
- Differential Dataflow and incremental computations: Pathway's engine incrementally processes data updates. Results are computed using the minimum work needed, ensuring high latency.
- Stateful operations: Pathway supports stateful operations such as groupby and windows.
- Persistence: you can save the state of the ongoing computation, be it for updating your pipeline or for recovery.
It takes the pain out of temporal & event data
- Batch and stream processing alike: Pathway does both batch and stream processing. No matter your use case, Pathway is a good fit.
- Same syntax: Your pipeline can run on both batch and streaming data, without modifying your code.
- Same engine: Pathway unified Rust engine makes your computation fast and scalable, no matter if you choose batch or stream processing.
- Consistent results: for stream processing, Pathway returns an output in real-time, which is what you would have if you were processing the received data using batch processing.
- Streaming complexity is hidden: All the challenges of stream processing, such as handling late and out-of-order data, are handled by the engine and hidden from the user.
- Time-related operations: Pathway offers advanced temporal operations such as as-of-join and temporal windows.
Python + Rust: the best of both worlds
Pathway efficiently associates the convenience of Python with the power of Rust.
Python makes everything easy
Pathway is a fully Python-compatible framework.
You can install it with a simple pip install pathway
and import it as any Python library.
Pathway provides a Python interface and experience created with data developers in mind.
You can easily build pipelines by manipulating Pathway tables and rely on the vast resources and libraries of the Python ecosystem.
Also, Pathway can seamlessly be integrated into your CI/CD chain as it is inherently compatible with popular tools such as mypy or pytest.
Your Pathway pipelines can be automatically tested, built, and deployed like any other Python workflow.
Pathway can be easily deployed in any container-based method (docker, Kubernetes) supporting the deployment of Python-based projects.
Rust makes your pipeline fast and scalable
Pathway relies on a powerful Rust engine to ensure high performance for your pipelines, no matter if you are dealing with batch or streaming data. Pathway engine makes the utmost of Rust speed and memory safety to provide efficient parallel and distributed processing without being limited by Python's GIL.
Pathway engine is based on Differential dataflow, a computational framework known for its efficiency to process large volumes of data. Its incremental computations make it able to quickly process data updates. This means that the minimum work needed by any algorithm or transformation is performed to refresh its results when fresh data arrives.
A unified framework to end the debate between batch and stream processing
Batch processing and stream processing are seen as two distinct approaches to handling data.
Pathway is a unified data processing framework that allows you to use the same code for batch and streaming. All the complexity, including late data and consistency, are automatically handled and hidden from the user. Pathway provides advanced streaming operations, such as temporal windows, while keeping the simplicity of batch processing.
With Pathway, you don't have to choose between batch and stream processing. You can make your pipeline and focus on the data transformation you want to do. The resulting pipeline will work with both batch and stream processing. Not having to distinguish between batch and stream --and use different tools for them-- highly simplifies your architecture (bye-bye Lambda architecture) and the development of your pipeline.
What can it be used for?
With its unified engine and full Python compatibility, Pathway makes data processing as easy as possible. It's the ideal solution for a wide range of data processing pipelines, including:
- Real-time analytics on IoT and event data.
- AI RAG pipelines at scale.
- Real-time Document Indexing.
- ETL on unstructured data.
Learn more about the real-world applications of Pathway on our solutions page and our success stories page.