RAG pipelines powered by the freshest knowledge

For your documents & live data
Pathway RAG - indexer flow
Spin up your all-inclusive RAG pipelines in minutes. One containerized service, no infrastructure dependencies.
  • High accuracy knowledge retrieval
  • Live synchronization with data sources
  • Unstructured document support (PDF, DOC, ...)
  • Fast built-in vector indexing up to millions of documents*
Try out RAG demo
* Contact us for help with setting up enterprise use cases at scale.
Fully customizable in Python
Deploys with Kubernetes
Runs on cloud, runs in a data center, runs in a Faraday cage
Pathway’s hosted pipelines integrated seamlessly with our SharePoint environment and empowered three of our teams to rapidly independently build Gen AI apps.

Ari Bajo Rouvinen - Data Engineer
Backed by Enterprise Security & Authentication
  • Host on your cloud or on-premise
  • Secure by Design
  • Granular Access Management
  • Compliance-Ready

Try it out

Interact with demo pipelines

DEMO: Uploaded files will be visible to the public. Data processing is subject to our privacy policy.
Realtime Document Indexing with Pathway

This is a basic service for a real-time document indexing pipeline powered by Pathway.

The capabilities of the service include:

  • Real-time document indexing from Microsoft 365 SharePoint
  • Real-time document indexing from Google Drive
  • Similarity search by user query
  • Filtering by metadata using a condition given in JMESPath format
  • Basic stats on the indexer's health
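
As a rough illustration of how a similarity search with a metadata filter might be requested, the sketch below only builds the JSON body of such a request and does not send it. The endpoint path `/v1/retrieve`, the field names `query`, `k`, and `metadata_filter`, and the `path` metadata key are assumptions made for this example; check the API of the service you actually deploy before relying on them.

```python
import json

# Assumed endpoint path; not confirmed by this page.
RETRIEVE_PATH = "/v1/retrieve"


def build_retrieve_request(query, k=3, metadata_filter=None):
    """Build a JSON-serializable body for a similarity search request.

    `metadata_filter` is a JMESPath expression evaluated against each
    document's metadata. All field names here are assumptions.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if k < 1:
        raise ValueError("k must be at least 1")
    body = {"query": query, "k": k}
    if metadata_filter is not None:
        body["metadata_filter"] = metadata_filter
    return body


# Ask for the 2 closest chunks among files whose (hypothetical) `path`
# metadata field contains the substring "reports".
payload = build_retrieve_request(
    "quarterly revenue summary",
    k=2,
    metadata_filter="contains(path, 'reports')",
)
print(RETRIEVE_PATH, json.dumps(payload))
```

The filter is left out of the body when none is given, so the same helper covers unfiltered searches.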

Supported document formats include plaintext, PDF, DOCX, and HTML. For the complete list, please refer to the supported formats of the unstructured library. In addition, this pipeline handles data removals: when you delete a file, similarity search results stop reflecting its contents within a few seconds.

Please also keep in mind the following constraints and limitations:

  • The maximum supported file size is 4 MB, and at most 100 KB of plaintext may be obtained from a file after parsing. Anything larger is ignored by the indexer
  • The files in the shared spaces are removed within 15 minutes after their addition
  • You hold responsibility for the contents of the files you upload
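
The size limits above can be checked on the client side before uploading. This is a minimal sketch, not the indexer's own logic: it assumes binary units (1 MB = 1024 * 1024 bytes, 1 KB = 1024 bytes) and measures the parsed text as UTF-8 bytes, neither of which is stated on this page. The function name `is_indexable` is hypothetical.

```python
# Assumed interpretation of the documented limits (binary units).
MAX_FILE_BYTES = 4 * 1024 * 1024   # 4 MB raw file size
MAX_TEXT_BYTES = 100 * 1024        # 100 KB of parsed plaintext


def is_indexable(file_size_bytes, parsed_text):
    """Return True if a file would fit within both documented limits."""
    if file_size_bytes > MAX_FILE_BYTES:
        return False
    # Measure the parsed text in UTF-8 bytes (an assumption).
    return len(parsed_text.encode("utf-8")) <= MAX_TEXT_BYTES


print(is_indexable(1_000_000, "short document"))      # small file, small text
print(is_indexable(5 * 1024 * 1024, "short document"))  # raw file too large
print(is_indexable(1_000_000, "x" * (200 * 1024)))    # parsed text too large
```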