RAG pipeline with up-to-date knowledge: get answers based on documents stored in S3
This example implements a simple pipeline that answers questions based on documents stored in S3.
Each query is first turned into a vector using the OpenAI embedding service; then the relevant documentation pages are retrieved using a nearest-neighbor index computed over the documents in the corpus. A prompt is built from the retrieved documentation pages and sent to the OpenAI chat service for processing.
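The flow above can be sketched in plain Python. This is an illustrative toy, not the actual pipeline code: hand-made vectors and a brute-force search stand in for the OpenAI embedding service and the nearest-neighbor index maintained by the pipeline.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_documents(query_vec, corpus, k=2):
    # corpus: list of (text, vector) pairs; return the k most similar texts.
    ranked = sorted(corpus, key=lambda d: cosine_similarity(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(query, docs):
    # Concatenate retrieved pages as context for the chat model.
    context = "\n".join(docs)
    return f"Given the following documents:\n{context}\nAnswer this query: {query}"

# Toy corpus with hand-made 3-dimensional "embeddings".
corpus = [
    ("Kafka connector docs", [1.0, 0.1, 0.0]),
    ("S3 connector docs", [0.0, 1.0, 0.2]),
    ("Debezium connector docs", [0.9, 0.2, 0.1]),
]
docs = nearest_documents([1.0, 0.0, 0.0], corpus, k=2)
prompt = build_prompt("How to connect to Kafka in Pathway?", docs)
```

In the real pipeline, the index is kept up to date as documents in S3 change, so the same retrieval step always runs against fresh data.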
How to run the project
Set up the environment:
Set your env variables in the .env file placed in this directory.
OPENAI_API_KEY=sk-...
PATHWAY_PERSISTENT_STORAGE= # Set this variable if you want to use caching
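For illustration, the app's configuration could be read along these lines (the variable names come from the .env above; `load_settings` is a hypothetical helper, not part of the app):

```python
import os

def load_settings(env=os.environ):
    # OPENAI_API_KEY is required; PATHWAY_PERSISTENT_STORAGE is optional
    # and, when set, enables caching (hypothetical helper for illustration).
    api_key = env.get("OPENAI_API_KEY")
    if api_key is None:
        raise RuntimeError("OPENAI_API_KEY is not set")
    cache_dir = env.get("PATHWAY_PERSISTENT_STORAGE")
    return {"api_key": api_key, "cache_dir": cache_dir, "cache_enabled": cache_dir is not None}

settings = load_settings({"OPENAI_API_KEY": "sk-demo"})
```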
Run with Docker
To run the pipeline together with a simple UI, execute:
docker compose up --build
The UI will then be available at http://0.0.0.0:8501 by default; open this URL in your web browser to access it.
Run manually
Alternatively, you can run each service separately.
Make sure you have installed the dependencies with Poetry:
poetry install --with examples
Then run:
poetry run python app.py
If you manage all dependencies manually rather than with Poetry, you can instead use:
python app.py
To run the Streamlit UI, run:
streamlit run ui/server.py --server.port 8501 --server.address 0.0.0.0
Querying the pipeline
To query the pipeline, you can call the REST API:
curl --data '{
"user": "user",
"query": "How to connect to Kafka in Pathway?"
}' http://localhost:8080/ | jq
or access the Streamlit UI at http://0.0.0.0:8501.
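The same query can also be sent from Python. The endpoint and payload shape follow the curl example above; `build_payload` and `query_pipeline` are illustrative helpers, not part of the app:

```python
import json
from urllib import request

def build_payload(query, user="user"):
    # JSON body matching the curl example: {"user": ..., "query": ...}
    return json.dumps({"user": user, "query": query})

def query_pipeline(query, user="user", url="http://localhost:8080/"):
    # POST the payload to the pipeline's REST endpoint (requires a running server).
    req = request.Request(
        url,
        data=build_payload(query, user).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

payload = build_payload("How to connect to Kafka in Pathway?")
```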