Build Your Own AI Slide Search Pipeline

By the end of this bootcamp, you will be equipped to create, deploy, and manage complex AI pipelines just like the demo available here: 🔗 Slides AI Search Demo

The Pathway Slide Search solution was showcased on Intel Tiber Cloud during the Intel AI Summit in Paris.

GitHub Repository

📂 Slides AI Search Repository

You will use this repository to set up and run the demo locally or on the cloud. The repo provides all the necessary code and configurations to get started.

Features of the AI Slide Search App

  • 💡 Instant Search: Retrieve relevant slides in seconds.
  • Real-Time Updates: The index updates as soon as files are added, removed, or modified.
  • 📂 Supports Multiple Data Sources: Connect with local folders, SharePoint, Google Drive, and more.
  • 🔍 Advanced Metadata Extraction: Utilize Pathway's vision-language models to process PDF and PowerPoint slides.
  • 🛠️ Flexible and Customizable: Modify schema to fit your needs.
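
To make the real-time update behavior concrete, here is a minimal sketch of incremental indexing in plain Python. It is not Pathway's API: the `SlideIndex` class, its `sync` method, and the `{path: text}` snapshot format are hypothetical names chosen for illustration. The idea is to compare a content fingerprint per file so that only added or modified files are re-indexed and deleted files are dropped.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(text: str) -> str:
    """Content hash used to detect whether a file changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class SlideIndex:
    """Toy in-memory index keyed by file path (hypothetical, not Pathway's API)."""

    entries: dict = field(default_factory=dict)       # path -> indexed text
    fingerprints: dict = field(default_factory=dict)  # path -> content hash

    def sync(self, snapshot: dict) -> dict:
        """Bring the index in line with a {path: text} snapshot of the source."""
        changes = {"added": [], "modified": [], "removed": []}

        for path, text in snapshot.items():
            fp = fingerprint(text)
            if path not in self.fingerprints:
                changes["added"].append(path)
            elif self.fingerprints[path] != fp:
                changes["modified"].append(path)
            else:
                continue  # unchanged file: no re-indexing needed
            self.entries[path] = text
            self.fingerprints[path] = fp

        # Anything indexed earlier but missing from the snapshot was deleted.
        for path in list(self.entries):
            if path not in snapshot:
                changes["removed"].append(path)
                del self.entries[path]
                del self.fingerprints[path]

        return changes


index = SlideIndex()
index.sync({"a.pdf": "intro", "b.pptx": "roadmap"})  # both files are new
print(index.sync({"a.pdf": "intro v2"}))             # a.pdf modified, b.pptx removed
```

A streaming framework reacts to change events rather than polling snapshots, but the bookkeeping (add, update, delete per document) follows the same pattern.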

Architecture Overview

This app template uses Pathway to:

  1. Ingest and Parse Data: Extract content from PDF and PowerPoint files using Pathway’s SlideParser.
  2. Generate Embeddings: Use OpenAI’s text-embedding-ada-002 or a local embedding model to index slide content.
  3. Store Embeddings in a Vector Store: Store indexed data locally for fast retrieval.
  4. Serve a UI for Search Queries: Use a Streamlit UI to interact with the search pipeline.
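
The four steps above can be sketched end to end in plain Python. This is only an illustration of the data flow, under stated assumptions: it does not use Pathway, the function names (`parse_slides`, `embed`, `build_store`, `search`) are hypothetical, and a bag-of-words vector stands in for a real embedding model such as text-embedding-ada-002. The UI step is reduced to a single `search` call that a Streamlit front end could wrap.

```python
import math
import re
from collections import Counter


def parse_slides(decks: dict) -> list:
    """Step 1 (stand-in): flatten {deck name: [slide text, ...]} into slide records."""
    return [
        {"deck": name, "slide": number, "text": text}
        for name, slides in decks.items()
        for number, text in enumerate(slides, start=1)
    ]


def embed(text: str) -> Counter:
    """Step 2 (stand-in): bag-of-words counts instead of a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_store(records: list) -> list:
    """Step 3: keep (vector, record) pairs in a local in-memory list."""
    return [(embed(record["text"]), record) for record in records]


def search(store: list, query: str, k: int = 3) -> list:
    """Step 4: rank slides by similarity to the query and return the top k."""
    query_vector = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[0]), reverse=True)
    return [record for _, record in ranked[:k]]


decks = {
    "q3.pptx": ["Revenue growth by region", "Hiring plan for engineering"],
    "intro.pdf": ["Company mission and values"],
}
store = build_store(parse_slides(decks))
for hit in search(store, "engineering hiring", k=1):
    print(hit["deck"], hit["slide"], hit["text"])
```

Swapping `embed` for a real embedding model and the list for a vector index changes the quality and scale of retrieval, not the overall shape of the pipeline.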

Next Steps:

  1. Complete the Hands-On Development Module to learn how to create these AI pipelines.
  2. Explore Integrating RAG Pipelines with Local LLMs for cost-effective and private AI solutions.