Data Sources YAML Examples

YAML configuration files let you specify the data sources that are read and indexed for RAG. Because the data sources are usually passed to a DocumentStore, each resulting table must contain a column named data of type bytes. The data sources are typically defined as a list of connectors in a variable named $sources (note the leading $: it lets other components in the YAML reference this variable).

$sources:
  - !pw.io.fs.read
    path: data
    format: binary
    with_metadata: true
  - !pw.io.csv.read
    path: csv_files
    with_metadata: false

For each connector, specify every parameter it requires. The Pathway connectors documentation lists all available connectors, explains how they work, and describes the parameters they accept.
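
To illustrate how another component can reference the variable, here is a minimal sketch of a DocumentStore entry that takes $sources as its docs argument. The $retriever_factory variable is an assumption: it stands for a retriever factory defined elsewhere in the same file, and a real configuration will likely set further options such as a parser or a splitter.

```yaml
# Sketch only: $retriever_factory is assumed to be defined elsewhere in this file.
$document_store: !pw.xpacks.llm.document_store.DocumentStore
  docs: $sources                          # the list of connectors defined above
  retriever_factory: $retriever_factory
```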

File System

Read data from your file system.
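
A minimal sketch, assuming PDF files stored in a local folder named data (the folder name and the glob pattern are placeholders). Reading with format: binary produces the bytes data column that a DocumentStore expects.

```yaml
$sources:
  - !pw.io.fs.read
    path: "data/*.pdf"     # placeholder glob; a plain folder path such as data also works
    format: binary         # file contents are read as raw bytes
    mode: streaming        # keep watching for new or changed files
    with_metadata: true    # also expose file metadata alongside the data column
```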

SharePoint

Read your data directly from SharePoint.
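
A hedged sketch of a SharePoint source using pw.xpacks.connectors.sharepoint.read with certificate-based authentication. Every value below (site URL, tenant, client ID, certificate path, thumbprint, root path) is a hypothetical placeholder to replace with your own, and the parameter set should be checked against the connector reference for your Pathway version.

```yaml
$sources:
  - !pw.xpacks.connectors.sharepoint.read
    url: "https://example.sharepoint.com/sites/MySite"   # placeholder site URL
    tenant: "00000000-0000-0000-0000-000000000000"       # placeholder tenant ID
    client_id: "11111111-1111-1111-1111-111111111111"    # placeholder application ID
    cert_path: sharepointcert.pem                        # placeholder certificate file
    thumbprint: "PLACEHOLDER_THUMBPRINT"                 # placeholder certificate thumbprint
    root_path: "Shared Documents"                        # placeholder folder to read from
    with_metadata: true
    refresh_interval: 30                                 # seconds between scans for changes
```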

Google Drive

Connect to your documents on Google Drive using the Pathway Google Drive connector.
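
A minimal sketch using pw.io.gdrive.read, assuming a service account that has access to the target folder. The object ID, the credentials file name, and the file pattern are hypothetical placeholders.

```yaml
$sources:
  - !pw.io.gdrive.read
    object_id: "PLACEHOLDER_FOLDER_OR_FILE_ID"            # placeholder Google Drive ID
    service_user_credentials_file: gdrive_credentials.json  # placeholder service account key file
    file_name_pattern:
      - "*.pdf"                                           # optional: only read matching files
    with_metadata: true
    refresh_interval: 30                                  # seconds between scans for changes
```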

S3

Connect to your data stored on S3.
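
A minimal sketch using pw.io.s3.read together with pw.io.s3.AwsS3Settings. The bucket name, region, object prefix, and credentials are hypothetical placeholders; in practice, credentials are better supplied through your environment than written into the file.

```yaml
$sources:
  - !pw.io.s3.read
    path: "documents/"                          # placeholder object prefix inside the bucket
    format: binary                              # yields the bytes data column
    with_metadata: true
    aws_s3_settings: !pw.io.s3.AwsS3Settings
      bucket_name: "my-bucket"                  # placeholder bucket
      region: "eu-central-1"                    # placeholder region
      access_key: "PLACEHOLDER_ACCESS_KEY"
      secret_access_key: "PLACEHOLDER_SECRET_KEY"
```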