Connectors in Pathway

To use Pathway, you first need to access the data you want to manipulate. In Pathway, this is done using connectors.

Pathway comes with input connectors, which connect to streaming data sources, and output connectors, which send notifications of changes to Pathway outputs.

List of provided connectors

Pathway provides the following connectors:

  1. DSV
  2. Debezium
  3. Kafka
  4. PostgreSQL
  5. Amazon S3

Is the one you need not in this list? Don't worry, more are coming!

Connectors in practice

Let's see how to use the DSV connector to load a data stream stored in a CSV file:

import pathway as pw
stream = pw.csv.read("example-stream.csv")

We obtain a table, stream, which contains our data stream:

pw.debug.compute_and_print(stream)
            | Unnamed: 0 | date       | amount | recipient  | sender         | recipient_acc_no             | sender_acc_no
^8JFNKVV... | 0          | 2020-06-04 | 8946   | M. Perez   | Jessie Roberts | HU30186000000000000008280573 | ES2314520000000006226902
^2TMTFGY... | 1          | 2014-08-06 | 8529   | C. Barnard | Mario Miller   | ES8300590000000002968016     | PL59879710390000000009681693
^YHZBTNY... | 2          | 2017-01-22 | 5048   | S. Card    | James Paletta  | PL65889200090000000009197250 | PL46193013890000000009427616
^SERVYWW... | 3          | 2020-09-15 | 7541   | C. Baxter  | Hector Haley   | PL40881800090000000005784046 | DE84733500000003419377
^8GR6BSX... | 4          | 2019-05-25 | 3580   | L. Prouse  | Ronald Adams   | PL44124061590000000008986827 | SI54028570008259759
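For readers unfamiliar with the term, DSV stands for delimiter-separated values, of which CSV is the comma-delimited case. The sketch below is plain Python using only the standard library, not Pathway's connector; it only illustrates the kind of row parsing such a connector has to perform. The sample text reuses a few columns of two rows from the table above (the actual layout of example-stream.csv is an assumption), and the helper name read_dsv is hypothetical.

```python
import csv
import io

# Two rows copied from the table above; the real file layout is an assumption.
SAMPLE = """date,amount,recipient,sender
2020-06-04,8946,M. Perez,Jessie Roberts
2014-08-06,8529,C. Barnard,Mario Miller
"""


def read_dsv(text, delimiter=","):
    """Parse delimiter-separated text into a list of dict rows (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# CSV is simply DSV with a comma as the delimiter...
rows = read_dsv(SAMPLE)

# ...and the same data with another delimiter parses to the same rows.
semicolon_rows = read_dsv(SAMPLE.replace(",", ";"), delimiter=";")

for row in rows:
    print(row["date"], row["amount"], row["recipient"])
```

Note that every value comes back as a string here; turning them into typed columns is a separate step.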

Although connectors also work on static data, as in this example, they are designed for data streams.

This is where it becomes interesting: the table stream, and all the computations based on it, are automatically updated whenever an update is received (e.g. our CSV file has received a new entry).

All the computations and outputs are automatically updated by Pathway to take into account the updates from the stream, without requiring any operation on your part: this is the magic of Pathway!
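To give an intuition of what "automatically updated" means, here is a minimal plain-Python sketch of an output that is refreshed each time a new row arrives. It does not use Pathway and is not a description of how Pathway works internally; the class RunningTotal and its on_row method are hypothetical names chosen for illustration, and the amounts are taken from the first three rows of the table above.

```python
class RunningTotal:
    """Keeps a sum of 'amount' up to date as rows arrive (hypothetical example)."""

    def __init__(self):
        self.total = 0
        self.history = []  # every value the output has taken so far

    def on_row(self, row):
        # Called once per incoming row: the result is updated incrementally
        # instead of being recomputed over all the rows seen so far.
        self.total += int(row["amount"])
        self.history.append(self.total)


# Amounts from the first three rows of the table above.
incoming = [{"amount": "8946"}, {"amount": "8529"}, {"amount": "5048"}]

output = RunningTotal()
for row in incoming:  # stands in for updates arriving from a stream
    output.on_row(row)

print(output.history)  # prints [8946, 17475, 22523]
```

In this sketch the loop has to call on_row by hand; with connectors, that propagation step is the part you no longer write yourself.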

See one of our recipes for an example of a full data processing pipeline built with connectors.

Conclusion

Connectors are the interface between Pathway and your data app: they connect Pathway to your data streams. Once connected, data updates are automatically integrated by Pathway into all the relevant data processing.

Pathway provides several connectors, allowing you to connect to your data in different settings in a simple and efficient way.

Olivier Ruas

Algorithm and Data Processing Magician