pw.io.bigquery

pw.io.bigquery.write(table, dataset_name, table_name, service_user_credentials_file)

Writes the table’s stream of changes into the specified BigQuery table. The schema of the target table must match the schema of the table being written and include two additional integer fields: time, the ID of the minibatch in which the change occurred, and diff, which is 1 if the entry was inserted into the table and -1 if it was deleted.
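As a rough illustration of the schema requirement, the sketch below builds a BigQuery JSON schema (a list of name/type fields) for a target table by appending the two extra fields to the user columns. The helper name bigquery_schema_for and the columns animal and weight are hypothetical and not part of Pathway; the column types you need depend on your own table.

```python
import json


def bigquery_schema_for(columns):
    """Return a BigQuery JSON schema: the given columns plus `time` and `diff`.

    `columns` maps column names to BigQuery type names (hypothetical input).
    """
    fields = [{"name": name, "type": bq_type} for name, bq_type in columns.items()]
    # The two additional integer fields the target table must contain.
    fields.append({"name": "time", "type": "INTEGER"})
    fields.append({"name": "diff", "type": "INTEGER"})
    return fields


# Hypothetical columns of the table being written.
schema = bigquery_schema_for({"animal": "STRING", "weight": "FLOAT"})
print(json.dumps(schema, indent=2))
```

The printed JSON can serve as a starting point when creating the target table; check it against the actual column types of the table you write.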

Note that a modification of a row is represented as a sequence of two operations: a deletion of the old row (diff = -1) and an insertion of the new row (diff = 1).
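To make the diff semantics concrete, here is a minimal pure-Python sketch of how a consumer might rebuild the current contents of a table from such a stream of changes by summing diff per row. The function replay and the sample records are hypothetical; this only mimics rows read back from the target table and does not call Pathway or BigQuery.

```python
from collections import Counter


def replay(changes):
    """Rebuild the current rows from (row, time, diff) change records."""
    counts = Counter()
    for row, _time, diff in changes:
        # +1 for an insertion, -1 for a deletion.
        counts[row] += diff
    # A row is present if its insertions outnumber its deletions.
    return sorted(row for row, n in counts.items() if n > 0)


# Hypothetical change records: (row values, time, diff).
changes = [
    (("cat", 4.0), 2, 1),   # "cat" inserted in minibatch 2
    (("cat", 4.0), 4, -1),  # modification, part 1: old row deleted
    (("cat", 4.5), 4, 1),   # modification, part 2: new row inserted
    (("dog", 9.0), 4, 1),   # "dog" inserted in minibatch 4
]

current = replay(changes)
print(current)
```

The same idea carries over to a query on the target table: group by the data columns and keep the groups whose summed diff is positive.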

  • Parameters
    • table (Table) – The table to output.
    • dataset_name (str) – The name of the dataset where the table is located.
    • table_name (str) – The name of the table to write to.
    • service_user_credentials_file (str) – Path to the Google API service user JSON credentials file. Follow the instructions in the developer’s user guide to obtain it.
  • Returns
    None

Example:

Suppose that there is a Google BigQuery project with a dataset named animals and you want to output the Pathway table animal_measurements into this dataset’s table measurements.

Assuming the credentials are stored in the file ./credentials.json, you can configure the output as follows:

pw.io.bigquery.write(
    animal_measurements,
    dataset_name="animals",
    table_name="measurements",
    service_user_credentials_file="./credentials.json"
)