pw.io.bigquery
write(table, dataset_name, table_name, service_user_credentials_file, *, name=None, sort_by=None)
Writes `table`’s stream of changes into the specified BigQuery table. Please note that the schema of the target table must correspond to the schema of the table that is being outputted and include two additional fields: an integral field `time`, denoting the ID of the minibatch in which the change occurred, and an integral field `diff`, which is either 1 or -1 and denotes whether the entry was inserted into the table or deleted from it. Note that a modification of a row is denoted by a sequence of two operations: a deletion (`diff = -1`) and an insertion (`diff = 1`).
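To make the change stream concrete, here is a minimal, self-contained sketch using Pathway’s debug helpers; the table contents are made up for illustration:

```python
import pathway as pw

# Hypothetical input: a tiny static table built just for demonstration.
t = pw.debug.table_from_markdown(
    """
    animal | weight
    cat    | 4
    dog    | 10
    """
)

# Prints the stream of changes; every row carries the extra `time` (minibatch ID)
# and `diff` columns described above. For a static table all rows appear with
# diff = 1; an update to an existing row would instead show up as a diff = -1 row
# followed by a diff = 1 row.
pw.debug.compute_and_print_update_stream(t)
```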
- Parameters
  - table (`Table`) – The table to output.
  - dataset_name (`str`) – The name of the dataset where the table is located.
  - table_name (`str`) – The name of the table to be written.
  - service_user_credentials_file (`str`) – Google API service user json file. Please follow the instructions provided in the developer’s user guide to obtain it.
  - name (`str | None`) – A unique name for the connector. If provided, this name will be used in logs and monitoring dashboards.
  - sort_by (`Optional[Iterable[ColumnReference]]`) – If specified, the output will be sorted in ascending order based on the values of the given columns within each minibatch. When multiple columns are provided, the corresponding value tuples will be compared lexicographically; see the sketch after this list.
- Returns
  None
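For illustration, a minimal sketch of `sort_by` usage; the table `events` and its columns `user_id` and `ts` are hypothetical:

```python
# Hypothetical table `events`: within each minibatch, rows are written in
# ascending (user_id, ts) order, compared lexicographically as tuples.
pw.io.bigquery.write(
    events,
    dataset_name="analytics",
    table_name="events",
    service_user_credentials_file="./credentials.json",
    sort_by=[events.user_id, events.ts],
)
```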
Example:

Suppose that there is a Google BigQuery project with a dataset named `animals` and you want to output the Pathway table `animal_measurements` into this dataset’s table `measurements`. Consider that the credentials are stored in the file `./credentials.json`. Then, you can configure the output as follows:
```python
pw.io.bigquery.write(
    animal_measurements,
    dataset_name="animals",
    table_name="measurements",
    service_user_credentials_file="./credentials.json",
)
```
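Since the schema of the target table must match the output schema, you may need to create the table up front. A sketch using the google-cloud-bigquery client; the project ID `my-project` and the measurement columns `animal` and `weight` are assumptions for illustration, not part of this API:

```python
from google.cloud import bigquery

# Assumption: the same service account credentials can be reused here.
client = bigquery.Client.from_service_account_json("./credentials.json")

# Columns mirroring the Pathway table (assumed here to be `animal` and `weight`),
# plus the two integral fields required by the connector: `time` and `diff`.
schema = [
    bigquery.SchemaField("animal", "STRING"),
    bigquery.SchemaField("weight", "INT64"),
    bigquery.SchemaField("time", "INT64"),
    bigquery.SchemaField("diff", "INT64"),
]

# `my-project` is a hypothetical project ID.
client.create_table(bigquery.Table("my-project.animals.measurements", schema=schema))
```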