Transactions in Google BigQuery

ยท 441 words ยท 3 minute read

Transactions enable running multiple queries, on multiple tables, such that either all of them succeeds or none runs.

This allows performing mutating operations on the target tables, such as inserting or deleting rows, and either commit to all changes or roll back atomically.

Transactions guarantee ACID properties. During a transaction, all reads return a consistent snapshot of the tables.

If a statement in a multi-statement transaction modifies a table, the changes are visible subsequent statements within the same transaction.

Example use case ๐Ÿ”—

The incoming order data from an ETL pipeline brings orders for the last 24 hours, everyday. Orders can be mutated within the 24 hour period in different ways. For example, inventory amount can change, or order status can change to “Cancelled” or “Refunded”.

Two queries need to run.

  1. The orders from the incoming set needs to be inserted to the destination table.
  2. Older version of the same orders need to be deleted from the destination table.

Using a transaction for these two queries ensures atomicity

Transaction structure ๐Ÿ”—

BEGIN TRANSACTION statement starts a transaction. A transaction ends when any of the following occurs:

  1. A COMMIT TRANSACTION statement is executed. This statement commits, or executes, all statements inside the transaction.
  2. A ROLLBACK TRANSACTION statement is executed. This will abandon all changes made inside the transaction.
  3. The query ends before reaching either of the above statements. In this case, a roll back is executed.

If an error occurs and the query has an exception handler, the control is taken inside the exception handler. Can choose to either commit or rollback inside the exception block.

If an error occurs and there is no exception handler, then the transaction fails and all changes are rolled back.

Transaction example ๐Ÿ”—

BEGIN TRANSACTION;
INSERT INTO mydataset.NewArrivals
VALUES ('top load washer', 100, 'warehouse #1');
-- Trigger an error.
SELECT 1/0;
COMMIT TRANSACTION;

EXCEPTION WHEN ERROR THEN
  -- Roll back the transaction inside the exception handler.
  SELECT @@error.message;
  ROLLBACK TRANSACTION;

Statements supported in transactions ๐Ÿ”—

The following statement types are supported in transactions:

  1. Query statements: SELECT
  2. DML statements: INSERT, UPDATE, DELETE, MERGE, and TRUNCATE TABLE
  3. DDL statements on temporary entities:
    • CREATE TEMP TABLE
    • CREATE TEMP FUNCTION
    • DROP TABLE on a temporary table
    • DROP FUNCTION on a temporary function
  4. DDL statements on permanent entities, such as datasets, tables, and functions, are not supported inside transactions.
  5. Within a transaction, the following date/time functions have special behaviors:
    • The CURRENT_TIMESTAMP, CURRENT_DATE, and CURRENT_TIME functions return the timestamp when the transaction started.
    • FOR SYSTEM_TIME AS OF clause cannot be used to read a table beyond the timestamp when the transaction started. Doing so returns an error.

Further reading ๐Ÿ”—