


Using data definition language statements

Data definition language (DDL) statements let you create and modify BigQuery resources using standard SQL query syntax. Currently, you can use DDL commands in BigQuery to:

  • Create tables, views, and user-defined functions (UDFs)
  • Alter tables
  • Delete tables and views
  • Create and delete row-level access policies

Required permissions

All users require the bigquery.jobs.create permission to create a job and run DDL statements. Each type of DDL statement also requires specific permissions to run. This section outlines which Identity and Access Management (IAM) roles provide these permissions and the required permissions for each type of statement.

IAM roles

The predefined IAM roles bigquery.user, bigquery.jobUser, and bigquery.admin include the required bigquery.jobs.create permission.

The bigquery.admin and bigquery.dataOwner roles include all other required permissions for running DDL statements. The bigquery.dataEditor role includes some of the required permissions, as shown in the table in the next section.

For more information about IAM roles in BigQuery, see Predefined roles and permissions or the IAM permissions reference.

Permissions for running DDL statements

Different types of DDL statements require different permissions to run, as shown in the following table:

SQL statement | Permissions | IAM roles | Permission details
CREATE EXTERNAL TABLE | bigquery.tables.create | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Table permissions
CREATE FUNCTION | bigquery.routines.create | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner |
CREATE MATERIALIZED VIEW | bigquery.tables.create | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Materialized view permissions
CREATE PROCEDURE | bigquery.routines.create | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner |
CREATE SCHEMA | bigquery.datasets.create | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Dataset permissions
CREATE TABLE | bigquery.tables.create | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Table permissions
CREATE VIEW | bigquery.tables.create | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | View permissions
ALTER COLUMN DROP NOT NULL | bigquery.tables.get, bigquery.tables.update | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Table permissions
ALTER COLUMN SET OPTIONS | bigquery.tables.get, bigquery.tables.update | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Table permissions
ALTER MATERIALIZED VIEW SET OPTIONS | bigquery.tables.get, bigquery.tables.update | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Materialized view permissions
ALTER SCHEMA SET OPTIONS | bigquery.datasets.get, bigquery.datasets.update | bigquery.admin, bigquery.dataOwner | Dataset update permissions
ALTER TABLE ADD COLUMN | bigquery.tables.get, bigquery.tables.update | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Managing table permissions
ALTER TABLE RENAME TO | bigquery.tables.get, bigquery.tables.update | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Managing table permissions
ALTER TABLE SET OPTIONS | bigquery.tables.get, bigquery.tables.update | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Managing table permissions
ALTER TABLE DROP COLUMN | bigquery.tables.get, bigquery.tables.update | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Managing table permissions
ALTER VIEW SET OPTIONS | bigquery.tables.get, bigquery.tables.update | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Managing table permissions
DROP EXTERNAL TABLE | bigquery.tables.delete, bigquery.tables.get | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Drop table permissions
DROP FUNCTION | bigquery.routines.delete | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner |
DROP MATERIALIZED VIEW | bigquery.tables.delete, bigquery.tables.get | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Materialized view permissions
DROP PROCEDURE | bigquery.routines.delete, bigquery.routines.get | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner |
DROP SCHEMA | bigquery.datasets.delete, bigquery.tables.delete* | bigquery.admin, bigquery.dataOwner | Drop schema permissions
DROP TABLE | bigquery.tables.delete, bigquery.tables.get | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Drop table permissions
DROP VIEW | bigquery.tables.delete, bigquery.tables.get | bigquery.admin, bigquery.dataEditor, bigquery.dataOwner | Drop table permissions

* bigquery.tables.delete is not needed for an empty schema.

Running DDL statements

You can run DDL statements by using the Cloud Console, by using the bq command-line tool, by calling the jobs.query REST API, or programmatically using the BigQuery API client libraries.

Console

  1. Go to the BigQuery page in the Cloud Console.

    Go to BigQuery

  2. Click Compose new query.


  3. Enter the DDL statement into the Query editor text area. For example:

    CREATE TABLE mydataset.newtable ( x INT64 )
  4. Click Run.

bq

Enter the bq query command and supply the DDL statement as the query parameter. Set the use_legacy_sql flag to false.

bq query --use_legacy_sql=false \
'CREATE TABLE mydataset.newtable ( x INT64 )'

API

Call the jobs.query method and supply the DDL statement in the request body's query property.

DDL functionality extends the information returned by a Jobs resource. statistics.query.statementType includes the following additional values for DDL support:

  • CREATE_TABLE
  • CREATE_TABLE_AS_SELECT
  • DROP_TABLE
  • CREATE_VIEW
  • DROP_VIEW

statistics.query has 2 additional fields:

  • ddlOperationPerformed: The DDL operation performed, possibly dependent on the existence of the DDL target. Current values include:
    • CREATE: The query created the DDL target.
    • SKIP: No-op. Examples: CREATE TABLE IF NOT EXISTS was submitted, and the table exists; or DROP TABLE IF EXISTS was submitted, and the table does not exist.
    • REPLACE: The query replaced the DDL target. Example: CREATE OR REPLACE TABLE was submitted, and the table already exists.
    • DROP: The query deleted the DDL target.
  • ddlTargetTable: When you submit a CREATE TABLE/VIEW statement or a DROP TABLE/VIEW statement, the target table is returned as an object with 3 fields:
    • "projectId": string
    • "datasetId": string
    • "tableId": string
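
As an illustration, the statistics.query portion of the response for a successful CREATE TABLE job might contain a fragment like the following (the project, dataset, and table names here are placeholders, not values from any real job):

```json
{
  "statementType": "CREATE_TABLE",
  "ddlOperationPerformed": "CREATE",
  "ddlTargetTable": {
    "projectId": "myproject",
    "datasetId": "mydataset",
    "tableId": "newtable"
  }
}
```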

Java

Call the BigQuery.create() method to start a query job. Call the Job.waitFor() method to wait for the DDL query to finish.

Python

Call the Client.query() method to start a query job. Call the QueryJob.result() method to wait for the DDL query to finish.

CREATE SCHEMA statement

Creates a new dataset.

Syntax

CREATE SCHEMA [ IF NOT EXISTS ]
[project_name.]dataset_name
[OPTIONS(schema_option_list)]

Arguments

  • IF NOT EXISTS: If any dataset exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
  • project_name: The name of the project where you are creating the dataset. Defaults to the project that runs this DDL statement.

  • dataset_name: The name of the dataset to create.

  • schema_option_list: A list of options for creating the dataset.

Details

The dataset is created in the location that you specify in the query settings. For more information, see Specifying your location.

For more information about creating a dataset, see Creating datasets. For information about quotas, see Dataset limits.

schema_option_list

The option list specifies options for the dataset. Specify the options in the following format: NAME=VALUE, ...

The following options are supported:

NAME VALUE Details
default_kms_key_name STRING Specifies the default Cloud KMS key for encrypting table data in this dataset. You can override this value when you create a table.
default_partition_expiration_days FLOAT64 Specifies the default expiration time, in days, for table partitions in this dataset. You can override this value when you create a table.
default_table_expiration_days FLOAT64 Specifies the default expiration time, in days, for tables in this dataset. You can override this value when you create a table.
description STRING The description of the dataset.
friendly_name STRING A descriptive name for the dataset.
labels ARRAY<STRUCT<STRING, STRING>> An array of labels for the dataset, expressed as key-value pairs.
location STRING The location in which to create the dataset. If you don't specify this option, the dataset is created in the location where the query runs. If you specify this option and also explicitly set the location for the query job, the two values must match; otherwise the query fails.

Examples

The following example creates a dataset with a default table expiration and a set of labels.

CREATE SCHEMA mydataset
OPTIONS(
  location="us",
  default_table_expiration_days=3.75,
  labels=[("label1","value1"),("label2","value2")]
)

CREATE TABLE statement

Creates a new table.

Syntax

CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] TABLE [ IF NOT EXISTS ] table_name
[(
  column[, ...]
)]
[PARTITION BY partition_expression]
[CLUSTER BY clustering_column_list]
[OPTIONS(table_option_list)]
[AS query_statement]

Arguments

  • OR REPLACE: Replaces any table with the same name if it exists. Cannot appear with IF NOT EXISTS.

  • TEMP | TEMPORARY: Creates a temporary table.

  • IF NOT EXISTS: If any table exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.

  • table_name: The name of the table to create. See Table path syntax. For temporary tables, do not include the project name or dataset name.

  • column: The table's schema information.

  • partition_expression: An expression that determines how to partition the table.

  • clustering_column_list: A comma-separated list of column references that determine how to cluster the table.

  • table_option_list: A list of options for creating the table.

  • query_statement: The query from which the table should be created. For the query syntax, see SQL syntax reference.

Details

CREATE TABLE statements must comply with the following rules:

  • Only one CREATE statement is allowed.
  • Either the column list, the as query_statement clause, or both must be present.
  • When both the column list and the as query_statement clause are present, BigQuery ignores the names in the as query_statement clause and matches the columns with the column list by position.
  • When the as query_statement clause is present and the column list is absent, BigQuery determines the column names and types from the as query_statement clause.
  • Column names must be specified either through the column list, the as query_statement clause or schema of the table in the LIKE clause.
  • Duplicate column names are not allowed.
  • When both the LIKE and the as query_statement clause are present, the column list in the query statement must match the columns of the table referenced by the LIKE clause.
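
To illustrate the positional-matching rule above, the following sketch (assuming a dataset named mydataset) creates a table whose columns are named a and b even though the query produces columns named x and y, because BigQuery matches the query's output to the column list by position:

```sql
-- The declared column list wins: the query's x maps to column a
-- and y maps to column b, purely by position.
CREATE TABLE mydataset.newtable (a INT64, b STRING)
AS SELECT 1 AS x, 'one' AS y;
```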

Limitations:

  • It is not possible to create an ingestion-time partitioned table from the result of a query. Instead, use a CREATE TABLE DDL statement to create the table, and then use an INSERT DML statement to insert data into it.
  • It is not possible to use the OR REPLACE modifier to replace a table with a different kind of partitioning. Instead, DROP the table, and then use a CREATE TABLE ... AS SELECT ... statement to recreate it.

This statement supports the following variants:

  • CREATE TABLE LIKE: Create a table with the same schema as an existing table.
  • CREATE TABLE COPY: Create a table by copying schema and data from an existing table.

column

(column_name column_schema[, ...]) contains the table's schema information in a comma-separated list.

column :=
  column_name column_schema

column_schema :=
  {
    simple_type [NOT NULL]
    | STRUCT<field_list> [NOT NULL]
    | ARRAY<array_element_schema>
  }
  [OPTIONS(column_option_list)]

field_list :=
  field_name column_schema [, ...]

array_element_schema :=
  { simple_type | STRUCT<field_list> }
  [NOT NULL]
  • column_name is the name of the column. A column name:

    • Must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_)
    • Must start with a letter or underscore
    • Can be up to 300 characters
  • column_schema: Similar to a data type, but supports an optional NOT NULL constraint for types other than ARRAY. column_schema also supports options on top-level columns and STRUCT fields.

    column_schema: Can be used only in the column definition list of CREATE TABLE statements. It cannot be used as a type in expressions. For example, CAST(1 AS INT64 NOT NULL) is not valid.

  • simple_type: Any supported data type aside from STRUCT and ARRAY.

  • field_list: Represents the fields in a struct.

  • field_name: The name of the struct field. Struct field names have the same restrictions as column names.

  • NOT NULL: When the NOT NULL constraint is present for a column or field, the column or field is created with REQUIRED mode. Conversely, when the NOT NULL constraint is absent, the column or field is created with NULLABLE mode.

    Columns and fields of ARRAY type do not support the NOT NULL modifier. For example, a column_schema of ARRAY<INT64> NOT NULL is invalid, since ARRAY columns have REPEATED mode and can be empty but cannot be NULL. An array element in a table can never be NULL, regardless of whether the NOT NULL constraint is specified. For example, ARRAY<INT64> is equivalent to ARRAY<INT64 NOT NULL>.

    The NOT NULL attribute of a table's column_schema does not propagate through queries over the table. If table T contains a column declared as x INT64 NOT NULL, for example, CREATE TABLE dataset.newtable AS SELECT x FROM T creates a table named dataset.newtable in which x is NULLABLE.

partition_expression

PARTITION BY is an optional clause that controls table partitioning. partition_expression is an expression that determines how to partition the table. The partition expression can contain the following values:

  • _PARTITIONDATE. Partition by ingestion time with daily partitions. This syntax cannot be used with the AS query_statement clause.
  • DATE(_PARTITIONTIME). Equivalent to _PARTITIONDATE. This syntax cannot be used with the AS query_statement clause.
  • <date_column>. Partition by a DATE column with daily partitions.
  • DATE({ <timestamp_column> | <datetime_column> }). Partition by a TIMESTAMP or DATETIME column with daily partitions.
  • DATETIME_TRUNC(<datetime_column>, { DAY | HOUR | MONTH | YEAR }). Partition by a DATETIME column with the specified partitioning type.
  • TIMESTAMP_TRUNC(<timestamp_column>, { DAY | HOUR | MONTH | YEAR }). Partition by a TIMESTAMP column with the specified partitioning type.
  • TIMESTAMP_TRUNC(_PARTITIONTIME, { DAY | HOUR | MONTH | YEAR }). Partition by ingestion time with the specified partitioning type. This syntax cannot be used with the AS query_statement clause.
  • DATE_TRUNC(<date_column>, { MONTH | YEAR }). Partition by a DATE column with the specified partitioning type.
  • RANGE_BUCKET(<int64_column>, GENERATE_ARRAY(<start>, <end>[, <interval>])). Partition by an integer column with the specified range, where:

    • start is the start of range partitioning, inclusive.
    • end is the end of range partitioning, exclusive.
    • interval is the width of each range within the partition. Defaults to 1.
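
As a sketch of integer-range partitioning (the table and column names here are hypothetical):

```sql
-- Rows are bucketed into ranges [0, 100), [100, 200), ..., [900, 1000)
-- by customer_id; NULLs and values outside [0, 1000) go to special
-- partitions.
CREATE TABLE mydataset.customers (customer_id INT64, name STRING)
PARTITION BY RANGE_BUCKET(customer_id, GENERATE_ARRAY(0, 1000, 100));
```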

clustering_column_list

CLUSTER BY is an optional clause that controls table clustering. clustering_column_list is a comma-separated list that determines how to cluster the table. The clustering column list can contain a list of up to four clustering columns.

table_option_list

The option list allows you to set table options such as a label and an expiration time. You can include multiple options using a comma-separated list.

Specify a table option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:

expiration_timestamp (TIMESTAMP)

Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC"

This property is equivalent to the expirationTime table resource property.

partition_expiration_days (FLOAT64)

Example: partition_expiration_days=7

Sets the partition expiration in days. For more information, see Set the partition expiration. By default, partitions do not expire.

This property is equivalent to the timePartitioning.expirationMs table resource property but uses days instead of milliseconds. One day is equivalent to 86400000 milliseconds, or 24 hours. This property can only be set if the table is partitioned.

require_partition_filter (BOOL)

Example: require_partition_filter=true

Specifies whether queries on this table must include a predicate filter that filters on the partitioning column. For more information, see Set partition filter requirements. The default value is false.

This property is equivalent to the timePartitioning.requirePartitionFilter table resource property. This property can only be set if the table is partitioned.

kms_key_name (STRING)

Example: kms_key_name="projects/project_id/locations/location/keyRings/keyring/cryptoKeys/key"

This property is equivalent to the encryptionConfiguration.kmsKeyName table resource property. See more details about Protecting data with Cloud KMS keys.

friendly_name (STRING)

Example: friendly_name="my_table"

This property is equivalent to the friendlyName table resource property.

description (STRING)

Example: description="a table that expires in 2025"

This property is equivalent to the description table resource property.

labels (ARRAY<STRUCT<STRING, STRING>>)

Example: labels=[("org_unit", "development")]

This property is equivalent to the labels table resource property.

VALUE is a constant expression containing only literals, query parameters, and scalar functions. If the constant expression evaluates to null, the corresponding option NAME is ignored.

The constant expression cannot contain:

  • A reference to a table
  • Subqueries or SQL statements such as SELECT, CREATE, and UPDATE
  • User-defined functions, aggregate functions, or analytic functions
  • The following scalar functions:
    • ARRAY_TO_STRING
    • REPLACE
    • REGEXP_REPLACE
    • RAND
    • FORMAT
    • LPAD
    • RPAD
    • REPEAT
    • SESSION_USER
    • GENERATE_ARRAY
    • GENERATE_DATE_ARRAY
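
For example, a scalar function over literals is an acceptable option value, while a subquery is not (a sketch, assuming mydataset exists):

```sql
-- Valid: the option value is a constant expression built from a
-- literal and an allowed scalar function.
CREATE TABLE mydataset.newtable (x INT64)
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(TIMESTAMP "2025-01-01 00:00:00 UTC", INTERVAL 30 DAY)
);
-- Invalid: an option value such as description=(SELECT name FROM t)
-- contains a subquery and would be rejected.
```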

column_option_list

The column_option_list in column_schema lets you specify optional column or field options. Column options have the same syntax and requirements as table options but with a different list of NAMEs and VALUEs:

NAME VALUE Details
description

STRING

Example: description="a unique id"

This property is equivalent to the schema.fields[].description table resource property.

Examples

Creating a new table

The following example creates a partitioned table named newtable in mydataset:

CREATE TABLE mydataset.newtable
(
  x INT64 OPTIONS(description="An optional INTEGER field"),
  y STRUCT<
    a ARRAY<STRING> OPTIONS(description="A repeated STRING field"),
    b BOOL
  >
)
PARTITION BY _PARTITIONDATE
OPTIONS(
  expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC",
  partition_expiration_days=1,
  description="a table that expires in 2025, with each partition living for 24 hours",
  labels=[("org_unit", "development")]
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.

If the table name exists in the dataset, the following error is returned:

Already Exists: project_id:dataset.table

The table uses the following partition_expression to partition the table: PARTITION BY _PARTITIONDATE. This expression partitions the table using the date in the _PARTITIONDATE pseudo column.

The table schema contains two columns:

  • x: An integer, with description "An optional INTEGER field"
  • y: A STRUCT containing two columns:

    • a: An array of strings, with description "A repeated STRING field"
    • b: A boolean

The table option list specifies the:

  • Table expiration time: January 1, 2025 at 00:00:00 UTC
  • Partition expiration time: 1 day
  • Description: A table that expires in 2025
  • Label: org_unit = development

Creating a new table from an existing table

The following example creates a table named top_words in mydataset from a query:

CREATE TABLE mydataset.top_words
OPTIONS(
  description="Top ten words per Shakespeare corpus"
) AS
SELECT
  corpus,
  ARRAY_AGG(STRUCT(word, word_count) ORDER BY word_count DESC LIMIT 10) AS top_words
FROM `bigquery-public-data.samples.shakespeare`
GROUP BY corpus;

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.top_words, your table qualifier might be `myproject.mydataset.top_words`.

If the table name exists in the dataset, the following error is returned:

Already Exists: project_id:dataset.table

The table schema contains 2 columns:

  • corpus: Name of a Shakespeare corpus
  • top_words: An ARRAY of STRUCTs containing 2 fields: word (a STRING) and word_count (an INT64 with the word count)

The table option list specifies the:

  • Description: Top ten words per Shakespeare corpus

Creating a table only if the table doesn't exist

The following example creates a table named newtable in mydataset only if no table named newtable exists in mydataset. If the table name exists in the dataset, no error is returned, and no action is taken.

CREATE TABLE IF NOT EXISTS mydataset.newtable
(x INT64, y STRUCT<a ARRAY<STRING>, b BOOL>)
OPTIONS(
  expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC",
  description="a table that expires in 2025",
  labels=[("org_unit", "development")]
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.

The table schema contains 2 columns:

  • x: An integer
  • y: A STRUCT containing a (an array of strings) and b (a boolean)

The table option list specifies the:

  • Expiration time: January 1, 2025 at 00:00:00 UTC
  • Description: A table that expires in 2025
  • Label: org_unit = development

Creating or replacing a table

The following example creates a table named newtable in mydataset, and if newtable exists in mydataset, it is overwritten with an empty table.

CREATE OR REPLACE TABLE mydataset.newtable
(x INT64, y STRUCT<a ARRAY<STRING>, b BOOL>)
OPTIONS(
  expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC",
  description="a table that expires in 2025",
  labels=[("org_unit", "development")]
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.

The table schema contains 2 columns:

  • x: An integer
  • y: A STRUCT containing a (an array of strings) and b (a boolean)

The table option list specifies the:

  • Expiration time: January 1, 2025 at 00:00:00 UTC
  • Description: A table that expires in 2025
  • Label: org_unit = development

Creating a table with REQUIRED columns

The following example creates a table named newtable in mydataset. The NOT NULL modifier in the column definition list of a CREATE TABLE statement specifies that a column or field is created in REQUIRED mode.

CREATE TABLE mydataset.newtable (
  x INT64 NOT NULL,
  y STRUCT<
    a ARRAY<STRING>,
    b BOOL NOT NULL,
    c FLOAT64
  > NOT NULL,
  z STRING
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.

If the table name exists in the dataset, the following error is returned:

Already Exists: project_id:dataset.table

The table schema contains 3 columns:

  • x: A REQUIRED integer
  • y: A REQUIRED STRUCT containing a (an array of strings), b (a REQUIRED boolean), and c (a NULLABLE float)
  • z: A NULLABLE string

Creating a table with parameterized data types

The following example creates a table named newtable in mydataset. The parameters in parentheses specify that the column contains a parameterized data type. See Parameterized Data Types for more information about parameterized types.

CREATE TABLE mydataset.newtable (
  x STRING(10),
  y STRUCT<
    a ARRAY<BYTES(5)>,
    b NUMERIC(15, 2),
    c FLOAT64
  >,
  z BIGNUMERIC(35)
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.

If the table name exists in the dataset, the following error is returned:

Already Exists: project_id:dataset.table

The table schema contains 3 columns:

  • x: A parameterized string with a maximum length of 10
  • y: A STRUCT containing a (an array of parameterized bytes with a maximum length of 5), b (a parameterized NUMERIC with a maximum precision of 15 and maximum scale of 2), and c (a float)
  • z: A parameterized BIGNUMERIC with a maximum precision of 35 and maximum scale of 0

Creating a partitioned table

The following example creates a partitioned table named newtable in mydataset using a DATE column:

CREATE TABLE mydataset.newtable (transaction_id INT64, transaction_date DATE)
PARTITION BY transaction_date
OPTIONS(
  partition_expiration_days=3,
  description="a table partitioned by transaction_date"
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.newtable, your table qualifier might be `myproject.mydataset.newtable`.

The table schema contains 2 columns:

  • transaction_id: An integer
  • transaction_date: A date

The table option list specifies the:

  • Partition expiration: Three days
  • Description: A table partitioned by transaction_date

Creating a partitioned table from the result of a query

The following example creates a partitioned table named days_with_rain in mydataset using a DATE column:

CREATE TABLE mydataset.days_with_rain
PARTITION BY date
OPTIONS (
  partition_expiration_days=365,
  description="weather stations with precipitation, partitioned by day"
) AS
SELECT
  DATE(CAST(year AS INT64), CAST(mo AS INT64), CAST(da AS INT64)) AS date,
  (SELECT ANY_VALUE(name) FROM `bigquery-public-data.noaa_gsod.stations` AS stations
   WHERE stations.usaf = stn) AS station_name,  -- Stations can have multiple names
  prcp
FROM `bigquery-public-data.noaa_gsod.gsod2017` AS weather
WHERE prcp != 99.9  -- Filter unknown values
  AND prcp > 0      -- Filter stations/days with no precipitation

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.days_with_rain, your table qualifier might be `myproject.mydataset.days_with_rain`.

The table schema contains 3 columns:

  • date: The DATE of data collection
  • station_name: The name of the weather station as a STRING
  • prcp: The amount of precipitation in inches as a FLOAT64

The table option list specifies the:

  • Partition expiration: One year
  • Description: Weather stations with precipitation, partitioned by day

Creating a clustered table

Example 1

The following example creates a clustered table named myclusteredtable in mydataset. The table is a partitioned table, partitioned by a TIMESTAMP column and clustered by a STRING column named customer_id.

CREATE TABLE mydataset.myclusteredtable (
  timestamp TIMESTAMP,
  customer_id STRING,
  transaction_amount NUMERIC
)
PARTITION BY DATE(timestamp)
CLUSTER BY customer_id
OPTIONS (
  partition_expiration_days=3,
  description="a table clustered by customer_id"
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.

The table schema contains 3 columns:

  • timestamp: The time of data collection as a TIMESTAMP
  • customer_id: The customer ID as a STRING
  • transaction_amount: The transaction amount as NUMERIC

The table option list specifies the:

  • Partition expiration: 3 days
  • Description: A table clustered by customer_id

Example 2

The following example creates a clustered table named myclusteredtable in mydataset. The table is an ingestion-time partitioned table.

CREATE TABLE mydataset.myclusteredtable (
  customer_id STRING,
  transaction_amount NUMERIC
)
PARTITION BY DATE(_PARTITIONTIME)
CLUSTER BY customer_id
OPTIONS (
  partition_expiration_days=3,
  description="a table clustered by customer_id"
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.

The table schema contains 2 columns:

  • customer_id: The customer ID as a STRING
  • transaction_amount: The transaction amount as NUMERIC

The table option list specifies the:

  • Partition expiration: 3 days
  • Description: A table clustered by customer_id

Example 3

The following example creates a clustered table named myclusteredtable in mydataset. The table is not partitioned.

CREATE TABLE mydataset.myclusteredtable (
  customer_id STRING,
  transaction_amount NUMERIC
)
CLUSTER BY customer_id
OPTIONS (
  description="a table clustered by customer_id"
)

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.

The table schema contains 2 columns:

  • customer_id: The customer ID as a STRING
  • transaction_amount: The transaction amount as NUMERIC

The table option list specifies the:

  • Description: A table clustered by customer_id

Creating a clustered table from the result of a query

Example 1

The following example creates a clustered table named myclusteredtable in mydataset using the result of a query. The table is a partitioned table, partitioned by a TIMESTAMP column.

CREATE TABLE mydataset.myclusteredtable (
  timestamp TIMESTAMP,
  customer_id STRING,
  transaction_amount NUMERIC
)
PARTITION BY DATE(timestamp)
CLUSTER BY customer_id
OPTIONS (
  partition_expiration_days=3,
  description="a table clustered by customer_id"
) AS
SELECT * FROM mydataset.myothertable

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.

The table schema contains 3 columns:

  • timestamp: The time of data collection as a TIMESTAMP
  • customer_id: The customer ID as a STRING
  • transaction_amount: The transaction amount as NUMERIC

The table option list specifies the:

  • Partition expiration: 3 days
  • Description: A table clustered by customer_id
Example 2

The following example creates a clustered table named myclusteredtable in mydataset using the result of a query. The table is not partitioned.

CREATE TABLE mydataset.myclusteredtable
(
  customer_id STRING,
  transaction_amount NUMERIC
)
CLUSTER BY
  customer_id
OPTIONS (
  description="a table clustered by customer_id"
)
AS SELECT * FROM mydataset.myothertable

If you haven't configured a default project, prepend a project ID to the dataset name in the example SQL, and enclose the name in backticks if project_id contains special characters: `project_id.dataset.table`. So, instead of mydataset.myclusteredtable, your table qualifier might be `myproject.mydataset.myclusteredtable`.

The table schema contains 2 columns:

  • customer_id: The customer ID as a STRING
  • transaction_amount: The transaction amount as NUMERIC

The table option list specifies the:

  • Description: A table clustered by customer_id

Creating a temporary table

The following example creates a temporary table named Example and inserts values into it.

CREATE TEMP TABLE Example
(
  x INT64,
  y STRING
);

INSERT INTO Example VALUES (5, 'foo');

INSERT INTO Example VALUES (6, 'bar');

SELECT * FROM Example;

This script returns the following output:

+-----+---+-----+
| Row | x | y   |
+-----+---+-----+
| 1   | 5 | foo |
| 2   | 6 | bar |
+-----+---+-----+

CREATE TABLE LIKE statement

Creates a new table with all of the same metadata as another table.

Syntax

CREATE [ OR REPLACE ] TABLE [ IF NOT EXISTS ]
table_name
LIKE [[project_name.]dataset_name.]source_table_name
[PARTITION BY partition_expression]
[CLUSTER BY clustering_column_list]
[OPTIONS(table_option_list)]
[AS query_statement]

Details

Other than the use of the LIKE clause in place of a column list, the syntax is identical to the CREATE TABLE syntax.

The CREATE TABLE LIKE statement copies only the metadata of the source table. You can use the AS query_statement clause to copy data into the new table.

The new table has no relationship to the source table after creation; thus modifications to the source table will not propagate to the new table.

By default, the new table inherits partitioning, clustering, and options metadata from the source table. You can customize metadata in the new table by using the optional clauses in the SQL statement. For example, if you want to specify a different set of options for the new table, then include the OPTIONS clause with a list of options and values. This behavior matches that of ALTER TABLE SET OPTIONS.

Examples

Example 1

The following example creates a new table named newtable in mydataset with the same metadata as sourcetable:

CREATE TABLE mydataset.newtable
LIKE mydataset.sourcetable

Example 2

The following example creates a new table named newtable in mydataset with the same metadata as sourcetable and the data from the SELECT statement:

CREATE TABLE mydataset.newtable
LIKE mydataset.sourcetable
AS SELECT * FROM mydataset.myothertable
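Example 3

The details above note that the OPTIONS clause can override metadata inherited from the source table. A minimal sketch of that behavior, using hypothetical table names:

```sql
-- Copy sourcetable's metadata, but override its description option.
-- The override is equivalent to running ALTER TABLE SET OPTIONS afterwards.
CREATE TABLE mydataset.newtable
LIKE mydataset.sourcetable
OPTIONS(
  description="A table with metadata copied from sourcetable"
)
```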

CREATE TABLE COPY statement

Creates a new table with the same metadata and data as another table.

Syntax

CREATE [ OR REPLACE ] TABLE [ IF NOT EXISTS ]
table_name
COPY source_table_name
[OPTIONS(table_option_list)]

Details

Other than the use of the COPY clause in place of a column list, the syntax is identical to the CREATE TABLE syntax.

The CREATE TABLE COPY statement copies both the metadata and data from the source table.

The new table has no relationship to the source table after creation; thus modifications to the source table will not propagate to the new table.

The new table inherits partitioning and clustering from the source table. By default, the new table also inherits the table options metadata from the source table, but you can customize table options in the new table by using the OPTIONS clause in the SQL statement. The behavior is equivalent to running ALTER TABLE SET OPTIONS after the table has been copied.
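For example, the following hypothetical statement copies both the metadata and data of mydataset.sourcetable into mydataset.newtable and overrides the description option:

```sql
-- Table names are placeholders; any option not listed in OPTIONS
-- is inherited from the source table.
CREATE TABLE mydataset.newtable
COPY mydataset.sourcetable
OPTIONS(
  description="A copy of sourcetable"
)
```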

CREATE SNAPSHOT TABLE statement

Creates a new table snapshot of a standard table or makes a copy of a table snapshot.

Syntax

CREATE SNAPSHOT TABLE [ IF NOT EXISTS ]
table_snapshot_name
CLONE source_table_name
[FOR SYSTEM_TIME AS OF time_expression]
[OPTIONS(snapshot_option_list)]

Arguments

  • IF NOT EXISTS: If a table snapshot or other table resource exists with the same name, the CREATE statement has no effect.

  • table_snapshot_name: The name of the table snapshot that you want to create. The table snapshot name must be unique per dataset. See Table path syntax.

  • source_table_name: The name of the table that you want to snapshot or the table snapshot that you want to copy. See Table path syntax.

    If the source table is a standard table, then BigQuery creates a table snapshot of the source table. If the source table is a table snapshot, then BigQuery creates a copy of the table snapshot.

  • FOR SYSTEM_TIME AS OF: Lets you select the version of the table that was current at the time specified by time_expression. It can only be used when creating a snapshot of a table; it can't be used when making a copy of a table snapshot.

  • snapshot_option_list: Additional table snapshot creation options such as a label and an expiration time.

Details

CREATE SNAPSHOT TABLE statements must comply with the following rules:

  • Only one CREATE statement is allowed.
  • The table that's being cloned must be one of the following:
    • A standard table (not a view or a materialized view)
    • A table snapshot
  • The FOR SYSTEM_TIME AS OF clause can only be used when creating a snapshot of a table; it can't be used when making a copy of a table snapshot.

snapshot_option_list

The option list lets you set table snapshot options such as a label and an expiration time. You can include multiple options using a comma-separated list.

Specify a table snapshot option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:

  • expiration_timestamp (TIMESTAMP): Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC". This property is equivalent to the expirationTime table resource property.

  • friendly_name (STRING): Example: friendly_name="my_table_snapshot". This property is equivalent to the friendlyName table resource property.

  • description (STRING): Example: description="A table snapshot that expires in 2025". This property is equivalent to the description table resource property.

  • labels (ARRAY<STRUCT<STRING, STRING>>): Example: labels=[("org_unit", "development")]. This property is equivalent to the labels table resource property.

VALUE is a constant expression that contains only literals, query parameters, and scalar functions. If the constant expression evaluates to null, then the corresponding option NAME is ignored.

The constant expression cannot contain:

  • A reference to a table
  • Subqueries or SQL statements such as SELECT, CREATE, and UPDATE
  • User-defined functions, aggregate functions, or analytic functions
  • The following scalar functions:
    • ARRAY_TO_STRING
    • REPLACE
    • REGEXP_REPLACE
    • RAND
    • FORMAT
    • LPAD
    • RPAD
    • REPEAT
    • SESSION_USER
    • GENERATE_ARRAY
    • GENERATE_DATE_ARRAY

Examples

Create a table snapshot: fail if it already exists

The following example creates a table snapshot of the table myproject.mydataset.mytable. The table snapshot is created in the dataset mydataset and is named mytablesnapshot:

CREATE SNAPSHOT TABLE `myproject.mydataset.mytablesnapshot`
CLONE `myproject.mydataset.mytable`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="my_table_snapshot",
  description="A table snapshot that expires in 2 days",
  labels=[("org_unit", "development")]
)

If the table snapshot name already exists in the dataset, then the following error is returned:

Already Exists: myproject.mydataset.mytablesnapshot

The table snapshot option list specifies the following:

  • Expiration time: 48 hours after the time the table snapshot is created
  • Friendly name: my_table_snapshot
  • Description: A table snapshot that expires in 2 days
  • Label: org_unit = development

Create a table snapshot: ignore if it already exists

The following example creates a table snapshot of the table myproject.mydataset.mytable. The table snapshot is created in the dataset mydataset and is named mytablesnapshot:

CREATE SNAPSHOT TABLE IF NOT EXISTS `myproject.mydataset.mytablesnapshot`
CLONE `myproject.mydataset.mytable`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="my_table_snapshot",
  description="A table snapshot that expires in 2 days",
  labels=[("org_unit", "development")]
)

The table snapshot option list specifies the following:

  • Expiration time: 48 hours after the time the table snapshot is created
  • Friendly name: my_table_snapshot
  • Description: A table snapshot that expires in 2 days
  • Label: org_unit = development

If the table snapshot name already exists in the dataset, then no action is taken, and no error is returned.

For information about restoring table snapshots, see CREATE TABLE CLONE.

For information about removing table snapshots, see DROP SNAPSHOT TABLE.

CREATE TABLE CLONE statement

Restores a table snapshot to a standard table.

Syntax

CREATE [ OR REPLACE ] TABLE [ IF NOT EXISTS ]
destination_table_name
CLONE table_snapshot_name
[OPTIONS(table_option_list)]

Arguments

  • OR REPLACE: Replaces a table with the same name if it exists. Cannot appear with IF NOT EXISTS.

  • IF NOT EXISTS: If the specified destination table name already exists, the CREATE statement has no effect. Cannot appear with OR REPLACE.

  • destination_table_name: The name of the table that you want to create. The table name must be unique per dataset. See Table path syntax.

  • table_snapshot_name: The name of the table snapshot that you want to restore. See Table path syntax.

  • table_option_list: Additional table creation options such as a label and an expiration time.

Details

CREATE TABLE CLONE statements must comply with the following rules:

  • Only one CREATE statement is allowed.
  • The table that's being cloned must be a table snapshot.

OPTIONS

CREATE TABLE CLONE options are the same as CREATE TABLE options.

Examples

Restore a table snapshot: fail if destination table already exists

The following example creates the table myproject.mydataset.mytable from the table snapshot myproject.mydataset.mytablesnapshot:

CREATE TABLE `myproject.mydataset.mytable`
CLONE `myproject.mydataset.mytablesnapshot`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 365 DAY),
  friendly_name="my_table",
  description="A table that expires in 1 year",
  labels=[("org_unit", "development")]
)

If the table name exists in the dataset, then the following error is returned:

Already Exists: myproject.mydataset.mytable.

The table option list specifies the following:

  • Expiration time: 365 days after the time that the table is created
  • Friendly name: my_table
  • Description: A table that expires in 1 year
  • Label: org_unit = development

Restore a table snapshot: ignore if destination table already exists

The following example creates the table myproject.mydataset.mytable from the table snapshot myproject.mydataset.mytablesnapshot:

CREATE TABLE IF NOT EXISTS `myproject.mydataset.mytable`
CLONE `myproject.mydataset.mytablesnapshot`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 365 DAY),
  friendly_name="my_table",
  description="A table that expires in 1 year",
  labels=[("org_unit", "development")]
)

The table option list specifies the following:

  • Expiration time: 365 days after the time the table is created
  • Friendly name: my_table
  • Description: A table that expires in 1 year
  • Label: org_unit = development

If the table name exists in the dataset, then no action is taken, and no error is returned.

For information about creating table snapshots, see CREATE SNAPSHOT TABLE.

For information about removing table snapshots, see DROP SNAPSHOT TABLE.

CREATE VIEW statement

Creates a new view.

Syntax

CREATE [ OR REPLACE ] VIEW [ IF NOT EXISTS ]
view_name
[(view_column_name_list)]
[OPTIONS(view_option_list)]
AS query_expression

Arguments

  • OR REPLACE: Replaces any view with the same name if it exists. Cannot appear with IF NOT EXISTS.

  • IF NOT EXISTS: If a view or other table resource exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.

  • view_name: The name of the view you're creating. See Table path syntax.

  • view_column_name_list: Lets you explicitly specify the column names of the view, which may be aliases to the column names in the underlying SQL query.

  • view_option_list: Additional view creation options such as a label and an expiration time.

  • query_expression: The Standard SQL query expression used to define the view.

Details

CREATE VIEW statements must comply with the following rules:

  • Only one CREATE statement is allowed.

view_column_name_list

The view's column name list is optional. The names must be unique but do not have to be the same as the column names of the underlying SQL query. For example, if your view is created with the following statement:

CREATE VIEW mydataset.age_groups(age, count) AS
SELECT age, COUNT(*) FROM mydataset.people GROUP BY age;

Then you can query it with:

SELECT age, count FROM mydataset.age_groups;

The number of columns in the column name list must match the number of columns in the underlying SQL query. If columns in the table referenced by the underlying SQL query are added or dropped, the view becomes invalid and must be recreated. For example, if the age column is dropped from the mydataset.people table, then the view created in the previous example becomes invalid.
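When the view becomes invalid, recreate it against the current schema. A minimal sketch, reusing the hypothetical mydataset.people table from the previous example:

```sql
-- Redefine the invalid view; the column name list must again
-- match the number of columns produced by the query.
CREATE OR REPLACE VIEW mydataset.age_groups(age, count) AS
SELECT age, COUNT(*) FROM mydataset.people GROUP BY age;
```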

view_option_list

The option list allows you to set view options such as a label and an expiration time. You can include multiple options using a comma-separated list.

Specify a view option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:

  • expiration_timestamp (TIMESTAMP): Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC". This property is equivalent to the expirationTime table resource property.

  • friendly_name (STRING): Example: friendly_name="my_view". This property is equivalent to the friendlyName table resource property.

  • description (STRING): Example: description="a view that expires in 2025". This property is equivalent to the description table resource property.

  • labels (ARRAY<STRUCT<STRING, STRING>>): Example: labels=[("org_unit", "development")]. This property is equivalent to the labels table resource property.

VALUE is a constant expression containing only literals, query parameters, and scalar functions. If the constant expression evaluates to null, the corresponding option NAME is ignored.

The constant expression cannot contain:

  • A reference to a table
  • Subqueries or SQL statements such as SELECT, CREATE, and UPDATE
  • User-defined functions, aggregate functions, or analytic functions
  • The following scalar functions:
    • ARRAY_TO_STRING
    • REPLACE
    • REGEXP_REPLACE
    • RAND
    • FORMAT
    • LPAD
    • RPAD
    • REPEAT
    • SESSION_USER
    • GENERATE_ARRAY
    • GENERATE_DATE_ARRAY

Default project in view body

If the view is created in the same project used to run the CREATE VIEW statement, the view body query_expression can reference entities without specifying the project; the default project is the project which owns the view. Consider the sample query below.

CREATE VIEW myProject.myDataset.myView AS
SELECT * FROM anotherDataset.myTable;

After running the above CREATE VIEW query in the project myProject, you can run the query SELECT * FROM myProject.myDataset.myView. Regardless of the project you choose to run this SELECT query, the referenced table anotherDataset.myTable is always resolved against project myProject.

If the view is not created in the same project used to run the CREATE VIEW statement, then all references in the view body query_expression must be qualified with project IDs. For instance, the preceding sample CREATE VIEW query is invalid if it runs in a project different from myProject.
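A project-qualified sketch of the same statement (project, dataset, and table names are placeholders), which can be run from any project:

```sql
-- All references in the view body carry explicit project IDs,
-- so resolution no longer depends on the view's owning project.
CREATE VIEW `myProject.myDataset.myView` AS
SELECT * FROM `myProject.anotherDataset.myTable`;
```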

Examples

Creating a new view

The following example creates a view named newview in mydataset:

CREATE VIEW `myproject.mydataset.newview`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="newview",
  description="a view that expires in 2 days",
  labels=[("org_unit", "development")]
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

If the view name exists in the dataset, the following error is returned:

Already Exists: project_id:dataset.table

The view is defined using the following standard SQL query:

SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

The view option list specifies the:

  • Expiration time: 48 hours from the time the view is created
  • Friendly name: newview
  • Description: A view that expires in 2 days
  • Label: org_unit = development

Creating a view only if the view doesn't exist

The following example creates a view named newview in mydataset only if no view named newview exists in mydataset. If the view name exists in the dataset, no error is returned, and no action is taken.

CREATE VIEW IF NOT EXISTS `myproject.mydataset.newview`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="newview",
  description="a view that expires in 2 days",
  labels=[("org_unit", "development")]
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

The view is defined using the following standard SQL query:

SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

The view option list specifies the:

  • Expiration time: 48 hours from the time the view is created
  • Friendly name: newview
  • Description: A view that expires in 2 days
  • Label: org_unit = development

Creating or replacing a view

The following example creates a view named newview in mydataset, and if newview exists in mydataset, it is overwritten using the specified query expression.

CREATE OR REPLACE VIEW `myproject.mydataset.newview`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="newview",
  description="a view that expires in 2 days",
  labels=[("org_unit", "development")]
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

The view is defined using the following standard SQL query:

SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

The view option list specifies the:

  • Expiration time: 48 hours from the time the view is created
  • Friendly name: newview
  • Description: A view that expires in 2 days
  • Label: org_unit = development

CREATE MATERIALIZED VIEW statement

Creates a new materialized view.

Syntax

CREATE [ OR REPLACE ] MATERIALIZED VIEW [ IF NOT EXISTS ]
materialized_view_name
[PARTITION BY partition_expression]
[CLUSTER BY clustering_column_list]
[OPTIONS(materialized_view_option_list)]
AS query_expression

Arguments

  • OR REPLACE: Replaces any materialized view with the same name if it exists. Cannot appear with IF NOT EXISTS.

  • IF NOT EXISTS: If a materialized view or other table resource exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.

  • materialized_view_name: The name of the materialized view you're creating. See Table path syntax.

    If the project_name is omitted from the materialized view name, or it is the same as the project that runs this DDL query, then the latter is also used as the default project for references to tables, functions, and other resources in query_expression. The default project of the references is fixed and does not depend on the future queries that invoke the new materialized view. Otherwise, all references in query_expression must be qualified with project names.

    The materialized view name must be unique per dataset.

  • partition_expression: An expression that determines how to partition the table. A materialized view can only be partitioned in the same way as the table in query expression (the base table) is partitioned.

  • clustering_column_list: A comma-separated list of column references that determine how to cluster the materialized view.

  • materialized_view_option_list: Lets you specify additional materialized view options such as whether refresh is enabled, the refresh interval, a label, and an expiration time.

  • query_expression: The Standard SQL query expression used to define the materialized view.

Details

CREATE MATERIALIZED VIEW statements must comply with the following rules:

  • Only one CREATE statement is allowed.

Default project in materialized view body

If the materialized view is created in the same project used to run the CREATE MATERIALIZED VIEW statement, the materialized view body query_expression can reference entities without specifying the project; the default project is the project which owns the materialized view. Consider the sample query below.

CREATE MATERIALIZED VIEW myProject.myDataset.myView AS
SELECT * FROM anotherDataset.myTable;

After running the above CREATE MATERIALIZED VIEW query in the project myProject, you can run the query SELECT * FROM myProject.myDataset.myView. Regardless of the project you choose to run this SELECT query, the referenced table anotherDataset.myTable is always resolved against project myProject.

If the materialized view is not created in the same project used to run the CREATE MATERIALIZED VIEW statement, then all references in the materialized view body query_expression must be qualified with project IDs. For instance, the preceding sample CREATE MATERIALIZED VIEW query is invalid if it runs in a project different from myProject.
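A project-qualified sketch (names are placeholders; the aggregation is illustrative, since materialized view queries are limited to supported query shapes):

```sql
-- Every table reference carries an explicit project ID, so the
-- statement can run from any project.
CREATE MATERIALIZED VIEW `myProject.myDataset.myMV` AS
SELECT column_1, COUNT(*) AS cnt
FROM `myProject.anotherDataset.myTable`
GROUP BY column_1;
```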

materialized_view_option_list

The option list allows you to set materialized view options such as whether refresh is enabled, the refresh interval, a label, and an expiration time. You can include multiple options using a comma-separated list.

Specify a materialized view option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:

  • enable_refresh (BOOLEAN): Example: enable_refresh=false

  • refresh_interval_minutes (FLOAT64): Example: refresh_interval_minutes=20

  • expiration_timestamp (TIMESTAMP): Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC". This property is equivalent to the expirationTime table resource property.

  • friendly_name (STRING): Example: friendly_name="my_mv". This property is equivalent to the friendlyName table resource property.

  • description (STRING): Example: description="a materialized view that expires in 2025". This property is equivalent to the description table resource property.

  • labels (ARRAY<STRUCT<STRING, STRING>>): Example: labels=[("org_unit", "development")]. This property is equivalent to the labels table resource property.

Examples

Creating a new materialized view

The following example creates a materialized view named new_mv in mydataset:

CREATE MATERIALIZED VIEW `myproject.mydataset.new_mv`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="new_mv",
  description="a materialized view that expires in 2 days",
  labels=[("org_unit", "development")],
  enable_refresh=true,
  refresh_interval_minutes=20
)
AS SELECT column_1, SUM(column_2) AS sum_2, AVG(column_3) AS avg_3
FROM `myproject.mydataset.mytable`
GROUP BY column_1

If the materialized view name exists in the dataset, the following error is returned:

Already Exists: project_id:dataset.materialized_view

When you use a DDL statement to create a materialized view, you must specify the project, dataset, and materialized view in the following format: `project_id.dataset.materialized_view` (including the backticks if project_id contains special characters); for example, `myproject.mydataset.new_mv`.

The materialized view is defined using the following standard SQL query:

SELECT column_1, SUM(column_2) AS sum_2, AVG(column_3) AS avg_3
FROM `myproject.mydataset.mytable`
GROUP BY column_1

The materialized view option list specifies the:

  • Expiration time: 48 hours from the time the materialized view is created
  • Friendly name: new_mv
  • Description: A materialized view that expires in 2 days
  • Label: org_unit = development
  • Refresh enabled: true
  • Refresh interval: 20 minutes

Creating a materialized view only if the materialized view doesn't exist

The following example creates a materialized view named new_mv in mydataset only if no materialized view named new_mv exists in mydataset. If the materialized view name exists in the dataset, no error is returned, and no action is taken.

CREATE MATERIALIZED VIEW IF NOT EXISTS `myproject.mydataset.new_mv`
OPTIONS(
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 48 HOUR),
  friendly_name="new_mv",
  description="a view that expires in 2 days",
  labels=[("org_unit", "development")],
  enable_refresh=false
)
AS SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

The materialized view is defined using the following standard SQL query:

SELECT column_1, column_2, column_3 FROM `myproject.mydataset.mytable`

The materialized view option list specifies the:

  • Expiration time: 48 hours from the time the view is created
  • Friendly name: new_mv
  • Description: A view that expires in 2 days
  • Label: org_unit = development
  • Refresh enabled: false

Creating a materialized view with partitioning and clustering

The following example creates a materialized view named new_mv in mydataset, partitioned by the col_datetime column and clustered by the col_int column:

CREATE MATERIALIZED VIEW `myproject.mydataset.new_mv`
PARTITION BY DATE(col_datetime)
CLUSTER BY col_int
AS SELECT col_int, col_datetime, COUNT(1) AS cnt
FROM `myproject.mydataset.mv_base_table`
GROUP BY col_int, col_datetime

The base table, mv_base_table, must also be partitioned by the col_datetime column. For more information, see Working with partitioned and clustered tables.

CREATE EXTERNAL TABLE statement

Creates a new external table.

External tables let BigQuery query data that is stored outside of BigQuery storage. For more information about external tables, see Introduction to external data sources.

Syntax

CREATE [ OR REPLACE ] EXTERNAL TABLE [ IF NOT EXISTS ] table_name
[(
  column_name column_schema,
  ...
)]
[WITH CONNECTION connection_name]
[WITH PARTITION COLUMNS
  [(
    partition_column_name partition_column_type,
    ...
  )]
]
OPTIONS (
  external_table_option_list,
  ...
);

Arguments

  • OR REPLACE: Replaces any external table with the same name if it exists. Cannot appear with IF NOT EXISTS.

  • IF NOT EXISTS: If an external table or other table resource exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.

  • table_name: The name of the external table. See Table path syntax.

  • column_name: The name of a column in the table.

  • column_schema: Specifies the schema of the column. It uses the same syntax as the column_schema definition in the CREATE TABLE statement. If you don't include this clause, BigQuery detects the schema automatically.

  • connection_name: Specifies a connection resource that has credentials for accessing the external data. Specify the connection name in the form PROJECT_ID.LOCATION.CONNECTION_ID. If the project ID or location contains a dash, enclose the connection name in backticks (`).

  • partition_column_name: The name of a partition column. Include this field if your external data uses a hive-partitioned layout. For more information, see: Supported data layouts.

  • partition_column_type: The partition column type.

  • external_table_option_list: A list of options for creating the external table.

Details

The CREATE EXTERNAL TABLE statement does not support creating temporary external tables.

To create an externally partitioned table, use the WITH PARTITION COLUMNS clause to specify the partition schema details. BigQuery validates the column definitions against the external data location. The schema declaration must strictly follow the ordering of the fields in the external path. For more information about external partitioning, see Querying externally partitioned data.
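None of the examples later in this section use the WITH CONNECTION clause. A minimal sketch, assuming a connection resource named my-connection already exists in location us of project my-project (all names here are placeholders):

```sql
-- The connection supplies the credentials used to read the
-- external data; backticks are needed because the project ID
-- contains a dash.
CREATE EXTERNAL TABLE dataset.ConnCsvTable
WITH CONNECTION `my-project.us.my-connection`
OPTIONS (
  format = 'CSV',
  uris = ['gs://bucket/path1.csv']
);
```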

external_table_option_list

The option list specifies options for creating the external table. The format and uris options are required. Specify the option list in the following format: NAME=VALUE, ...

Options
allow_jagged_rows

BOOL

If true, allow rows that are missing trailing optional columns.

Applies to CSV data.

allow_quoted_newlines

BOOL

If true, allow quoted data sections that contain newline characters in the file.

Applies to CSV data.

compression

STRING

The compression type of the data source. Supported values include: GZIP. If not specified, the data source is uncompressed.

Applies to CSV and JSON data.

enable_logical_types

BOOL

If true, convert Avro logical types into their corresponding SQL types. For more information, see Logical types.

Applies to Avro data.

encoding

STRING

The character encoding of the data. Supported values include: UTF8 (or UTF-8), ISO_8859_1 (or ISO-8859-1).

Applies to CSV data.

field_delimiter

STRING

The separator for fields in a CSV file.

Applies to CSV data.

format

STRING

The format of the external data. Supported values include: AVRO, CSV, DATASTORE_BACKUP, GOOGLE_SHEETS, NEWLINE_DELIMITED_JSON (or JSON), ORC, PARQUET.

The value JSON is equivalent to NEWLINE_DELIMITED_JSON.

decimal_target_types

ARRAY<STRING>

Determines how to convert a Decimal type. Equivalent to ExternalDataConfiguration.decimal_target_types

Example: ["NUMERIC", "BIGNUMERIC"].

json_extension

STRING

For JSON data, indicates a particular JSON interchange format. If not specified, BigQuery reads the data as generic JSON records.

Supported values include:
GEOJSON (Preview). GeoJSON data. For more information, see Loading GeoJSON data.

hive_partition_uri_prefix

STRING

A common prefix for all source URIs before the partition key encoding begins. Applies only to hive-partitioned external tables.

Applies to Avro, CSV, JSON, Parquet, and ORC data.

Example: "gs://bucket/path".

ignore_unknown_values

BOOL

If true, ignore extra values that are not represented in the table schema, without returning an error.

Applies to CSV and JSON data.

max_bad_records

INT64

The maximum number of bad records to ignore when reading the data.

Applies to: CSV, JSON, and Sheets data.

null_marker

STRING

The string that represents NULL values in a CSV file.

Applies to CSV data.

projection_fields

STRING

A list of entity properties to load.

Applies to Datastore data.

quote

STRING

The string used to quote data sections in a CSV file. If your data contains quoted newline characters, also set the allow_quoted_newlines property to true.

Applies to CSV data.

require_hive_partition_filter

BOOL

If true, all queries over this table require a partition filter that can be used to eliminate partitions when reading data. Applies only to hive-partitioned external tables.

Applies to Avro, CSV, JSON, Parquet, and ORC data.

sheet_range

STRING

Range of a Sheets spreadsheet to query from.

Applies to Sheets data.

Example: "sheet1!A1:B20".

skip_leading_rows

INT64

The number of rows at the top of a file to skip when reading the data.

Applies to CSV and Sheets data.

uris

ARRAY<STRING>

An array of fully qualified URIs for the external data locations.

Example: ["gs://bucket/path/*"].

Examples

The following example creates an external table from multiple URIs. The data format is CSV. This example uses schema auto-detection.

CREATE EXTERNAL TABLE dataset.CsvTable
OPTIONS (
  format = 'CSV',
  uris = ['gs://bucket/path1.csv', 'gs://bucket/path2.csv']
);

The following example creates an external table from a CSV file and explicitly specifies the schema. It also specifies the field delimiter ('|') and sets the maximum number of bad records allowed.

CREATE OR REPLACE EXTERNAL TABLE dataset.CsvTable
(
  x INT64,
  y STRING
)
OPTIONS (
  format = 'CSV',
  uris = ['gs://bucket/path1.csv'],
  field_delimiter = '|',
  max_bad_records = 5
);

The following example creates an externally partitioned table. It uses schema auto-detection to detect both the file schema and the hive partitioning layout.

For example, if the external path is gs://bucket/path/field_1=first/field_2=1/data.csv, the partition columns would be field_1 (STRING) and field_2 (INT64).

CREATE EXTERNAL TABLE dataset.AutoHivePartitionedTable
WITH PARTITION COLUMNS
OPTIONS (
  uris = ['gs://bucket/path/*'],
  format = 'CSV',
  hive_partition_uri_prefix = 'gs://bucket/path'
);
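The key=value path convention that partition layout auto-detection relies on can be sketched in plain Python. This is an illustration of the layout convention only, not a BigQuery API; the helper name is hypothetical:

```python
def parse_hive_partitions(uri: str, prefix: str) -> list[tuple[str, str]]:
    """Extract (key, value) partition pairs from a hive-partitioned URI.

    Illustrative only: mirrors the key=value path segments that sit
    between hive_partition_uri_prefix and the data file.
    """
    rest = uri[len(prefix):].strip("/")
    pairs = []
    for segment in rest.split("/"):
        # Only segments of the form key=value name partition columns;
        # the trailing file name (e.g. data.csv) is skipped.
        if "=" in segment:
            key, value = segment.split("=", 1)
            pairs.append((key, value))
    return pairs

print(parse_hive_partitions(
    "gs://bucket/path/field_1=first/field_2=1/data.csv",
    "gs://bucket/path"))
# [('field_1', 'first'), ('field_2', '1')]
```

Under schema auto-detection, BigQuery would then infer column types from the values, for example field_1 as STRING and field_2 as INT64.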

The following example creates an externally partitioned table by explicitly specifying the partition columns. This example assumes that the external file path has the pattern gs://bucket/path/field_1=first/field_2=1/data.csv.

CREATE EXTERNAL TABLE dataset.CustomHivePartitionedTable
WITH PARTITION COLUMNS (
  field_1 STRING, -- column order must match the external path
  field_2 INT64
)
OPTIONS (
  uris = ['gs://bucket/path/*'],
  format = 'CSV',
  hive_partition_uri_prefix = 'gs://bucket/path'
);

CREATE FUNCTION statement

Creates a new user-defined function (UDF). BigQuery supports UDFs written in either SQL or JavaScript.

Syntax

To create a SQL UDF, use the following syntax:

CREATE [ OR REPLACE ] [ TEMPORARY | TEMP ] FUNCTION [ IF NOT EXISTS ]
    [[project_name.]dataset_name.]function_name ([named_parameter[, ...]])
  [RETURNS data_type]
  AS (sql_expression)

named_parameter:
  param_name param_type

To create a JavaScript UDF, use the following syntax:

CREATE [OR REPLACE] [TEMPORARY | TEMP] FUNCTION [IF NOT EXISTS]
    [[project_name.]dataset_name.]function_name ([named_parameter[, ...]])
  RETURNS data_type
  [determinism_specifier]
  LANGUAGE js
  [OPTIONS (function_option_list)]
  AS javascript_code

named_parameter:
  param_name param_type

determinism_specifier:
  { DETERMINISTIC | NOT DETERMINISTIC }

Routine names must contain only letters, numbers, and underscores, and be at most 256 characters long.

Arguments

  • OR REPLACE: Replaces any function with the same name if it exists. Cannot appear with IF NOT EXISTS.

  • IF NOT EXISTS: If any function exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.

  • TEMP or TEMPORARY: Creates a temporary function. If the clause is not present, the statement creates a persistent UDF. You can reuse persistent UDFs across multiple queries, whereas you can only use temporary UDFs in a single query, script, or procedure.

  • project_name: For persistent functions, the name of the project where you are creating the function. Defaults to the project that runs the DDL query. Do not include the project name for temporary functions.

  • dataset_name: For persistent functions, the name of the dataset where you are creating the function. Defaults to the defaultDataset in the request. Do not include the dataset name for temporary functions.

  • function_name: The name of the function.

  • named_parameter: A comma-separated param_name and param_type pair. The value of param_type is a BigQuery data type. For a SQL UDF, the value of param_type can also be ANY TYPE.

  • determinism_specifier: Applies only to JavaScript UDFs. Provides a hint to BigQuery as to whether the query result can be cached. Can be one of the following values:

    • DETERMINISTIC: The function always returns the same result when passed the same arguments. The query result is potentially cacheable. For example, if the function add_one(i) always returns i + 1, the function is deterministic.

    • NOT DETERMINISTIC: The function does not always return the same result when passed the same arguments, and therefore is not cacheable. For example, if the function add_random(i) returns i + rand(), the function is not deterministic and BigQuery does not use cached results.

      If all of the invoked functions are DETERMINISTIC, BigQuery tries to cache the result, unless the results can't be cached for other reasons. For more information, see Using cached query results.

  • data_type: The data type that the function returns.

    • If the function is defined in SQL, then the RETURNS clause is optional. If the RETURNS clause is omitted, then BigQuery infers the result type of the function from the SQL function body when a query calls the function.
    • If the function is defined in JavaScript, then the RETURNS clause is required. For more information about allowed values for data_type, see Supported JavaScript UDF data types.
  • sql_expression: The SQL expression that defines the function.

  • function_option_list: A list of options for creating the function. Applies only to JavaScript UDFs.

  • javascript_code: The definition of a JavaScript function. The value is a string literal. If the code includes quotes and backslashes, it must be either escaped or represented as a raw string. For example, the code return "\n"; can be represented as one of the following:

    • Quoted string: "return \"\\n\";". Both quotes and backslashes need to be escaped.
    • Triple quoted string: """return "\\n";""". Backslashes need to be escaped while quotes do not.
    • Raw string: r"""return "\n";""". No escaping is needed.
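Python's escaping rules for quoted, triple-quoted, and raw string literals closely match this example, so the equivalence of the three representations can be checked directly. This is an analogy for illustration, not BigQuery syntax:

```python
# Target JavaScript source: return "\n";  (a literal backslash-n in quotes)
quoted = "return \"\\n\";"    # both quotes and backslashes escaped
triple = """return "\\n";"""  # backslashes escaped, quotes not
raw = r"""return "\n";"""     # no escaping needed

# All three literals encode the identical code string.
print(quoted == triple == raw)  # True
```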

function_option_list

The option list specifies options for creating a UDF. The following options are supported:

NAME VALUE Details
description

STRING

A description of the UDF.
library

ARRAY<STRING>

An array of JavaScript libraries to include in the function definition. Applies only to JavaScript UDFs. For more information, see Including JavaScript libraries.

Example: ["gs://my-bucket/lib1.js", "gs://my-bucket/lib2.js"]

Examples

Create a SQL UDF

The following example creates a persistent SQL UDF named multiplyInputs in a dataset named mydataset.

CREATE FUNCTION mydataset.multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
AS (x * y);

Create a JavaScript UDF

The following example creates a temporary JavaScript UDF named multiplyInputs and calls it from inside a SELECT statement.

CREATE TEMP FUNCTION multiplyInputs(x FLOAT64, y FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS r"""
  return x*y;
""";

SELECT multiplyInputs(a, b) FROM (SELECT 3 as a, 2 as b);

CREATE TABLE FUNCTION statement

Creates a new table function, also called a table-valued function (TVF).

Syntax

CREATE [ OR REPLACE ] TABLE FUNCTION [ IF NOT EXISTS ]
  [[project_name.]dataset_name.]function_name ( [ function_parameter [, ...] ] )
  [RETURNS TABLE < column_declaration [, ...] > ]
  AS sql_query

function_parameter:
  parameter_name { data_type | ANY TYPE }

column_declaration:
  column_name data_type

Arguments

  • OR REPLACE: Replaces any table function with the same name if it exists. Cannot appear with IF NOT EXISTS.
  • IF NOT EXISTS: If any table function exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.
  • project_name: The name of the project where you are creating the function. Defaults to the project that runs this DDL statement.
  • dataset_name: The name of the dataset where you are creating the function.
  • function_name: The name of the function to create.
  • function_parameter: A parameter for the function, specified as a parameter name and a data type. The value of data_type is a scalar BigQuery data type or ANY TYPE.
  • RETURNS TABLE: The schema of the table that the function returns, specified as a comma-separated list of column name and data type pairs. If RETURNS TABLE is absent, BigQuery infers the output schema from the query statement in the function body. If RETURNS TABLE is included, the names in the returned table type must match column names from the SQL query.
  • sql_query: Specifies the SQL query to run. The SQL query must include names for all columns.

Details

BigQuery coerces argument types when possible. For example, if the parameter type is FLOAT64 and you pass an INT64 value, then BigQuery coerces it to a FLOAT64.

If a parameter type is ANY TYPE, the function accepts an input of any type for this argument. The type that you pass to the function must be compatible with the function definition. If you pass an argument with an incompatible type, the query returns an error. If more than one parameter has type ANY TYPE, BigQuery does not enforce any type relationship between them.

Examples

The following table function takes an INT64 parameter that is used to filter the results of a query:

CREATE OR REPLACE TABLE FUNCTION mydataset.names_by_year(y INT64)
AS
  SELECT year, name, SUM(number) AS total
  FROM `bigquery-public-data.usa_names.usa_1910_current`
  WHERE year = y
  GROUP BY year, name

The following example specifies the return TABLE type in the RETURNS clause:

CREATE OR REPLACE TABLE FUNCTION mydataset.names_by_year(y INT64)
RETURNS TABLE<name STRING, year INT64, total INT64>
AS
  SELECT year, name, SUM(number) AS total
  FROM `bigquery-public-data.usa_names.usa_1910_current`
  WHERE year = y
  GROUP BY year, name

CREATE PROCEDURE statement

Creates a new procedure, which is a block of statements that can be called from other queries.

Syntax

CREATE [OR REPLACE] PROCEDURE [IF NOT EXISTS]
  [[project_name.]dataset_name.]procedure_name (procedure_argument[, ...])
  [OPTIONS(procedure_option_list)]
BEGIN
  statement_list
END;

procedure_argument:
  [procedure_argument_mode] argument_name argument_type

procedure_argument_mode:
  IN | OUT | INOUT

Arguments

  • OR REPLACE: Replaces any procedure with the same name if it exists. Cannot appear with IF NOT EXISTS.

  • IF NOT EXISTS: If any procedure exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.

  • project_name: The name of the project where you are creating the procedure. Defaults to the project that runs this DDL query. If the project name contains special characters such as colons, quote it in backticks (example: `google.com:my_project`).

  • dataset_name: The name of the dataset where you are creating the procedure. Defaults to the defaultDataset in the request.

  • procedure_name: The name of the procedure to create.

  • statement_list: A BigQuery statement list. A statement list is a series of statements that each end with a semicolon. Procedures can call themselves recursively.

  • argument_type: Any valid BigQuery type.

  • procedure_argument_mode: Specifies whether an argument is an input, an output, or both.

procedure_option_list

The procedure_option_list lets you specify procedure options. Procedure options have the same syntax and requirements as table options but with a different list of NAMEs and VALUEs:

NAME VALUE Details
strict_mode

BOOL

Example: strict_mode=FALSE

If strict_mode is TRUE, the procedure body will undergo additional checks for errors such as non-existent tables or columns. The CREATE PROCEDURE statement will fail if the body fails any of these checks.

While strict_mode is useful for catching many common types of errors, it is not exhaustive, and successful creation of a procedure with strict_mode does not guarantee that the procedure will successfully execute at runtime.

If strict_mode is FALSE, the procedure body is checked only for syntax. Procedures which invoke themselves recursively should be created with strict_mode=FALSE to avoid errors caused by the procedure not yet existing while it is being validated.

Default value is TRUE.

Argument mode

IN indicates that the argument is only an input to the procedure. You can specify either a variable or a value expression for IN arguments.

OUT indicates that the argument is an output of the procedure. An OUT argument is initialized to NULL when the procedure starts. You must specify a variable for OUT arguments.

INOUT indicates that the argument is both an input to and an output from the procedure. You must specify a variable for INOUT arguments. An INOUT argument can be referenced in the body of a procedure as a variable and assigned new values.

If neither IN, OUT, nor INOUT is specified, the argument is treated as an IN argument.

Variable scope

If a variable is declared outside a procedure, passed as an INOUT or OUT argument to a procedure, and the procedure assigns a new value to that variable, that new value is visible outside of the procedure.

Variables declared in a procedure are not visible outside of the procedure, and vice versa.

An OUT or INOUT argument can be assigned a value using SET, in which case the modified value is visible outside of the procedure. If the procedure exits successfully, then the value of the OUT or INOUT argument is the final value assigned to it.

Temporary tables exist for the duration of the script, so if a procedure creates a temporary table, the caller of the procedure will be able to reference the temporary table as well.

Default project in procedure body

Procedure bodies can reference entities without specifying the project; the default project is the project which owns the procedure, not necessarily the project used to run the CREATE PROCEDURE statement. Consider the sample query below.

CREATE PROCEDURE myProject.myDataset.QueryTable()
BEGIN
  SELECT * FROM anotherDataset.myTable;
END;

After creating the above procedure, you can run the query CALL myProject.myDataset.QueryTable(). Regardless of the project you choose to run this CALL query, the referenced table anotherDataset.myTable is always resolved against project myProject.

Examples

The following example creates a procedure that both takes x as an input argument and returns x as output; because no argument mode is present for the argument delta, it is an input argument. The procedure consists of a block containing a single statement, which assigns the sum of the two input arguments to x.

CREATE PROCEDURE mydataset.AddDelta(INOUT x INT64, delta INT64)
BEGIN
  SET x = x + delta;
END;

The following example calls the AddDelta procedure from the example above, passing it the variable accumulator both times; because the changes to x within AddDelta are visible outside of AddDelta, these procedure calls increment accumulator by a total of 8.

DECLARE accumulator INT64 DEFAULT 0;
CALL mydataset.AddDelta(accumulator, 5);
CALL mydataset.AddDelta(accumulator, 3);
SELECT accumulator;

This returns the following:

+-------------+
| accumulator |
+-------------+
|           8 |
+-------------+

The following example creates the procedure SelectFromTablesAndAppend, which takes target_date as an input argument and returns rows_added as an output. The procedure creates a temporary table DataForTargetDate from a query; then, it calculates the number of rows in DataForTargetDate and assigns the result to rows_added. Next, it inserts a new row into TargetTable, passing the value of target_date as one of the column names. Finally, it drops the table DataForTargetDate and returns rows_added.

CREATE PROCEDURE mydataset.SelectFromTablesAndAppend(
  target_date DATE, OUT rows_added INT64)
BEGIN
  CREATE TEMP TABLE DataForTargetDate AS
  SELECT t1.id, t1.x, t2.y
  FROM dataset.partitioned_table1 AS t1
  JOIN dataset.partitioned_table2 AS t2
  ON t1.id = t2.id
  WHERE t1.date = target_date
    AND t2.date = target_date;

  SET rows_added = (SELECT COUNT(*) FROM DataForTargetDate);

  SELECT id, x, y, target_date  -- note that target_date is a parameter
  FROM DataForTargetDate;

  DROP TABLE DataForTargetDate;
END;

The following example declares a variable rows_added, then passes it as an argument to the SelectFromTablesAndAppend procedure from the previous example, along with the value of CURRENT_DATE; then it returns a message stating how many rows were added.

DECLARE rows_added INT64;
CALL mydataset.SelectFromTablesAndAppend(CURRENT_DATE(), rows_added);
SELECT FORMAT('Added %d rows', rows_added);

CREATE ROW ACCESS POLICY statement

Creates a new row-level access policy. Row-level access policies on a table must have unique names.

Syntax

CREATE [ OR REPLACE ] ROW ACCESS POLICY [ IF NOT EXISTS ] row_access_policy_name
ON table_name
[GRANT TO (grantee_list)]
FILTER USING (filter_expression);

Arguments

  • OR REPLACE: Replaces any row-level access policy with the same name if it exists. Cannot appear with IF NOT EXISTS.

  • IF NOT EXISTS: If any row-level access policy exists with the same name, the CREATE statement has no effect. Cannot appear with OR REPLACE.

  • row_access_policy_name: The name of the row-level access policy that you are creating. The row-level access policy name must be unique for each table. The row-level access policy name can contain the following:

    • Up to 256 characters.
    • Letters (upper or lowercase), numbers, and underscores. Must start with a letter.
  • table_name: The name of the table that you want to create a row-level access policy for. The table must already exist.

  • GRANT TO grantee_list: An optional clause that specifies the initial members that the row-level access policy should be created with.

    grantee_list is a list of iam_member users or groups. Strings must be valid IAM principals, or members, following the format of an IAM Policy Binding member, and must be quoted. The following types are supported:

    grantee_list types
    user:{emailid}

    An email address that represents a specific Google account.

    Example: user:alice@example.com

    serviceAccount:{emailid}

    An email address that represents a service account.

    Example: serviceAccount:my-other-app@appspot.gserviceaccount.com

    group:{emailid}

    An email address that represents a Google group.

    Example: group:admins@example.com

    domain:{domain}

    The Google Workspace domain (primary) that represents all the users of that domain.

    Example: domain:example.com

    allAuthenticatedUsers

    A special identifier that represents all service accounts and all users on the internet who have authenticated with a Google Account. This identifier includes accounts that aren't connected to a Google Workspace or Cloud Identity domain, such as personal Gmail accounts. Users who aren't authenticated, such as anonymous visitors, aren't included.

    allUsers

    A special identifier that represents anyone who is on the internet, including authenticated and unauthenticated users. Because BigQuery requires authentication before a user can access the service, allUsers includes only authenticated users.

    You can combine a series of iam_member values, if they are comma-separated and quoted separately. For example: "user:alice@example.com","group:admins@example.com","user:sales@example.com"

  • filter_expression: Defines the subset of table rows to show only to the members of the grantee_list. The filter_expression is similar to the WHERE clause in a SELECT query.

    The following functions are valid to use in the filter expression:

    • BigQuery standard SQL scalar functions, aggregate functions, analytic functions.
    • SESSION_USER(), to restrict access only to rows that belong to the user running the query. If none of the row-level access policies are applicable to the querying user, then the user has no access to the data in the table.

    The filter expression cannot contain the following:

    • A reference to a table.
    • Subqueries, or SQL statements such as SELECT, CREATE, or UPDATE.
    • User-defined functions.

Examples

Creating a row access policy, and then modifying the grantees later

CREATE ROW ACCESS POLICY My_apac_filter
ON project.dataset.My_table
GRANT TO ("user:abc@example.com")
FILTER USING (region = "apac");

CREATE OR REPLACE ROW ACCESS POLICY My_apac_filter
ON project.dataset.My_table
GRANT TO ("user:xyz@example.com")
FILTER USING (region = "apac");

Creating a row access policy with multiple grantees

CREATE ROW ACCESS POLICY My_us_filter
ON project.dataset.My_table
GRANT TO ("user:john@example.com", "group:sales-us@example.com", "group:sales-managers@example.com")
FILTER USING (region = "us");

Creating a row access policy with allAuthenticatedUsers as the grantees

CREATE ROW ACCESS POLICY My_us_filter
ON project.dataset.My_table
GRANT TO ("allAuthenticatedUsers")
FILTER USING (region = "us");

Creating a row access policy with a filter based on the current user

CREATE ROW ACCESS POLICY My_row_filter
ON dataset.My_table
GRANT TO ("domain:example.com")
FILTER USING (email = SESSION_USER());

Creating a row access policy with a filter on a column with an ARRAY type

CREATE ROW ACCESS POLICY My_reports_filter
ON project.dataset.My_table
GRANT TO ("domain:example.com")
FILTER USING (SESSION_USER() IN UNNEST(reporting_chain));

ALTER SCHEMA SET OPTIONS statement

Sets options on a dataset.

The statement runs in the location of the dataset if the dataset exists, unless you specify the location in the query settings. For more information, see Specifying your location.

Syntax

ALTER SCHEMA [IF EXISTS] [project_name.]dataset_name
SET OPTIONS(schema_set_options_list)

Arguments

  • IF EXISTS: If no dataset exists with that name, the statement has no effect.

  • project_name: The name of the project that contains the dataset. Defaults to the project that runs this DDL statement.

  • dataset_name: The name of the dataset.

  • schema_set_options_list: The list of options to set.

schema_set_options_list

The option list specifies options for the dataset. Specify the options in the following format: NAME=VALUE, ...

The following options are supported:

NAME VALUE Details
default_kms_key_name STRING Specifies the default Cloud KMS key for encrypting table data in this dataset. You can override this value when you create a table.
default_partition_expiration_days FLOAT64 Specifies the default expiration time, in days, for table partitions in this dataset. You can override this value when you create a table.
default_table_expiration_days FLOAT64 Specifies the default expiration time, in days, for tables in this dataset. You can override this value when you create a table.
description STRING The description of the dataset.
friendly_name STRING A descriptive name for the dataset.
labels ARRAY<STRUCT<STRING, STRING>> An array of labels for the dataset, expressed as key-value pairs.
location STRING The location in which to create the dataset. If you don't specify this option, the dataset is created in the location where the query runs. If you specify this option and also explicitly set the location for the query job, the two values must match; otherwise the query fails.

Example

The following example sets the default table expiration.

ALTER SCHEMA mydataset
SET OPTIONS(
  default_table_expiration_days = 3.75
)

ALTER TABLE SET OPTIONS statement

Sets the options on a table.

Syntax

ALTER TABLE [IF EXISTS] table_name SET OPTIONS(table_set_options_list)        

Arguments

  • IF EXISTS: If no table exists with that name, the statement has no effect.

  • table_name: The name of the table to alter. See Table path syntax.

  • table_set_options_list: The list of options to set.

Details

This statement is not supported for external tables.

table_set_options_list

The option list allows you to set table options such as a label and an expiration time. You can include multiple options using a comma-separated list.

Specify a table option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:

NAME VALUE Details
expiration_timestamp TIMESTAMP

Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC"

This property is equivalent to the expirationTime table resource property.

partition_expiration_days

FLOAT64

Example: partition_expiration_days=7

Sets the partition expiration in days. For more information, see Set the partition expiration. By default, partitions do not expire.

This property is equivalent to the timePartitioning.expirationMs table resource property but uses days instead of milliseconds. One day is equivalent to 86400000 milliseconds, or 24 hours.

This property can only be set if the table is partitioned.
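The day-to-millisecond conversion between this DDL option and the underlying resource property is simple to state; a minimal sketch (illustrative only, the helper name is hypothetical and this is not a client-library call):

```python
MS_PER_DAY = 86_400_000  # 24 h * 60 min * 60 s * 1000 ms

def partition_expiration_ms(days: float) -> int:
    """Convert a partition_expiration_days value to the equivalent
    timePartitioning.expirationMs resource value."""
    return int(days * MS_PER_DAY)

print(partition_expiration_ms(7))  # 604800000
```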

require_partition_filter

BOOL

Example: require_partition_filter=true

Specifies whether queries on this table must include a predicate filter that filters on the partitioning column. For more information, see Set partition filter requirements. The default value is false.

This property is equivalent to the timePartitioning.requirePartitionFilter table resource property.

This property can only be set if the table is partitioned.

kms_key_name

STRING

Example: kms_key_name="projects/project_id/locations/location/keyRings/keyring/cryptoKeys/key"

This property is equivalent to the encryptionConfiguration.kmsKeyName table resource property.

See more details about Protecting data with Cloud KMS keys.

friendly_name

STRING

Example: friendly_name="my_table"

This property is equivalent to the friendlyName table resource property.

description

STRING

Example: description="a table that expires in 2025"

This property is equivalent to the description table resource property.

labels

ARRAY<STRUCT<STRING, STRING>>

Example: labels=[("org_unit", "development")]

This property is equivalent to the labels table resource property.

VALUE is a constant expression containing only literals, query parameters, and scalar functions. If the constant expression evaluates to null, the corresponding option NAME is ignored.

The constant expression cannot contain:

  • A reference to a table
  • Subqueries or SQL statements such as SELECT, CREATE, and UPDATE
  • User-defined functions, aggregate functions, or analytic functions
  • The following scalar functions:
    • ARRAY_TO_STRING
    • REPLACE
    • REGEXP_REPLACE
    • RAND
    • FORMAT
    • LPAD
    • RPAD
    • REPEAT
    • SESSION_USER
    • GENERATE_ARRAY
    • GENERATE_DATE_ARRAY

Setting the VALUE replaces the existing value of that option for the table, if there was one. Setting the VALUE to NULL clears the table's value for that option.

Examples

Setting the expiration timestamp and description on a table

The following example sets the expiration timestamp on a table to seven days from the execution time of the ALTER TABLE statement, and sets the description as well:

ALTER TABLE mydataset.mytable
SET OPTIONS (
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY),
  description="Table that expires seven days from now"
)

Setting the require partition filter attribute on a partitioned table

The following example sets the timePartitioning.requirePartitionFilter attribute on a partitioned table:

ALTER TABLE mydataset.mypartitionedtable SET OPTIONS (require_partition_filter=true)        

Queries that reference this table must use a filter on the partitioning column, or else BigQuery returns an error. Setting this option to true can help prevent mistakes in querying more data than intended.

Clearing the expiration timestamp on a table

The following example clears the expiration timestamp on a table so that it will not expire:

ALTER TABLE mydataset.mytable SET OPTIONS (expiration_timestamp=NULL)        

ALTER TABLE ADD COLUMN statement

Adds one or more new columns to an existing table schema.

Syntax

          ALTER TABLE table_name ADD COLUMN [IF NOT EXISTS] column_name column_schema [, ...]                  

Arguments

  • table_name: The name of the table. See Table path syntax.

  • IF NOT EXISTS: If the column name already exists, the statement has no effect.

  • column_name: The name of the column to add.

  • column_schema: The schema of the column. This schema uses the same syntax as the column schema for the CREATE TABLE statement.

Details

You cannot use this statement to create:

  • Partitioned columns.
  • Clustered columns.
  • Nested columns inside existing RECORD fields.

You cannot add a REQUIRED column to an existing table schema. However, you can create a nested REQUIRED column as part of a new RECORD field.

This statement is not supported for external tables.

Without the IF NOT EXISTS clause, if the table already contains a column with that name, the statement returns an error. If the IF NOT EXISTS clause is included and the column name already exists, no error is returned, and no action is taken.

The value of the new column for existing rows is set to one of the following:

  • NULL if the new column was added with NULLABLE mode. This is the default mode.
  • An empty ARRAY if the new column was added with REPEATED mode.

For more information about schema modifications in BigQuery, see Modifying table schemas.

Examples

Adding columns

The following example adds the following columns to an existing table named mytable:

  • Column A of type STRING.
  • Column B of type GEOGRAPHY.
  • Column C of type NUMERIC with REPEATED mode.
  • Column D of type DATE with a description.
ALTER TABLE mydataset.mytable
  ADD COLUMN A STRING,
  ADD COLUMN IF NOT EXISTS B GEOGRAPHY,
  ADD COLUMN C ARRAY<NUMERIC>,
  ADD COLUMN D DATE OPTIONS(description="my description")

If any of the columns named A, C, or D already exist, the statement fails. If column B already exists, the statement succeeds because of the IF NOT EXISTS clause.

Adding a RECORD column

The following example adds a column named A of type STRUCT that contains the following nested columns:

  • Column B of type GEOGRAPHY.
  • Column C of type INT64 with REPEATED mode.
  • Column D of type INT64 with REQUIRED mode.
  • Column E of type TIMESTAMP with a description.
ALTER TABLE mydataset.mytable
  ADD COLUMN A STRUCT<
    B GEOGRAPHY,
    C ARRAY<INT64>,
    D INT64 NOT NULL,
    E TIMESTAMP OPTIONS(description="creation time")
  >

The query fails if the table already has a column named A, even if that column does not contain any of the nested columns that are specified.

The new STRUCT named A is nullable, but the nested column D within A is required for any STRUCT values of A.

ALTER TABLE RENAME TO statement

Renames a table.

Syntax

ALTER TABLE [IF EXISTS] table_name RENAME TO new_table_name        

Arguments

  • IF EXISTS: If no table exists with that name, the statement has no effect.

  • table_name: The name of the table to rename. See Table path syntax.

  • new_table_name: The new name of the table. The new name cannot be an existing table name.

Details

  • This statement is not supported for external tables.
  • If you change table policies or row-level access policies when you rename the table, then those changes might not be effective.
  • If you want to rename a table that has data streaming into it, you must stop the streaming and wait for BigQuery to indicate that streaming is not in use.

Examples

Renaming a table

The following example renames the table mydataset.mytable to mydataset.mynewtable:

ALTER TABLE mydataset.mytable RENAME TO mynewtable        
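If you are not sure the table still exists, the IF EXISTS clause from the syntax above turns the rename into a no-op instead of an error. A minimal sketch, assuming the same dataset:

```sql
-- No error if mydataset.mytable was already renamed or never existed.
ALTER TABLE IF EXISTS mydataset.mytable RENAME TO mynewtable
```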

ALTER TABLE DROP COLUMN statement

Drops one or more columns from an existing table schema.

Syntax

          ALTER TABLE table_name DROP COLUMN [IF EXISTS] column_name [, ...]

Arguments

  • table_name: The name of the table to alter. See Table path syntax. The table must already exist and have a schema.

  • IF EXISTS: If the specified column does not exist, the statement has no effect.

  • column_name: The name of the column to drop.

Details

The statement does not immediately free up the storage that's associated with the dropped column. Storage is reclaimed in the background over a period of 7 days from the day that a column is dropped.

For information about immediately reclaiming storage, see Deleting a column from a table schema.

You cannot use this statement to drop the following:

  • Partitioned columns
  • Clustered columns
  • Nested columns inside existing RECORD fields

This statement is not supported for external tables.

Without the IF EXISTS clause, if the table does not contain a column with that name, then the statement returns an error. If the IF EXISTS clause is included and the column name does not exist, then no error is returned, and no action is taken.

This statement only removes the column from the table. Any objects that refer to the column, such as views or materialized views, must be updated or recreated separately.

For more information about schema modifications in BigQuery, see Modifying table schemas.

Examples

Dropping columns

The following example drops the following columns from an existing table named mytable:

  • Column A
  • Column B
          ALTER TABLE mydataset.mytable
            DROP COLUMN A,
            DROP COLUMN IF EXISTS B

If the column named A does not exist, then the statement fails. If column B does not exist, then the statement still succeeds because of the IF EXISTS clause.

ALTER COLUMN SET OPTIONS statement

Sets options, such as the column description, on a column in a table in BigQuery.

Syntax

ALTER TABLE [IF EXISTS] table_name ALTER COLUMN [IF EXISTS] column_name SET OPTIONS(column_set_options_list)        

Arguments

  • (ALTER TABLE) IF EXISTS: If no table exists with that name, the statement has no effect.

  • table_name: The name of the table to alter. See Table path syntax.

  • (ALTER COLUMN) IF EXISTS: If the specified column does not exist, the statement has no effect.

  • column_name: The name of the top level column you're altering. Modifying subfields, such as nested columns in a STRUCT, is not supported.

  • column_set_options_list: The list of options to set on the column.

Details

This statement is not supported for external tables.

column_set_options_list

Specify a column option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:

NAME VALUE Details
description

STRING

Example: description="a table that expires in 2025"

VALUE is a constant expression containing only literals, query parameters, and scalar functions. If the constant expression evaluates to null, the corresponding option NAME is removed.

The constant expression cannot contain:

  • A reference to a table
  • Subqueries or SQL statements such as SELECT, CREATE, and UPDATE
  • User-defined functions, aggregate functions, or analytic functions
  • The following scalar functions:
    • ARRAY_TO_STRING
    • REPLACE
    • REGEXP_REPLACE
    • RAND
    • FORMAT
    • LPAD
    • RPAD
    • REPEAT
    • SESSION_USER
    • GENERATE_ARRAY
    • GENERATE_DATE_ARRAY

Setting the VALUE replaces the existing value of that option for the column, if there was one. Setting the VALUE to NULL clears the column's value for that option.
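The clearing behavior described above can be sketched as follows: setting description to NULL removes any existing description from the column.

```sql
-- Clears the description option on the price column.
ALTER TABLE mydataset.mytable
ALTER COLUMN price
SET OPTIONS (
  description=NULL
)
```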

Examples

The following example sets a new description on a column called price:

ALTER TABLE mydataset.mytable
ALTER COLUMN price
SET OPTIONS (
  description="Price per unit"
)

ALTER COLUMN DROP NOT NULL statement

Removes a NOT NULL constraint from a column in a table in BigQuery.

Syntax

ALTER TABLE [IF EXISTS] table_name ALTER COLUMN [IF EXISTS] column_name DROP NOT NULL

Arguments

  • (ALTER TABLE) IF EXISTS: If no table exists with that name, the statement has no effect.

  • table_name: The name of the table to alter. See Table path syntax.

  • (ALTER COLUMN) IF EXISTS: If the specified column does not exist, the statement has no effect.

  • column_name: The name of the top level column you're altering. Modifying subfields is not supported.

Details

If the column does not have a NOT NULL constraint, the statement returns an error.

This statement is not supported for external tables.

Examples

The following example removes the NOT NULL constraint from a column called mycolumn:

ALTER TABLE mydataset.mytable ALTER COLUMN mycolumn DROP NOT NULL        

ALTER COLUMN SET DATA TYPE statement

Changes the data type of a column in a table in BigQuery to a less restrictive data type. For example, a NUMERIC data type can be changed to a BIGNUMERIC type but not the reverse.

Syntax

ALTER TABLE [IF EXISTS] table_name ALTER COLUMN [IF EXISTS] column_name SET DATA TYPE data_type        

Arguments

  • (ALTER TABLE) IF EXISTS: If no table exists with that name, the statement has no effect.

  • table_name: The name of the table to alter. See Table path syntax.

  • column_name: The name of the top level column you're altering. Modifying subfields is not supported.

  • data_type: The type that you're converting the column to.

Details

For a table of valid data type coercions, compare the "From Type" column to the "Coercion To" column in the Conversion rules in Standard SQL page.

The following are examples of valid data type coercions:

  • INT64 to NUMERIC, BIGNUMERIC, FLOAT64
  • NUMERIC to BIGNUMERIC, FLOAT64

This statement is not supported for external tables.

Without the IF EXISTS clause, if the table does not contain a column with that name, the statement returns an error. If the IF EXISTS clause is included and the column name does not exist, no error is returned, and no action is taken.

You can also coerce data types from more restrictive to less restrictive parameterized data types. For example, you can increase the maximum length of a string type or increase the precision or scale of a numeric type.

The following are examples of valid parameterized data type changes:

  • NUMERIC(10, 6) to NUMERIC(12, 8)
  • NUMERIC to BIGNUMERIC(40, 20)
  • STRING(5) to STRING(7)

Examples

The following example changes the data type of column c1 from an INT64 to NUMERIC:

CREATE TABLE dataset.table(c1 INT64);

ALTER TABLE dataset.table ALTER COLUMN c1 SET DATA TYPE NUMERIC;

The following example changes the data type of one of the fields in the s1 column:

CREATE TABLE dataset.table(s1 STRUCT<a INT64, b STRING>);

ALTER TABLE dataset.table ALTER COLUMN s1 SET DATA TYPE STRUCT<a NUMERIC, b STRING>;

The following example changes the precision of a parameterized data type column:

CREATE TABLE dataset.table (pt NUMERIC(7,2));

ALTER TABLE dataset.table ALTER COLUMN pt SET DATA TYPE NUMERIC(8,2);

ALTER VIEW SET OPTIONS statement

Sets the options on a view.

Syntax

ALTER VIEW [IF EXISTS] view_name SET OPTIONS(view_set_options_list)        

Arguments

  • IF EXISTS: If no view exists with that name, the statement has no effect.

  • view_name: The name of the view to alter. See Table path syntax.

  • view_set_options_list: The list of options to set.

view_set_options_list

The option list allows you to set view options such as a label and an expiration time. You can include multiple options using a comma-separated list.

Specify a view option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:

NAME VALUE Details
expiration_timestamp TIMESTAMP

Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC"

This property is equivalent to the expirationTime table resource property.

friendly_name

STRING

Example: friendly_name="my_view"

This property is equivalent to the friendlyName table resource property.

description

STRING

Example: description="a view that expires in 2025"

This property is equivalent to the description table resource property.

labels

ARRAY<STRUCT<STRING, STRING>>

Example: labels=[("org_unit", "development")]

This property is equivalent to the labels table resource property.

VALUE is a constant expression containing only literals, query parameters, and scalar functions. If the constant expression evaluates to null, the corresponding option NAME is removed.

The constant expression cannot contain:

  • A reference to a table
  • Subqueries or SQL statements such as SELECT, CREATE, and UPDATE
  • User-defined functions, aggregate functions, or analytic functions
  • The following scalar functions:
    • ARRAY_TO_STRING
    • REPLACE
    • REGEXP_REPLACE
    • RAND
    • FORMAT
    • LPAD
    • RPAD
    • REPEAT
    • SESSION_USER
    • GENERATE_ARRAY
    • GENERATE_DATE_ARRAY

Setting the VALUE replaces the existing value of that option for the view, if there was one. Setting the VALUE to NULL clears the view's value for that option.
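For example, a sketch of clearing a previously set expiration so the view no longer expires:

```sql
-- Setting expiration_timestamp to NULL clears the view's expiration.
ALTER VIEW mydataset.myview
SET OPTIONS (
  expiration_timestamp=NULL
)
```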

Examples

Setting the expiration timestamp and description on a view

The following example sets the expiration timestamp on a view to seven days from the execution time of the ALTER VIEW statement, and sets the description as well:

ALTER VIEW mydataset.myview
SET OPTIONS (
  expiration_timestamp=TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 7 DAY),
  description="View that expires seven days from now"
)

ALTER MATERIALIZED VIEW SET OPTIONS statement

Sets the options on a materialized view.

Syntax

ALTER MATERIALIZED VIEW [IF EXISTS] materialized_view_name SET OPTIONS(materialized_view_set_options_list)        

Arguments

  • IF EXISTS: If no materialized view exists with that name, the statement has no effect.

  • materialized_view_name: The name of the materialized view to alter. See Table path syntax.

  • materialized_view_set_options_list: The list of options to set.

materialized_view_set_options_list

The option list allows you to set materialized view options such as whether refresh is enabled, the refresh interval, a label, and an expiration time. You can include multiple options using a comma-separated list.

Specify a materialized view option list in the following format:

NAME=VALUE, ...

NAME and VALUE must be one of the following combinations:

NAME VALUE Details
enable_refresh BOOLEAN

Example: enable_refresh=false

refresh_interval_minutes FLOAT64

Example: refresh_interval_minutes=20

expiration_timestamp TIMESTAMP

Example: expiration_timestamp=TIMESTAMP "2025-01-01 00:00:00 UTC"

This property is equivalent to the expirationTime table resource property.

friendly_name

STRING

Example: friendly_name="my_mv"

This property is equivalent to the friendlyName table resource property.

description

STRING

Example: description="a materialized view that expires in 2025"

This property is equivalent to the description table resource property.

labels

ARRAY<STRUCT<STRING, STRING>>

Example: labels=[("org_unit", "development")]

This property is equivalent to the labels table resource property.

Setting the VALUE replaces the existing value of that option for the materialized view, if there was one. Setting the VALUE to NULL clears the materialized view's value for that option.
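As with views, setting an option to NULL clears it. A minimal sketch that pauses automatic refresh and clears any expiration on a materialized view:

```sql
ALTER MATERIALIZED VIEW mydataset.my_mv
SET OPTIONS (
  enable_refresh=false,        -- pause automatic refresh
  expiration_timestamp=NULL    -- clear any expiration
)
```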

Examples

Setting the enable refresh state and refresh interval on a materialized view

The following example enables refresh and sets the refresh interval to 20 minutes on a materialized view:

ALTER MATERIALIZED VIEW mydataset.my_mv
SET OPTIONS (
  enable_refresh=true,
  refresh_interval_minutes=20
)

DROP SCHEMA statement

Deletes a dataset.

Syntax

          DROP SCHEMA [IF EXISTS] [project_name.]dataset_name [ CASCADE | RESTRICT ]                  

Arguments

  • IF EXISTS: If no dataset exists with that name, the statement has no effect.

  • project_name: The name of the project that contains the dataset. Defaults to the project that runs this DDL statement.

  • dataset_name: The name of the dataset to delete.

  • CASCADE: Deletes the dataset and all resources within the dataset, such as tables, views, and functions. You must have permission to delete the resources, or else the statement returns an error. For a list of BigQuery permissions, see Predefined roles and permissions.

  • RESTRICT: Deletes the dataset only if it's empty. Otherwise, returns an error. If you don't specify either CASCADE or RESTRICT, then the default behavior is RESTRICT.

Details

The statement runs in the location of the dataset if it exists, unless you specify the location in the query settings. For more information, see Specifying your location.

Examples

The following example deletes the dataset named mydataset. If the dataset does not exist or is not empty, then the statement returns an error.

          DROP SCHEMA mydataset                  

The following example drops the dataset named mydataset and any resources in that dataset. If the dataset does not exist, then no error is returned.

          DROP SCHEMA IF EXISTS mydataset CASCADE                  

DROP TABLE statement

Deletes a table.

Syntax

          DROP TABLE [IF EXISTS] table_name                  

Arguments

  • IF EXISTS: If no table exists with that name, the statement has no effect.

  • table_name: The name of the table to delete. See Table path syntax.

Examples

Deleting a table

The following example deletes a table named mytable in mydataset:

DROP TABLE mydataset.mytable        

If the table name does not exist in the dataset, the following error is returned:

Error: Not found: Table myproject:mydataset.mytable

Deleting a table only if the table exists

The following example deletes a table named mytable in mydataset only if the table exists. If the table name does not exist in the dataset, no error is returned, and no action is taken.

DROP TABLE IF EXISTS mydataset.mytable        

DROP SNAPSHOT TABLE statement

Deletes a table snapshot.

Syntax

          DROP SNAPSHOT TABLE [IF EXISTS] table_snapshot_name                  

Arguments

  • IF EXISTS: If no table snapshot exists with that name, the statement has no effect.

  • table_snapshot_name: The name of the table snapshot to delete. See Table path syntax.

Examples

Delete a table snapshot: fail if it doesn't exist

The following example deletes the table snapshot named mytablesnapshot in the mydataset dataset:

DROP SNAPSHOT TABLE mydataset.mytablesnapshot        

If the table snapshot does not exist in the dataset, then the following error is returned:

Error: Not found: Table snapshot myproject:mydataset.mytablesnapshot

Delete a table snapshot: ignore if it doesn't exist

The following example deletes the table snapshot named mytablesnapshot in the mydataset dataset.

DROP SNAPSHOT TABLE IF EXISTS mydataset.mytablesnapshot        

If the table snapshot doesn't exist in the dataset, then no action is taken, and no error is returned.

For information about creating table snapshots, see CREATE SNAPSHOT TABLE.

For information about restoring table snapshots, see CREATE TABLE CLONE.

DROP EXTERNAL TABLE statement

Deletes an external table.

Syntax

          DROP EXTERNAL TABLE [IF EXISTS] table_name                  

Arguments

  • IF EXISTS: If no external table exists with that name, the statement has no effect.

  • table_name: The name of the external table to delete. See Table path syntax.

Details

If table_name exists but is not an external table, the statement returns the following error:

Cannot drop table_name which has type TYPE. An external table was expected.

The DROP EXTERNAL TABLE statement only removes the external table definition from BigQuery. The data stored in the external location is not affected.

Examples

The following example drops the external table named external_table from the dataset mydataset. It returns an error if the external table does not exist.

          DROP EXTERNAL TABLE mydataset.external_table                  

The following example drops the external table named external_table from the dataset mydataset. If the external table does not exist, no error is returned.

          DROP EXTERNAL TABLE IF EXISTS mydataset.external_table                  

DROP VIEW statement

Deletes a view.

Syntax

          DROP VIEW [IF EXISTS] view_name                  

Arguments

  • IF EXISTS: If no view exists with that name, the statement has no effect.

  • view_name: The name of the view to delete. See Table path syntax.

Examples

Deleting a view

The following example deletes a view named myview in mydataset:

DROP VIEW mydataset.myview        

If the view name does not exist in the dataset, the following error is returned:

Error: Not found: Table myproject:mydataset.myview

Deleting a view only if the view exists

The following example deletes a view named myview in mydataset only if the view exists. If the view name does not exist in the dataset, no error is returned, and no action is taken.

DROP VIEW IF EXISTS mydataset.myview        

DROP MATERIALIZED VIEW statement

Deletes a materialized view.

Syntax

          DROP MATERIALIZED VIEW [IF EXISTS] mv_name                  

Arguments

  • IF EXISTS: If no materialized view exists with that name, the statement has no effect.

  • mv_name: The name of the materialized view to delete. See Table path syntax.

Examples

Deleting a materialized view

The following example deletes a materialized view named my_mv in mydataset:

DROP MATERIALIZED VIEW mydataset.my_mv

If the materialized view name does not exist in the dataset, the following error is returned:

Error: Not found: Table myproject:mydataset.my_mv

If you are deleting a materialized view in another project, you must specify the project, dataset, and materialized view in the following format: `project_id.dataset.materialized_view` (including the backticks if project_id contains special characters); for example, `myproject.mydataset.my_mv`.

Deleting a materialized view only if it exists

The following example deletes a materialized view named my_mv in mydataset only if the materialized view exists. If the materialized view name does not exist in the dataset, no error is returned, and no action is taken.

DROP MATERIALIZED VIEW IF EXISTS mydataset.my_mv

If you are deleting a materialized view in another project, you must specify the project, dataset, and materialized view in the following format: `project_id.dataset.materialized_view` (including the backticks if project_id contains special characters); for example, `myproject.mydataset.my_mv`.

DROP FUNCTION statement

Deletes a persistent user-defined function (UDF).

Syntax

          DROP FUNCTION [IF EXISTS] [[project_name.]dataset_name.]function_name                  

Arguments

  • IF EXISTS: If no function exists with that name, the statement has no effect.

  • project_name: The name of the project containing the function to delete. Defaults to the project that runs this DDL query. If the project name contains special characters such as colons, it should be quoted in backticks ` (example: `google.com:my_project`).

  • dataset_name: The name of the dataset containing the function to delete. Defaults to the defaultDataset in the request.

  • function_name: The name of the function you're deleting.

Examples

The following example statement deletes the function parseJsonAsStruct contained in the dataset mydataset.

          DROP FUNCTION mydataset.parseJsonAsStruct;                  

The following example statement deletes the function parseJsonAsStruct from the dataset sample_dataset in the project other_project.

          DROP FUNCTION `other_project`.sample_dataset.parseJsonAsStruct;                  

DROP TABLE FUNCTION

Deletes a table function.

Syntax

DROP TABLE FUNCTION [IF EXISTS] [[project_name.]dataset_name.]function_name        

Arguments

  • IF EXISTS: If no table function exists with this name, the statement has no effect.

  • project_name: The name of the project containing the table function to delete. Defaults to the project that runs this DDL query.

  • dataset_name: The name of the dataset containing the table function to delete.

  • function_name: The name of the table function to delete.

Example

The following example deletes a table function named my_table_function:

          DROP TABLE FUNCTION mydataset.my_table_function;                  

DROP PROCEDURE statement

Deletes a stored procedure.

Syntax

          DROP PROCEDURE [IF EXISTS] [[project_name.]dataset_name.]procedure_name                  

Arguments

  • IF EXISTS: If no procedure exists with that name, the statement has no effect.

  • project_name: The name of the project containing the procedure to delete. Defaults to the project that runs this DDL query. If the project name contains special characters such as colons, it should be quoted in backticks ` (example: `google.com:my_project`).

  • dataset_name: The name of the dataset containing the procedure to delete. Defaults to the defaultDataset in the request.

  • procedure_name: The name of the procedure you're deleting.

Examples

The following example statement deletes the procedure myProcedure contained in the dataset mydataset.

          DROP PROCEDURE mydataset.myProcedure;

The following example statement deletes the procedure myProcedure from the dataset sample_dataset in the project other-project.

          DROP PROCEDURE `other-project`.sample_dataset.myProcedure;

DROP ROW ACCESS POLICY statement

Deletes a row-level access policy.

Syntax

          DROP ROW ACCESS POLICY [IF EXISTS] row_access_policy_name ON table_name;
          DROP ALL ROW ACCESS POLICIES ON table_name;

Arguments

  • IF EXISTS: If no row-level access policy exists with that name, the statement has no effect.

  • row_access_policy_name: The name of the row-level access policy that you are deleting. Each row-level access policy on a table has a unique name.

  • table_name: The name of the table with the row-level access policy or policies that you want to delete.

Examples

Deleting a row-level access policy from a table

                      DROP ROW ACCESS POLICY My_row_filter ON project.dataset.My_table;                  
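If the policy might already have been dropped, the IF EXISTS clause makes the statement a no-op rather than an error. A sketch using the same hypothetical names:

```sql
-- No error if My_row_filter does not exist on the table.
DROP ROW ACCESS POLICY IF EXISTS My_row_filter ON project.dataset.My_table;
```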

Deleting all the row-level access policies from a table

                      DROP ALL ROW ACCESS POLICIES ON project.dataset.My_table;                  

Table path syntax

Use the following syntax when specifying the path of a table resource, including standard tables, views, materialized views, external tables, and table snapshots.

          table_path :=   [[project_name.]dataset_name.]table_name                  
  • project_name: The name of the project that contains the table resource. Defaults to the project that runs the DDL query. If the project name contains special characters such as colons, quote the name in backticks ` (example: `google.com:my_project`).

  • dataset_name: The name of the dataset that contains the table resource. Defaults to the defaultDataset in the request.

  • table_name: The name of the table resource.
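The path rules above can be illustrated with a fully qualified name; my-project here is a hypothetical project ID whose hyphen requires backticks:

```sql
-- Fully qualified path: project, dataset, and table.
DROP TABLE IF EXISTS `my-project`.mydataset.mytable
```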

When you create a table in BigQuery, the table name must be unique per dataset. The table name can:

  • Contain up to 1,024 characters.
  • Contain Unicode characters in category L (letter), M (mark), N (number), Pc (connector, including underscore), Pd (dash), Zs (space). For more information, see General Category.

For example, the following are all valid table names: table 01, ग्राहक, 00_お客様, étudiant-01.

Some table names and table name prefixes are reserved. If you receive an error saying that your table name or prefix is reserved, then select a different name and try again.

Source: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language