How to Load CSV File to Bigquery

To load CSV data from Cloud Storage into a new BigQuery table:


For step-by-step guidance on this task directly in Cloud Shell Editor, click Guide me:

Guide me


The following sections take you through the same steps as clicking Guide me.

  1. In the Cloud Console, open the BigQuery page.

    Go to BigQuery

  2. In the Explorer panel, expand your project and select a dataset.

  3. Expand the  Actions option and click Open.

  4. In the details panel, click Create table .

  5. On the Create table page, in the Source section:

    • For Create table from, select Cloud Storage.

    • In the source field, browse to or enter the Cloud Storage URI. Note that you cannot include multiple URIs in the Cloud Console, but wildcards are supported. The Cloud Storage bucket must be in the same location as the dataset that contains the table you're creating.

      Select file

    • For File format, select CSV.

  6. On the Create table page, in the Destination section:

    • For Dataset name, choose the appropriate dataset.

      View dataset

    • Verify that Table type is set to Native table.

    • In the Table name field, enter the name of the table you're creating in BigQuery.

  7. In the Schema section, for Auto detect, check Schema and input parameters to enable schema auto detection. Alternatively, you can manually enter the schema definition by:

    • Enabling Edit as text and entering the table schema as a JSON array.

      Add schema as JSON array

    • Using Add field to manually input the schema.

      Add schema definition using the Add Field button

  8. (Optional) To partition the table, choose your options in the Partition and cluster settings. For more information, see Creating partitioned tables.

  9. (Optional) For Partitioning filter, click the Require partition filter box to require users to include a WHERE clause that specifies the partitions to query. Requiring a partition filter may reduce cost and improve performance. For more information, see Querying partitioned tables. This option is unavailable if No partitioning is selected.

  10. (Optional) To cluster the table, in the Clustering order box, enter between one and four field names.

  11. (Optional) Click Advanced options.

    • For Write preference, leave Write if empty selected. This option creates a new table and loads your data into it.
    • For Number of errors allowed, accept the default value of 0 or enter the maximum number of rows containing errors that can be ignored. If the number of rows with errors exceeds this value, the job will result in an invalid message and fail.
    • For Unknown values, check Ignore unknown values to ignore any values in a row that are not present in the table's schema.
    • For Field delimiter, choose the character that separates the cells in your CSV file: CommaTabPipe, or Custom. If you choose Custom, enter the delimiter in the Custom field delimiter box. The default value is Comma.
    • For Header rows to skip, enter the number of header rows to skip at the top of the CSV file. The default value is 0.
    • For Quoted newlines, check Allow quoted newlines to allow quoted data sections that contain newline characters in a CSV file. The default value is false.
    • For Jagged rows, check Allow jagged rows to accept rows in CSV files that are missing trailing optional columns. The missing values are treated as nulls. If unchecked, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false.
    • For Encryption, click Customer-managed key to use a Cloud Key Management Service key. If you leave the Google-managed key setting, BigQuery encrypts the data at rest.
  12. Click Create table.

For Your Reference : https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv




 

Post a Comment

0 Comments

Youtube Channel Image
TechAdvice Subscribe To watch more Blogging Tutorials
Subscribe