{"id":43769,"date":"2023-08-30T13:56:46","date_gmt":"2023-08-30T11:56:46","guid":{"rendered":"https:\/\/blog.sheetgo.com\/?p=43769"},"modified":"2025-12-17T18:44:13","modified_gmt":"2025-12-17T17:44:13","slug":"um-guia-passo-a-passo-para-o-particionamento-do-bigquery","status":"publish","type":"post","link":"https:\/\/www.sheetgo.com\/pt\/blog\/data-science\/a-step-by-step-guide-to-bigquery-partitioning\/","title":{"rendered":"Um guia passo a passo para o particionamento do BigQuery"},"content":{"rendered":"<p>[et_pb_section fb_built=&#8221;1&#8243; _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221; da_is_popup=&#8221;off&#8221; da_exit_intent=&#8221;off&#8221; da_has_close=&#8221;on&#8221; da_alt_close=&#8221;off&#8221; da_dark_close=&#8221;off&#8221; da_not_modal=&#8221;on&#8221; da_is_singular=&#8221;off&#8221; da_with_loader=&#8221;off&#8221; da_has_shadow=&#8221;on&#8221; da_disable_devices=&#8221;off|off|off&#8221;][et_pb_row _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;][et_pb_text _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p><a href=\"https:\/\/cloud.google.com\/bigquery\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">Bigquery<\/span><\/a><span style=\"font-weight: 400;\"> partitioning is an effective way to optimize the performance of your queries in Google&#8217;s BigQuery But what exactly is BigQuery, and how can partitioning speed up data analysis?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">BigQuery is a fully-managed, serverless enterprise data warehouse offered as part of the Google Cloud Platform. It allows you to run fast SQL queries using the processing power of Google&#8217;s infrastructure.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">However, as your data grows over time, query performance can slow down. This is where partitioning comes in. Since it divides your large tables into smaller partitions, partitioning can organize your data and accelerate queries.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Check out this step-by-step guide to BigQuery partitioning to understand how it works and how you can create partitioned tables to optimize your data analytics.<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text module_id=&#8221;works&#8221; _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<h2><span style=\"font-weight: 400;\">How BigQuery Partitioning Works<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">In BigQuery, you can break up large tables into smaller, more manageable chunks of data called partitions. The goal is to store related data together in partitions, separated from unrelated data in other partitions. When you query a table, BigQuery only scans the partitions related to that query instead of the entire table.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For example, let&#8217;s say you have a table of financial data spanning 10 years, with 10 million rows overall. With BigQuery partitioning, you could divide this table by year into 10 partitions &#8211; one for each year.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">When you query data from a specific year, BigQuery will retrieve data from only one partition, reducing the amount of scanning required for queries.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As a result, BigQuery will return results faster than it would take to scan the entire table. Since you are charged for the number of bytes processed by each query, you would also be able to save money by not scanning irrelevant data.<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text module_id=&#8221;create&#8221; _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<h2><span style=\"font-weight: 400;\">How to create partitioned tables in BigQuery<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">To demonstrate how we can create partitioned tables, we will use a dataset containing information about rising search terms in Google Trends.<\/span><\/p>\n<p>[\/et_pb_text][et_pb_image src=&#8221;https:\/\/static.sheetgo.com\/wp-content\/uploads\/2023\/08\/BigQuery-partitioning-1.png&#8221; alt=&#8221;BigQuery partitioning 1<br \/>\n&#8221; title_text=&#8221;BigQuery partitioning 1&#8243; align=&#8221;center&#8221; _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; width=&#8221;100%&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;][\/et_pb_image][et_pb_text _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p><span style=\"font-weight: 400;\">I&#8217;ll write a SQL query to filter data from <\/span><span style=\"font-weight: 400;\">the &#8220;rising_terms&#8221; table and return specific columns based on these criteria: the score is over 99 and date is between <\/span><span style=\"font-weight: 400;\">2018-01-01 and 2019-01-01.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Note: if you want to learn how to write basic SQL queries, check out this article on <\/span><a href=\"https:\/\/www.sheetgo.com\/blog\/integrations\/bigquery-syntax\/\"><span style=\"font-weight: 400;\">BigQuery syntax<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Here&#8217;s what the query would look like:<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text module_class=&#8221;spreadsheet-function&#8221; _builder_version=&#8221;4.21.2&#8243; border_width_left=&#8221;4px&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p><span style=\"font-weight: 400;\">SELECT term, score, week, rank, country_name<\/span><\/p>\n<p><span style=\"font-weight: 400;\">FROM `bigquery-dataset-396013.Google_trends.rising_terms`\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">WHERE week BETWEEN &#8216;2018-01-01&#8217; AND &#8216;2019-01-01&#8217;\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">AND score &gt; 99<\/span><\/p>\n<p><span style=\"font-weight: 400;\">LIMIT 1000;<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p>Let&#8217;s break it down:<\/p>\n<p><b>SELECT term, score, week, rank, country_name <\/b><span style=\"font-weight: 400;\">&#8211; these are the columns I want BigQuery to return.<\/span><\/p>\n<p><b>FROM `bigquery-dataset-396013.Google_trends.rising_terms`<\/b><span style=\"font-weight: 400;\"> &#8211; this is the table to query the data from.<\/span><\/p>\n<p><b>WHERE week BETWEEN &#8216;2018-01-01&#8217; AND &#8216;2019-01-01&#8217;<\/b><span style=\"font-weight: 400;\"> &#8211; this clause filters the rows to only those where week is between these two dates.<\/span><\/p>\n<p><b>AND score &gt; 99<\/b><span style=\"font-weight: 400;\"> &#8211; this is an additional filter to only include rows where the score is greater than 99.<\/span><\/p>\n<p><b>LIMIT 1000;<\/b><span style=\"font-weight: 400;\"> &#8211; this clause limits the results to a maximum of 1000 rows.<\/span><\/p>\n<p>[\/et_pb_text][et_pb_image src=&#8221;https:\/\/static.sheetgo.com\/wp-content\/uploads\/2023\/08\/BigQuery-partitioning-2.png&#8221; alt=&#8221;BigQuery partitioning 2&#8243; title_text=&#8221;BigQuery partitioning 2&#8243; align=&#8221;center&#8221; _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; width=&#8221;100%&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;][\/et_pb_image][et_pb_text _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p><span style=\"font-weight: 400;\">After executing this query, BigQuery returns the columns and rows that fall into meet these criteria. When I ran the query, it had to scan all the rows to return the data that fall under these criteria.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is where BigQuery partitioning comes in handy. By dividing the table into smaller partitions, BigQuery can scan only the segments that match the criteria I established when I write the query.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In order to partition the table, I&#8217;ll use the following syntax:<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text module_class=&#8221;spreadsheet-function&#8221; _builder_version=&#8221;4.21.2&#8243; border_width_left=&#8221;4px&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p><span style=\"font-weight: 400;\">CREATE TABLE table_name\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">PARTITION BY partition_expression<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p><span style=\"font-weight: 400;\">I&#8217;ll write a query to create a partitioned table using a SELECT statement to populate it with data from an existing table. This query creates a new table called rising_terms_by_year in the specified dataset. In addition, it specifies that this new table should be partitioned by the week column. It also selects the specified columns from the existing rising_terms table to pull data from.<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text module_class=&#8221;spreadsheet-function&#8221; _builder_version=&#8221;4.21.2&#8243; border_width_left=&#8221;4px&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p><span style=\"font-weight: 400;\">CREATE TABLE `bigquery-dataset-396013.Google_trends.rising_terms_by_year`\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">PARTITION BY week AS<\/span><\/p>\n<p><span style=\"font-weight: 400;\">SELECT term, score, week, rank, country_name<\/span><\/p>\n<p><span style=\"font-weight: 400;\">FROM `bigquery-dataset-396013.Google_trends.rising_terms`<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<p><span style=\"font-weight: 400;\">As you can see, BigQuery has created a partitioned table called &#8220;rising_terms_by_year&#8221;.<\/span><\/p>\n<p>[\/et_pb_text][et_pb_text module_id=&#8221;guide&#8221; _builder_version=&#8221;4.21.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;post_content&#8221;]<\/p>\n<h2><span style=\"font-weight: 400;\">A step-by-step guide to BigQuery partitioning<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">This is how you create partitioned tables in BigQuery. By creating partitioned tables, you can break up large tables into smaller, more manageable chunks of data. This enables BigQuery to return results more qiuckly and efficiently.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If you want to learn more about BigQuery, check out this article on <\/span><a href=\"https:\/\/www.sheetgo.com\/blog\/integrations\/how-to-use-bigquery-aggregate-functions\/\"><span style=\"font-weight: 400;\">how to use BigQuery aggregate functions<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][\/et_pb_section]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Bigquery partitioning is an effective way to optimize the performance of your queries in Google&#8217;s BigQuery But what exactly is BigQuery, and how can partitioning speed up data analysis? BigQuery is a fully-managed, serverless enterprise data warehouse offered as part of the Google Cloud Platform. It allows you to run fast SQL queries using the [&hellip;]<\/p>\n","protected":false},"author":42,"featured_media":43842,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_et_pb_use_builder":"on","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[31],"tags":[],"class_list":["post-43769","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/posts\/43769","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/comments?post=43769"}],"version-history":[{"count":0,"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/posts\/43769\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/media\/43842"}],"wp:attachment":[{"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/media?parent=43769"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/categories?post=43769"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sheetgo.com\/pt\/wp-json\/wp\/v2\/tags?post=43769"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}