Elevating SEO Analysis with Google Search Console and BigQuery

Google lets you export your raw data from Google Search Console to BigQuery. This is exciting, as it opens up a world of possibilities for diving deeper into your search analysis and combining search data with other data points. This article shows some of the ways it can be used. And since I'm a huge Fast and Furious fan, I'm using dummy FF data as my sample.

Why BigQuery

BigQuery can enhance your SEO toolkit:

  • Unlimited data retention beyond Google Search Console's 16-month limit

  • No row limits — every query, every click, every impression

  • Complex SQL analysis at scale

  • Easy joins with GA4, CRM, or other datasets

BigQuery lets you supercharge your SEO with speed, scale, and flexibility — without blowing up your server.

How to connect GSC to BigQuery?

To set up the bulk data export that transfers Google Search Console data to BigQuery in Google Cloud, follow the instructions here: https://support.google.com/webmasters/answer/12917675?hl=en

Set up a project in Google Cloud and note down your project ID to use in Google Search Console.

If you already have a project and are storing data in BigQuery, make sure the dataset location you set in Search Console matches the location of your existing datasets in Google Cloud and BigQuery, as you cannot query across locations.
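For example, you can check where each of your existing datasets is stored before you pick the export location. A minimal sketch, assuming your datasets live in the EU multi-region (swap `region-eu` for whichever region you use):

SELECT
  schema_name,
  location
FROM `region-eu`.INFORMATION_SCHEMA.SCHEMATA  -- region-eu is an example; use your own region qualifier
ORDER BY schema_name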

Starting with the Basics

Now that everything is set up, you can access the data by running SQL queries against the exported tables.

We will begin with a basic query of the Google Search Console data.

    
SELECT
  query,
  impressions,
  clicks
FROM `your_project.your_dataset.searchdata_url_impression`
   
  

This query gives you data like this:

table of query, impressions and clicks

The problem with this is:

  • The raw data contains “anonymised searches”, which show up as null in the query column

  • The same query is repeated across different days, regions, and devices, so the rows need to be aggregated

Grouping data

So, we need to use a GROUP BY:

    
SELECT
  query,
  SUM(impressions) AS impressions,
  SUM(clicks) AS clicks
FROM `your_project.your_dataset.searchdata_url_impression`
WHERE query IS NOT NULL
GROUP BY query
ORDER BY impressions DESC
    
    

Which gives you this:

Table with query impressions and clicks grouped

This is more useful: we remove the nulls, group the queries, and order by impressions. However, we're still not doing anything we couldn't already achieve in the Google Search Console interface.

Branded vs Non-Branded Keywords

Let's expand it further to include branded vs. non-branded queries.

    
SELECT
  data_date,
  query,
  url,
  CASE
    WHEN REGEXP_CONTAINS(query, "fast & furious")
      OR REGEXP_CONTAINS(query, "fast and furious")
      OR REGEXP_CONTAINS(query, "toretto")
    THEN "Branded"
    ELSE "Non-branded"
  END AS branded_vs_non_branded,
  SUM(impressions) AS impressions,
  SUM(clicks) AS clicks,
  SUM(clicks) / SUM(impressions) AS CTR,
  -- sum_position is zero-based, so add 1 to get the usual 1-based average position
  1 + ROUND(SUM(sum_position) / SUM(impressions), 2) AS avg_position
FROM `your_project.your_dataset.searchdata_url_impression`
WHERE query IS NOT NULL
  AND NOT is_anonymized_discover
GROUP BY ALL
ORDER BY data_date DESC
    
    

Which gives you this:

Table with date, query, url, branded vs non branded, impressions, clicks, CTR and average position

And with this data, you can show:

  • Branded vs non-branded click-through rates and average positions

  • Grouped impressions over time

  • A deep dive into all keywords for a URL (impressions vs clicks)

You can connect this data with Google Looker Studio to display it in data visualisations.
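One simple way to do that is to save the aggregation as a view that Looker Studio can connect to directly. A minimal sketch, using a hypothetical view name and the same branded/non-branded logic as above, collapsed into one regex:

CREATE OR REPLACE VIEW `your_project.your_dataset.branded_vs_non_branded_v` AS
SELECT
  data_date,
  query,
  url,
  CASE
    WHEN REGEXP_CONTAINS(query, "fast & furious|fast and furious|toretto") THEN "Branded"
    ELSE "Non-branded"
  END AS branded_vs_non_branded,
  SUM(impressions) AS impressions,
  SUM(clicks) AS clicks
FROM `your_project.your_dataset.searchdata_url_impression`
WHERE query IS NOT NULL
GROUP BY ALL;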

Bringing Data Together

The beauty of BigQuery is that you can combine your Search Console data with other datasets. Most Google products, like Google Analytics, Google Ads and Google Merchant Centre, allow you to export their raw data to BigQuery. For other platforms, like Screaming Frog or Meta, you can use APIs to get the data in.
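For example, a crawl export sitting in Cloud Storage can be loaded with a single SQL statement. A minimal sketch, where the table name and bucket path are placeholders:

-- load a CSV export (e.g. a Screaming Frog crawl) from Cloud Storage into BigQuery
LOAD DATA OVERWRITE `your_project.your_dataset.crawl_data`
FROM FILES (
  format = 'CSV',
  skip_leading_rows = 1,
  uris = ['gs://your-bucket/screaming_frog_export.csv']
);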

Connecting Google Search Console with Google Analytics, Google Ads, Google Merchant Center and any other data source
    
SELECT
  url,
  STRING_AGG(DISTINCT query) AS queries,
  SUM(impressions) AS impressions,
  SUM(clicks) AS clicks,
  CONCAT(CAST(ROUND(SUM(clicks) / SUM(impressions) * 100, 2) AS STRING), "%") AS CTR,
  1 + ROUND(SUM(sum_position) / SUM(impressions), 2) AS avg_position,
  SUM(sessions) AS sessions,
  CONCAT(CAST(ROUND(SUM(session_engaged) / SUM(sessions) * 100, 2) AS STRING), "%") AS engagement_rate,
  ROUND(SUM(total_engagement_time) / SUM(sessions), 2) AS avg_engagement_time,
  SUM(coronas) AS coronas_drunk,
  SUM(quarter_miles) AS quarter_miles
FROM `your_project.your_dataset.searchdata_url_impression` AS gsc
LEFT JOIN `your_project.your_dataset.ga4_data` AS ga4
  ON gsc.data_date = ga4.date
  AND gsc.url = ga4.landing_page
GROUP BY url
ORDER BY impressions DESC
    
    

This code combines Google Search Console data with Google Analytics data. The GA4 dataset has already been filtered to include only users who came from organic search.

It combines the two datasets on URL/landing page and date. This allows you to get conversion rates for the landing pages of your organic search traffic. Here, I have decided that a FF conversion is drinking a Corona or driving a quarter mile.
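The query above returns raw conversion counts rather than a rate. If you want an explicit conversion rate per landing page, you could add a column like this to the SELECT list, a sketch reusing the dummy FF conversions (SAFE_DIVIDE returns NULL instead of erroring when a page has no sessions):

  -- extra column for the SELECT list above
  SAFE_DIVIDE(SUM(coronas) + SUM(quarter_miles), SUM(sessions)) AS conversion_rate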

The type of join is called a left join, which means you keep everything on the left (search console data) and only include the data that matches in the right table (Google Analytics data), as shown in blue in the diagram below.

Left join google search console data and google analytics data

This query gives you data that looks like this:

table with url, queries, impressions, clicks, sessions, engagement rate, average engagement time, conversions

Which gives you the ability to find insights like:

  • Landing page CTR vs engagement rate

  • Impressions vs conversion rate for URLs

  • Queries by conversions (though this needs to be grouped by URL)

Now, let's get into some AI

Inside Google Cloud, you can run Python notebooks. This allows you to manipulate the data even further and integrate with Vertex AI, Google Cloud's AI platform, which gives you access to Gemini models. Here is some example code for determining the intent behind search queries using the Search Console data.

    
from vertexai.preview.generative_models import GenerativeModel
import vertexai
from google.cloud import bigquery
import pandas as pd
import re
import json

vertexai.init(project="project-name", location="us-central1")
model = GenerativeModel("gemini-2.5-flash")

# Pull the distinct queries to label from the Search Console export
bq_client = bigquery.Client()
all_queries = bq_client.query("""
    SELECT DISTINCT query
    FROM `your_project.your_dataset.searchdata_url_impression`
    WHERE query IS NOT NULL
""").to_dataframe()["query"].tolist()

batch_size = 20

# Function to send a batch of queries to Gemini
def label_query_intents(queries):
    print("start labelling")
    prompt = f"""
    You are a JSON API.
    Group the following search queries into intent clusters.
    For each group, return:
    - "intent": short label (e.g., "Informational")
    - "description": short summary of user goal
    - "queries": list of original queries

    Return only a JSON array of objects — no explanation or intro text.

    Queries:
    {chr(10).join(f"- {q}" for q in queries)}
    """
    response = model.generate_content(prompt)
    raw_text = response.text.strip()
    # Strip any ```json fences Gemini wraps around its answer
    clean_json = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw_text, flags=re.MULTILINE)
    try:
        return json.loads(clean_json)
    except Exception as e:
        print(f"An error occurred: {e}")
        print("==============")
        return []

# Loop through batches and collect labelled queries
results = []

for i in range(0, len(all_queries), batch_size):
    batch = all_queries[i:i + batch_size]
    groups = label_query_intents(batch)

    for group in groups:
        intent = group.get("intent")
        for query in group.get("queries", []):
            results.append({"query": query, "intent": intent})

# Save to a dataframe
df = pd.DataFrame(results)
    
    

This gives you data like this:

Table with query and intent
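To build the charts below, the intent labels need to be joined back to the Search Console data. A minimal sketch, assuming you have written `df` back to a hypothetical table called `your_project.your_dataset.query_intents`:

SELECT
  intents.intent,
  SUM(gsc.clicks) / SUM(gsc.impressions) AS CTR
FROM `your_project.your_dataset.searchdata_url_impression` AS gsc
JOIN `your_project.your_dataset.query_intents` AS intents
  ON gsc.query = intents.query
GROUP BY intents.intent
ORDER BY CTR DESC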

Which allows you to show:

  • Click-through rate based on intent

  • Average position based on intent

  • Landing page sessions based on intent

What's next?

If you want to learn more about Google Search Console and BigQuery, check out the online course I helped create with a team of talented SEO and BigQuery people: https://bigquery.pro/
