The GA4 update: matching big query data with Google analytics - Landing Page

Continuing on from my GA4 series on BigQuery, today we are going to look at Landing pages. If you want to check out my others, they are:

The landing page is actually one of the easier things to get in GA4 vs universal analytics.

Landing pages

To get the landing page, all we need to do is get the page when the event sesssion_start happens. Getting the max means we would get the value if all other values are null.

Select 
 landing_page,
  COUNT(DISTINCT user_pseudo_id) AS user_count,
  COUNT(DISTINCT concat(user_pseudo_id, session_id)) AS sessions,
From(
  SELECT
    user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    max((select value.string_value from unnest(event_params) where event_name = "session_start" and key = 'page_location')) as landing_page,
  FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
  where _table_suffix BETWEEN "20210101" and "20210131"
  group by 1,2)
group by 1

I am now adding in all the extras besides just sessions. This was covered in the Acquisition report. One thing that has changed is the GA4 avg engagement time. I’ve noticed that it should only be divided by engaged sessions.

Select 
  landing_page,
  COUNT(DISTINCT user_pseudo_id) AS user_count,
  COUNT(DISTINCT concat(user_pseudo_id, session_id)) AS sessions,
  sum(page_views) as page_views,
  count(distinct case when page_views = 1 then concat(user_pseudo_id, session_id) end)
  /count(distinct concat(user_pseudo_id, session_id)) as session_bounce_rate,
  sum(new_users) as new_users,
  sum(cast(session_engaged as int))/count(distinct concat(user_pseudo_id, session_id)) as session_engagement_rate,
  sum(engagement_time_msec)/1000 #in milliseconds
  /sum(cast(session_engaged as int64)) as ga4_session_duration,
  sum(end_time-start_time)/1000000 #timestamp in microseconds
  /count(distinct concat(user_pseudo_id,session_id)) as ua_session_duration,
  sum(purchases) as purchases,
  sum(purchases) /count(distinct concat(user_pseudo_id, session_id)) as ecomm_conversion_rate,
  sum(revenue) as revenue,
  sum(select_promotion) as select_promotion,
  sum(select_promotion)/count(distinct concat(user_pseudo_id, session_id))  as select_promotion_conversion
From(
  SELECT
    user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    sum( case when event_name = "first_visit" then 1 end) as new_users,
    max((select value.string_value from unnest(event_params) where event_name = "session_start" and key = 'page_location')) as landing_page,
    count(distinct case when event_name = "page_view" then concat(user_pseudo_id,event_timestamp) end) as page_views,
    max((select value.string_value from unnest(event_params) where key = 'session_engaged')) as session_engaged,
    max((select value.int_value from unnest(event_params) where key = 'engagement_time_msec')) as engagement_time_msec,
    min(event_timestamp) as start_time,
    max(event_timestamp) as end_time,
    sum(distinct case when event_name = "purchase" then 1 end) as purchases,
    sum(ecommerce.purchase_revenue) as revenue,
    sum(distinct case when event_name = "select_promotion" then 1 end) as select_promotion,
  FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
  where _table_suffix BETWEEN "20210101" and "20210131"
  group by 1,2)
group by 1


There you go, short and sweet. Any questions let me know.

Previous
Previous

utm_creative_format and utm_marketing_tactic in GA4 and how to use them

Next
Next

The GA4 update: matching big query data with Google analytics - Acquisition Channel report