Home Product Integrations Customers Company Jobs Resources SQL School Playbook Sign In

Working with UTM Parameters in SQL to Answer Bottom-of-the-Funnel Questions

Parsing UTM Parameters Parsed UTM parameters mean squeaky clean campaign data for analysis. Check out our entire series of #datapointers here.

Is email or social driving more traffic to this landing page?

What ad creative on Facebook are people clicking the most?

Are our ebooks more successful on LinkedIn than on Twitter?

If you’ve ever worked with a marketer to understand campaign and channel effectiveness, I bet you’ve heard questions like these before. I know I’ve asked questions like these.

To help us answer these kinds of questions once traffic rolls in, marketers add a bunch of information to the URLs we distribute. This information comes in the form of UTM parameters.

While you might think of “UTM” as the Ubiquitous Tool of Marketers, it really stands for Urchin Tracking Module. You can thank Google for the ubiquity of UTMs. The company bought the technology way back in 2005 and used it as the foundation for Google Analytics.

Zillions of content and paid advertising folks have talked at length about best practices for using UTM parameters effectively. Most of these articles focus on the setup and what you should see in Google Analytics when you’re doing things properly.

But there’s a ton of power in incorporating campaign information into product usage analysis. To do so, we need to make the leap out of Google Analytics and into SQL. It’s not as scary as it sounds. Many of the functions to manipulate URLs in SQL are similar to common formulas in Excel. (And if SQL is completely new to you, learning SQL can add a bunch of value to you skill set.)

Suddenly we can move beyond the top of the funnel and use UTM parameters as a lens to ask questions like:

What can we learn about product usage based on traffic sources?

Are all marketing leads created equal (yeah, right)?

So we drove a bunch of traffic and signups, but did the product resonate? Did we retain customers from one campaign better than another?

Done well, analysis like this can spark productive conversations to align product folks and marketers.

First, a quick primer on UTM parameter syntax

Some other marketer out there is going to hate me for this (sorry, MoveLoot!), but real quick, go to any website and click on the first ad you see.

MoveLoot Ad UTM

Check out the URL of the landing page:

https://www.moveloot.com/sf?utm_source=facebook&utm_medium=paid&utm_content=bamboo&utm_campaign=San_Francisco_web_SF_SignUps_LAL_2_1839391141`

The parameters begin to the right of the “?”. UTM parameters have five standard dimensions. Google Analytics requires three of them (starred, below), so you’ll see those the most often.

  • utm_medium *
  • utm_source *
  • utm_campaign *
  • utm_term
  • utm_content

When arranged in the order outlined above, you can think of these parameters as the tellers of a story—from broad to specific—about the distribution strategy. MoveLoot is using a paid (medium) ad on Facebook (source) to **drive San Franciscans to **signup (campaign) using a bamboo visual (content).

Term is usually reserved for search campaigns. For AdWords traffic you’ll see parameters that include “utm_term” but not “utm_content”.

UTM parameters have uses beyond ads as well. You’ll see them, for example, when you click links in marketing emails. Most tools designed for distributing content on social media (like Buffer) automatically append UTM parameters to the URL, allowing you to learn more about your social traffic. You can see another example of UTM parameters in the URL below, which was generated by clicking the link in this tweet.

http://www.vox.com/2015/10/22/9598658/benghazi-hillary-clinton-testimony-yelling?utm_campaign=vox&utm_content=chorus&utm_medium=social&utm_source=twitter

How to parse UTM parameters using SQL

Okay, with the structure out of the way, let’s dive into parsing the UTM details out of a URL.

Let’s assume you’re tracking event data that includes the URLs of the pages a user visits during their sessions. Identify the first page of a user’s session. (Hint: if the user came to your site from an ad click or a social media post, the first page’s URL will likely have UTM parameters attached.)

Next, parse the UTM parameters out of the URL strings. With the steps outlined below, you’ll end up with a table that turns the MoveLoot URL above into a row in a table that looks like this:

medium | source   | campaign                                      | content
------ | -------- | --------------------------------------------- | ------
paid   | facebook | `San_Francisco_web_SF_SignUps_LAL_2_1839391141 | bamboo`

Start with this common table expression to get the utm_source.

split AS (
    SELECT url,
           SPLIT_PART(url,'utm_source=',2) AS after_utm
      FROM table_of_urls
)

The SPLIT_PART SQL function here looks for the string 'utm_source=' and then returns everything after that value. In the results, the text in the after_utm column will start with the parameter (e.g. “facebook”) and include everything to its right. The next step is to take only the portion of that string from the beginning to the next UTM parameter—so, until & or until the end of the string.

SELECT url,
       after_utm,
       SPLIT_PART(after_utm,'&',1) AS utm_value
  FROM split

Using the same SPLIT_PART function, you can now split the after_utm value on & and return the first part. If an ampersand doesn’t exist, then the function will return the entire string.

Since both of these queries are simple, combining them into a single function is easy. As a best practice, you can also wrap the function in a CASE statement that checks if the UTM parameter exists at all.

SELECT CASE WHEN url LIKE '%utm_source%' 
            THEN SPLIT_PART(SPLIT_PART(url,'utm_source=',2),'&',1) 
       ELSE NULL END AS utm_value,
       url
  FROM table_of_urls

A note about MySQL

MySQL doesn’t have a function to split strings like PostgreSQL. You can still parse UTM parameters, but the function is longer.

SELECT LEFT(
           SUBSTRING(url,LOCATE('utm_medium=',url) + LENGTH('utm_medium=')),
           LOCATE('&',SUBSTRING(url,LOCATE('utm_medium=',url) + LENGTH('utm_medium='))) - 1
       ) AS utm_value,
       url
  FROM table_of_urls

Looping over multiple parameters

Often, as in the MoveLoot example above, you’ll want to parse out multiple parameters from a single URL. If you have lots of parameters, it’s a pain to write these long functions for each parameter.

Using Mode’s parameters, you can create this loop in SQL, so you can quickly change parameter values without having to edit your query. The query below will generate a table that returns the utm_medium, utm_source, utm_campaign, and utm_content.

{% capture list %}
utm_medium,
utm_source,
utm_campaign,
utm_content
{% endcapture %}

{% assign utm_list = list | strip_newlines | split: ","  %}

SELECT 
  {% for utm in utm_list %}
     CASE WHEN url LIKE '%{{utm}}%' 
          THEN SPLIT_PART(SPLIT_PART(url,'{{utm}}=',2),'&',1) 
     ELSE NULL END AS {{utm}},
  {% endfor %}
       url
  FROM table_of_urls

To change the parameters you want to parse from your URL, add or change the parameters in the list at the top. Before running the query, Mode will loop over each parameter and automatically generate a line of SQL that parses out that parameter and writes it to a column. The resulting query looks like this:

SELECT CASE WHEN url LIKE '%utm_medium%' 
            THEN SPLIT_PART(SPLIT_PART(url,'utm_medium=',2),'&',1) 
       ELSE NULL END AS utm_medium,

       CASE WHEN url LIKE '%utm_source%' 
            THEN SPLIT_PART(SPLIT_PART(url,'utm_source=',2),'&',1) 
       ELSE NULL END AS utm_source,

       CASE WHEN url LIKE '%utm_campaign%' 
            THEN SPLIT_PART(SPLIT_PART(url,'utm_campaign=',2),'&',1) 
       ELSE NULL END AS utm_campaign,

       CASE WHEN url LIKE '%utm_content%' 
            THEN SPLIT_PART(SPLIT_PART(url,'utm_content=',2),'&',1) 
       ELSE NULL END AS utm_content,

       url
  FROM table_of_urls

Here’s an example of this query in action.

Getting to the good stuff: joining campaign data with product data

Now that you have a nice table of the campaign information that drives user sessions, you can join this information with a user’s in-product behaviors to find all kinds of insights you might not be able to see in Google Analytics. You could build cohort analyses based on social traffic, or ad creative, or remarketing messaging. The possibilities are endless!