
Deep Dive: Tracking How ChatGPT + Search & Others Send Users To Your Site

Are you trying to better understand how searchers are using ChatGPT, Gemini and other AI platforms to find your websites? We sure are! 

It has been a while since our team wrote about dark traffic on our blog, so it’s time to revisit the list of Dark Traffic sources: there’s a whole new world of AI-driven traffic that will likely fall into the Dark Traffic category.

Sources of un-trackable clicks, AKA Dark Traffic:


  • Going through a redirect,
  • Tapping on a link in many mobile apps, including social apps,
  • Clicking on a link in an email program,
  • Clicking on a link shared in an instant messenger, hangout, video chat, etc.,
  • Clicking on certain image or mobile search results,
  • Having your traffic deliberately obfuscated,
  • Clicking on an untagged link in a PDF or other document (gotta tag those links),
  • Clicking through a shortened, untagged URL,
  • Clicking on a link in any sort of installed software,
  • And now… clicking on many links that are provided by AI platforms like ChatGPT, Google’s AI Overviews, Claude, or even the new ChatGPT search extension for Chrome.

 


Bookmarks and visitors typing the URL directly into the browser’s address bar have been the foundation of Dark Traffic for years, yet they are no longer the primary source of Direct traffic. Social traffic has gone extremely dark (check this study from Rand Fishkin), mobile and desktop apps are large contributors nowadays and, as mentioned, AI is entering the fray.


Are my customers using ChatGPT search to find my website?


What's Changed in Web Traffic Sources Since ChatGPT's Search Launch?

On October 31, 2024, OpenAI released the highly anticipated ChatGPT with Search, a new feature that enables ChatGPT to “Search the web” in real time, enhancing its responses with up-to-date and accurate information and linking users to external sources. This move to AI is prompting a lot of questions for all of us. For the basics of tracking ChatGPT, check out Wil’s post.

This new capability raises questions around tracking and attribution measurement. One key part of maximizing this opportunity (and having accurate data to serve as tangible evidence) is understanding when, why, and how ChatGPT includes UTM parameters and/or referral information on some URLs and not others.

In this post, we take a deep dive into tracking ChatGPT traffic now that search is included in their answers.


  • How ChatGPT chooses when to include a URL, and how to track it
  • How to improve the likelihood of ChatGPT returning trackable URLs and UTMs
  • What you need to know about how GA4 handles ChatGPT referrals

How Does ChatGPT Choose URLs, What Are UTM Parameters and Why Should I Care?


Here’s a quick refresher on UTMs for the newbies: UTM parameters are query-string tags (such as utm_source, utm_medium, and utm_campaign) added to URLs that help you track where your site traffic comes from. When a user clicks a link with UTM parameters, those parameters feed attribution data into your analytics platform (e.g., GA4), enabling you to see which sources are driving traffic, engagement, conversions, etc.
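To make the refresher concrete, here is a minimal Python sketch of tagging a URL with UTM parameters and reading them back. The helper names (add_utm, read_utm), the example.com URL, and the parameter values are our own illustrative assumptions, not part of any analytics platform.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url: str, source: str, medium: str, campaign: str) -> str:
    """Return url with utm_source, utm_medium and utm_campaign appended."""
    parts = urlsplit(url)
    # Keep any query parameters the URL already carries.
    params = parse_qsl(parts.query, keep_blank_values=True)
    params += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(parts._replace(query=urlencode(params)))


def read_utm(url: str) -> dict[str, str]:
    """Return only the utm_* parameters found in url."""
    query = urlsplit(url).query
    return {key: value for key, value in parse_qsl(query) if key.startswith("utm_")}


# Hypothetical example values
tagged = add_utm(
    "https://www.example.com/guide?page=2", "newsletter", "email", "spring_launch"
)
print(tagged)
print(read_utm(tagged))
```

The same read_utm helper works on a link ChatGPT hands out: a landing URL ending in ?utm_source=chatgpt.com yields a single utm_source entry.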

How Does ChatGPT Decide Between Citations and Search Results?


Before diving into ChatGPT’s full logic for when it does and does not include UTM parameters in its results, it’s essential to understand the primary distinction ChatGPT makes between citations and search results when including links.

ChatGPT search result with citations

ChatGPT with Search Citations vs Search Results:


Think of citations as adding credibility: “citations” in ChatGPT search are the sources or references provided to support factual statements or detailed responses in an AI-generated answer. These links to external content help verify, and add credibility to, the information ChatGPT provides.


Think of search results as a wider net of possible answers: Search results in ChatGPT refer to real-time results fetched from the web using the new "Search the web" feature. When a user asks a question or needs information beyond the AI’s general knowledge, ChatGPT uses “Search the web” to gather current, relevant information directly from the internet.

Why is the distinction between citations & search important?


Since the release of “Search the web”, some trusted sources have stated that all citations include “utm_source=chatgpt.com”. According to our testing, citations are certainly the most likely scenario where UTMs are supplied, but a link being supplied as a citation does not guarantee that UTMs will always be included. We’ve also seen partners of OpenAI get the “utm_source=chatgpt.com” treatment.

What does feel like a guarantee (again from our limited testing) is that search results will not include UTMs by default. Understanding whether ChatGPT is providing your link as a citation or a search result is a good first step in knowing whether UTM parameters (or referral information) will potentially be added to ChatGPT’s result link(s) by default. 

That said, our testing unfortunately shows that whether results are returned as citations or as search results is not directly tied to whether the user selected the “Search the web” button when submitting the query.

Since ChatGPT is capable of returning citations AND search results within the same response, the logic ChatGPT uses to decide when to include UTMs (and not) is a lot more complicated. Buckle up for that breakdown in the next section. 

When Does ChatGPT Add UTM Parameters to Result Links?


ChatGPT’s decision to include or exclude UTM parameters is based on a nuanced, multi-step process designed to balance user privacy with digital attribution needs. Here is a breakdown of the logical steps ChatGPT executes before deciding to provide result links with UTMs or not:

What's the First Thing ChatGPT Considers?


Any ChatGPT user knows by now how directly prompting shapes the output, and that is exactly why the user’s request is the first logical step here as well.

Here’s how ChatGPT executes this first step based on the user’s request: 

  1. When users ask explicitly for a specific URL (e.g., “Can you give me the link to [site]?”), ChatGPT typically treats this as a “search result” and provides the link without UTM parameters. This is because ChatGPT tries to respect user intent and give a straightforward response.
  2. When users do not ask explicitly for a specific URL and ChatGPT provides a URL as part of a broader response (e.g., “Here’s more about [topic]”) or as a “Source” link (citation), it tends to default to UTM tagging. 

How Does Sensitivity & Privacy Affect UTM Tagging?


Next in the pecking order, ChatGPT evaluates the context of the request from a privacy perspective when deciding whether to include UTMs. Here are some examples where ChatGPT may omit UTM parameters regardless of whether links are provided as citations or search results:

Sensitivity of Topic: If the conversation involves sensitive or confidential information (e.g., medical, legal, or financial topics), the link is generally supplied without UTM parameters to prioritize user privacy.

User-Specific / Account-Based Content: Additionally, if the link would direct the user to account-specific pages or user-specific content (e.g., login portals), UTM parameters are likely to be omitted, as these pages are intended solely for private use and do not benefit from tracking.

What Role Do Publishers and Platforms Play?


Following privacy considerations, ChatGPT evaluates whether UTM parameters are appropriate based on publisher and platform requirements. In this step, ChatGPT predicts, based on the nature of the request, whether including UTM parameters would likely benefit content creators or align with plausible platform standards.

With that in mind, here are examples where UTM parameters are likely to be included based on these requirements:

Source Citations from Publishers: When ChatGPT cites sources where tracking can support publisher content creators (e.g., citing an article from The New York Times in a general news response), UTM parameters are likely to be added to help publishers understand how much traffic originates from ChatGPT. 

Platform Environments with Standard Tracking Protocols: In some environments, link tracking is a routine practice. For instance, if ChatGPT is used within a corporate knowledge base platform or analytics dashboard, UTM parameters might be added to measure user engagement.

Key Takeaway

Regardless of whether “Search the web” was used with a ChatGPT query or not, the above logic is executed consistently to determine whether a UTM-based result link is supplied. Generally speaking, as long as the user’s intent, the sensitivity of the topic, and the privacy of the request are not in doubt, ChatGPT will tend to provide UTMs with citations and will not with search results.
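The checks described in this section can be sketched as a single rule-of-thumb function. To be clear, this encodes the behavior we observed in limited testing, not OpenAI’s actual implementation; the LinkContext fields and the utm_expected name are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinkContext:
    """Hypothetical description of how a link was surfaced in a response."""

    is_citation: bool               # offered as a "Source" link, not a search result
    explicit_url_request: bool      # the user asked directly for a specific link
    sensitive_topic: bool = False   # e.g., medical, legal, or financial
    account_specific: bool = False  # e.g., a login portal


def utm_expected(ctx: LinkContext) -> bool:
    """Heuristic: would we expect utm_source=chatgpt.com on this link?"""
    # Step 1: an explicit request for a URL is treated as a search result.
    if ctx.explicit_url_request:
        return False
    # Step 2: privacy and sensitivity concerns suppress tagging.
    if ctx.sensitive_topic or ctx.account_specific:
        return False
    # Step 3: citations tend to carry UTMs; search results tend not to.
    return ctx.is_citation


print(utm_expected(LinkContext(is_citation=True, explicit_url_request=False)))
print(utm_expected(LinkContext(is_citation=True, explicit_url_request=False, sensitive_topic=True)))
```

Treat the output as an expectation to test against your own data rather than a guarantee, since citations do not always arrive with UTMs.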

How Will ChatGPT Traffic Appear in My GA4 Reports?


Now that we’ve outlined ChatGPT’s logic for including UTMs, here’s what you can expect in GA4 reporting when UTMs or referral information are, or are not, included:

Result Links with UTMs or Referral Information: Traffic from ChatGPT with utm_source=chatgpt.com will appear with that value in the “Session Source” dimension in GA4. Of course, if other UTM parameters (e.g., utm_medium, utm_campaign, utm_content, utm_term) are also included in the future, they will populate their respective, similarly named dimensions. If UTMs are excluded but referral information is included (because the browser’s document.referrer property is populated), inbound traffic is likely to be recorded in GA4 reporting with:

  • Session primary channel group (default) = Referral
  • Session Source = chatgpt.com
  • Session Medium = referral
  • Session Campaign = (referral)
  • Page Referrer = https://chatgpt.com/

In either case, you will be able to report directly on traffic originating from ChatGPT. That said, there is unfortunately no way to tell from this data whether users had the “Search the web” feature enabled.

Result Links without UTMs or Referral Information: When ChatGPT links don’t include UTM parameters or referral data, traffic will likely appear as “Direct” in GA4. This can lead to inflation and overreporting of ‘Direct’ traffic and skew channel performance metrics. 
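The three outcomes above (UTM-tagged, referral-only, and neither) can be summarized in a simplified Python sketch. This is our own approximation for illustration, not GA4’s actual attribution code: the classify_session name is hypothetical, and the “(not set)” medium for a link that carries only utm_source is an assumption about how GA4 reports a missing utm_medium.

```python
from urllib.parse import parse_qs, urlsplit


def classify_session(landing_url: str, referrer: str = "") -> dict[str, str]:
    """Approximate the source/medium a session would be reported under."""
    params = parse_qs(urlsplit(landing_url).query)
    if "utm_source" in params:
        # UTM parameters take priority over the referrer.
        return {
            "source": params["utm_source"][0],
            # Assumption: a missing utm_medium is reported as "(not set)".
            "medium": params.get("utm_medium", ["(not set)"])[0],
        }
    host = urlsplit(referrer).hostname
    if host:
        # No UTMs, but the browser passed a referrer.
        return {"source": host, "medium": "referral"}
    # Neither UTMs nor a referrer: the visit lands in Direct.
    return {"source": "(direct)", "medium": "(none)"}


print(classify_session("https://www.example.com/?utm_source=chatgpt.com"))
print(classify_session("https://www.example.com/", "https://chatgpt.com/"))
print(classify_session("https://www.example.com/"))
```

The last case is the one that quietly inflates Direct: nothing in the request tells your analytics platform that ChatGPT was involved.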

What Can I Do to Improve ChatGPT Traffic Attribution?


Now that you have made it this far, understand the logic ChatGPT uses for supplying UTM links, and have a grasp of the reporting implications, you may feel there’s nothing you and your team can do to increase the likelihood that UTM links are supplied when ChatGPT references your webpages.

The reality is there are still things you and your business can do to improve your chances of getting UTM-tagged links from ChatGPT. 

Optimize for Citation Use Cases:


Focus on creating high-quality, authoritative content that’s well-cited and search-optimized. This includes content structured to address common user queries (e.g., lists, guides, FAQs) that are more likely to be returned as citations by ChatGPT, increasing the chances that your link will be cited with UTM tags.

Establish a Standardized UTM Tagging Taxonomy:


Ensure your organization has a standardized UTM tagging strategy. This is of course important for general campaign tracking and reporting, but it becomes especially important for content that’s syndicated or shared with trusted partners: if ChatGPT picks up and recommends your content through partner pages or citations, the URLs it picks up are then likely to retain their UTM tags.
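One way to keep a taxonomy honest is to check URLs against it before they are shared. The sketch below is a minimal example under assumed rules: the required parameters, the lowercase naming pattern, and the ALLOWED_MEDIUMS set are all hypothetical placeholders for your own convention.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical convention: swap in your organization's own rules.
REQUIRED = ("utm_source", "utm_medium", "utm_campaign")
ALLOWED_MEDIUMS = {"email", "social", "cpc", "referral", "partner"}
VALUE_PATTERN = re.compile(r"^[a-z0-9._-]+$")  # lowercase, no spaces


def utm_problems(url: str) -> list[str]:
    """Return a list of taxonomy violations found in url (empty if none)."""
    params = dict(parse_qsl(urlsplit(url).query))
    problems = []
    for key in REQUIRED:
        value = params.get(key)
        if value is None:
            problems.append(f"missing {key}")
        elif not VALUE_PATTERN.fullmatch(value):
            problems.append(f"{key}={value!r} breaks the naming pattern")
    medium = params.get("utm_medium")
    if medium is not None and VALUE_PATTERN.fullmatch(medium) and medium not in ALLOWED_MEDIUMS:
        problems.append(f"utm_medium={medium!r} is not an allowed medium")
    return problems


print(utm_problems("https://www.example.com/?utm_source=partner-site&utm_medium=partner&utm_campaign=2024_q4"))
print(utm_problems("https://www.example.com/?utm_source=Partner%20Site&utm_medium=partner"))
```

Running a check like this on partner and syndication links before they go live keeps inconsistent values out of your reports if those links later surface in ChatGPT.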

Embed UTMs in Suggested URLs and Indexed Content:


Similarly, ensure any URLs you publicly share (in meta descriptions, public PDFs, or marketing content) contain UTM parameters by default. By embedding UTM tags directly into these assets, you increase the likelihood they’ll be referenced as-is, with UTMs, in ChatGPT. Specifically, ensure that UTM-tagged URLs are embedded in the primary and secondary web pages that AI tools access, or that you would like them to access.
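If you want to see how many links on a page currently lack UTMs, a small audit script can help. The following sketch uses only Python’s standard library; the LinkCollector class, the untagged_links function, and the sample HTML are illustrative assumptions, and a real audit would fetch and crawl your actual pages.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def untagged_links(html: str, own_host: str) -> list[str]:
    """Return links pointing at own_host that carry no utm_source."""
    collector = LinkCollector()
    collector.feed(html)
    missing = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        if parts.hostname == own_host and "utm_source" not in parse_qs(parts.query):
            missing.append(href)
    return missing


# Hypothetical page fragment
sample = (
    '<p><a href="https://www.example.com/guide?utm_source=pdf&amp;utm_medium=document">Guide</a> '
    '<a href="https://www.example.com/pricing">Pricing</a> '
    '<a href="https://other.example.org/">Other</a></p>'
)
print(untagged_links(sample, "www.example.com"))
```

Only absolute links to the host you pass in are checked here; relative links and other domains are ignored to keep the sketch short.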

Instruct Custom GPTs to Include UTMs by Default:


For businesses building Custom GPTs and publishing them to the GPT Store for their customers, try setting up custom instructions that direct ChatGPT to include UTM-based links in its results. Specifically, you could write a custom instruction requiring that URLs be generated dynamically with UTM parameters that align with your UTM naming convention.

Is It Safe to Use UTMs for Internal Links Now?


Those of you who have extensive experience with leveraging UTMs for your marketing measurement may recall the long-standing industry best practice of reserving UTMs for inbound traffic (and avoiding them for internal linking). This is because platforms like Universal Analytics historically treated the presence of UTMs in a URL as the start of a new session, so using UTMs for internal linking would skew and inflate session and channel reporting.

However, given that UTM links (wherever they live on your site) are likely to be indexed by ChatGPT, and that GA4 does NOT automatically start a new session when campaign parameters change the way UA did, businesses can experiment with the upside of internal UTM linking with much less fear of disrupting their analytics.

Are our customers there & how do we track them?


The October 2024 release of ChatGPT with Search opens new doors for businesses to capture AI-driven traffic, but accurate tracking hinges on understanding when ChatGPT includes UTM parameters or referral information and which strategies you can use to influence the likelihood of attributable links being supplied. These things change all the time, so who knows how long they will or won’t work. But we know you are going to be getting questions from your leaders in the C-suite asking: are our customers there, and how do we track them? Budgets have to be set, tools need to be selected, and shrugs of “I don’t know how to track that” won’t be acceptable.

 


Jonathan Wehausen
Sr. Lead (Client), Analytics