Tracking Outbound Links — The Right Way

Let’s say you want to run an experiment on a page in the foo.com domain, but you want to
register conversions in some outbound domain, say, bar.com. One way to handle this is to implement a
strategy called Google Analytics Cross Domain Linking as described in this Analytics Help Center Article.

This is a fine solution, but it suffers from two major problems. First, it requires that you have the rights to modify pages in the outbound domain. Second, and frankly (IMHO), it’s kind of a pain in the ass to implement, and it’s very error-prone.

An alternative to tracking the loading of the outbound page is to track the user’s click on the link to the outbound page. This is neatly described in this Analytics Help Center Article.

The only problem with this technique is that it really does not work very well. It suffers from what we in the industry
call a Race Condition. To understand this particular race condition, allow me to describe a little about how browsers work.

When a web browser is loading a page and it encounters something like the following:

... la la la
<img src="a.jpg">
la te da ...

The browser does not stop at the image tag in order to load the image. In fact, it does not even stop there to start loading the image. It simply queues up a request to load that image at some later time. Later, in this case, means really quickly; probably within the next few milliseconds. It may do this with another thread, or simply schedule it within the same thread. The important point is that the HTTP request to the server which services the image does not take place right away.

Now, consider the following HTML:

... One two three
<a href="target.htm">Click Me!</a>
... four five six ...

Here, when a visitor clicks on this link, the mechanism for loading and displaying the target page is very similar to the loading of the image above. A request to start loading that page is queued up, and when the bytes of that response start arriving, the browser erases the current page and starts rendering the new page. This is why after clicking on a link to a “distant” and slow site, you will continue to see the current page until the other responds — there is no good reason to clear the screen on the current page until you have something new to display.

One more vital piece of information needs to be mentioned here. When a page is closed, in this case in favor of loading a new page, all the outstanding resource requests for the current, closing, page (like images) are abandoned. This fact will play a crucial role in our race condition.

Now, let’s look at the code mentioned in the How do I manually track clicks on outbound links? article:

<a href="http://www.example.com" onClick="javascript: pageTracker._trackPageview('/outgoing/example.com');">

The script in the onClick handler is intended to create a Google Analytics event (called /outgoing/example.com) with the _trackPageview operation. _trackPageview works by making a request to Google for an essentially empty image. Along with that request goes the information about the visitor and what should be tracked; this is how Google Analytics gets the information it needs to create reports. Now, just like the loading of an image tag, this request is also queued up by the browser. It’s not requested immediately.
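To make the "request for an empty image" idea concrete, here is a minimal sketch of how a tracking call of this kind can be built. The endpoint, the parameter names, and the helper functions are all made up for illustration; this is not Google Analytics’ actual request format.

```javascript
// Hypothetical endpoint and parameter names, for illustration only.
// This is NOT the real Google Analytics wire format.
var BEACON_ENDPOINT = "https://tracker.example/pixel.gif";

// Build the URL of a tracking image, with the data to record encoded
// in the query string.
function buildBeaconUrl(endpoint, data) {
  var pairs = [];
  for (var key in data) {
    if (Object.prototype.hasOwnProperty.call(data, key)) {
      pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(data[key]));
    }
  }
  return endpoint + "?" + pairs.join("&");
}

// Ask the browser to fetch the image. Assigning src only schedules the
// request; this function returns before any response arrives.
function requestTrackingImage(endpoint, data) {
  var url = buildBeaconUrl(endpoint, data);
  if (typeof Image !== "undefined") { // only available in a browser
    new Image(1, 1).src = url;
  }
  return url;
}

var beaconUrl = requestTrackingImage(BEACON_ENDPOINT, {
  page: "/outgoing/example.com",
  visitor: "abc 123"
});
// beaconUrl is
// "https://tracker.example/pixel.gif?page=%2Foutgoing%2Fexample.com&visitor=abc%20123"
```

The point to take away is that the call returns as soon as the image request has been handed to the browser, not when it has actually gone out over the network.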

Once the call to _trackPageview returns, the browser then starts the request for the outbound resource, “http://www.example.com”, in this case. This too is queued up, and when the response starts to come in, the page will clear and the new page will be rendered.

Now, we have enough information to see where the race condition exists.

If the request for example.com comes back really quickly, it is quite possible that the request for the Google Analytics tracking image has not yet taken place. In fact, some browsers may prioritize requests for images below other requests, like those for other sites. When this happens, all outstanding resource requests for the current page, and I’m thinking about the Google Analytics tracking request in particular, are abandoned. This means that the event which was to be tracked via Analytics is lost to Analytics, as though it never happened.
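The race can be pictured with a toy model. The sketch below is not how any real browser schedules requests; the function name and all the timings are invented, and it only encodes the rule described above: a queued request that has not been issued by the time the new page starts arriving is abandoned.

```javascript
// Toy model of the race. All numbers are invented for illustration.
//   beaconQueueDelayMs: how long the tracking image request sits in the
//                       queue before the browser actually issues it
//   navigationDelayMs:  how long we wait before starting the navigation
//   targetResponseMs:   how long the outbound site takes to start responding
function simulateClick(options) {
  var beaconIssuedAt = options.beaconQueueDelayMs;
  var navigationCommitsAt = options.navigationDelayMs + options.targetResponseMs;
  return {
    beaconIssuedAt: beaconIssuedAt,
    navigationCommitsAt: navigationCommitsAt,
    // The beacon survives only if it was issued before the old page went away.
    beaconSent: beaconIssuedAt < navigationCommitsAt
  };
}

// Fast outbound site, no delay: the page is replaced before the beacon goes out.
var lost = simulateClick({ beaconQueueDelayMs: 30, navigationDelayMs: 0, targetResponseMs: 20 });

// Same site, but the navigation is held back by 100 ms: the beacon gets out first.
var kept = simulateClick({ beaconQueueDelayMs: 30, navigationDelayMs: 100, targetResponseMs: 20 });

console.log(lost.beaconSent, kept.beaconSent); // prints: false true
```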

Bummer.

Give It Some Time …

So, how does one track outbound links properly?

The trick is to give the request for the Google Analytics tracking image enough time to take place. This can be done by delaying the request for the outbound page with the following technique:

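<script type="text/javascript">
function doGoal(that) {
  try {
    // Create a tracker here rather than relying on a global pageTracker.
    var pageTracker = _gat._getTracker("UA-123456-1");
    // Queue up the Analytics tracking request.
    pageTracker._trackPageview("http://www.example.com");
    // Navigate to the link's target 100 ms from now.
    setTimeout('document.location = "' + that.href + '"', 100);
  } catch (err) {}
}
</script>
<a href="http://www.example.com" onclick="doGoal(this);return false;">Click me</a>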

Here, notice that the onClick handler calls a function which, first, does not rely on a global pageTracker object having already been set up. It creates its own tracking object. This reduces the dependency on other scripts running on the page.

Secondly, the return value from the onClick handler is false. This prohibits the browser from following the link as a consequence of the user clicking on the link. This stops the browser from immediately navigating to example.com.

Thirdly, notice the call to the setTimeout function. The setTimeout function’s job is to execute a piece of code at some time in the future, without blocking the current script from continuing to execute. In this case, it’s 1/10 of a second into the future, and the code to execute does essentially what the browser would have done if true (or nothing) had been returned by the onClick handler. Setting the location property of the document object to the outbound href causes the page to navigate to that link.
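If the non-blocking behavior of setTimeout is unfamiliar, this small sketch shows the order in which things happen. The order array is just a name chosen for the illustration.

```javascript
var order = [];

order.push("before setTimeout");

// Schedule a function to run about 100 ms from now. setTimeout returns
// immediately; it does not wait for the timer to fire.
setTimeout(function () {
  order.push("timer fired");
  console.log(order.join(" -> "));
  // prints: before setTimeout -> after setTimeout -> timer fired
}, 100);

// This line runs right away, before the scheduled function.
order.push("after setTimeout");
```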

By delaying the outbound navigation by 1/10 of a second (which is generally not noticed by the user), the browser now has much more time to make the Google Analytics tracking request and the tracking event will be noticed and reported on by Google Analytics.
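For what it’s worth, here is one way the same idea might be written a little more defensively. It is a sketch, not the code from the Help Center article: the names trackThenNavigate, makeOutboundClickHandler, browserDeps and the deps object are all hypothetical. It differs from the snippet above in two ways. The navigation is scheduled even if the tracking call throws, so a missing Analytics script does not leave the visitor stuck on a dead link, and setTimeout is given a function rather than a string of code.

```javascript
// Track a click, then navigate after a delay.
// deps.track, deps.navigate and deps.schedule are passed in so the logic can
// be exercised outside a browser; all of these names are hypothetical.
function trackThenNavigate(href, deps) {
  try {
    deps.track(href);
  } catch (err) {
    // A tracking failure should not stop the visitor from leaving the page.
  }
  // Pass a function, not a string, so no code is built from the URL.
  deps.schedule(function () {
    deps.navigate(href);
  }, deps.delayMs);
}

// Click handler for use with addEventListener. Calling preventDefault()
// cancels the normal navigation, like "return false" in an inline onclick.
function makeOutboundClickHandler(deps) {
  return function (event) {
    event.preventDefault();
    trackThenNavigate(event.currentTarget.href, deps);
  };
}

// Browser wiring. Assumes the same Analytics globals as the snippet above
// (_gat) have been loaded; the account ID and path are placeholders.
function browserDeps() {
  return {
    delayMs: 100,
    track: function () {
      _gat._getTracker("UA-123456-1")._trackPageview("/outgoing/example.com");
    },
    navigate: function (href) { document.location = href; },
    schedule: function (fn, ms) { return setTimeout(fn, ms); }
  };
}

// In a page, with a hypothetical element id:
//   document.getElementById("outbound-link")
//     .addEventListener("click", makeOutboundClickHandler(browserDeps()));
```

Whether 100 milliseconds is enough still depends on the browser and the network, so the delay is best treated as a tunable value rather than a guarantee.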

Spiffy.

The example above applies to tracking outbound links in Analytics, and is trivially adapted to tracking Google Website Optimizer goals as well. You can see an example of a test page that uses this technique to track a click on a link as the goal here.

The essential code in the GWO sample page follows. All that is really different is that the argument to _trackPageview is the token string for the GWO goal.

<script type="text/javascript">
function doGoal(that) {
  try {
    var pageTracker = _gat._getTracker("UA-7250447-1");
    // The argument is the token string for the GWO goal.
    pageTracker._trackPageview("/2353623095/goal");
    setTimeout('document.location = "' + that.href + '"', 100);
  } catch (err) {}
}
</script>
<a href="anotherpage.htm" onclick="doGoal(this);return false;">Click me</a>

Happy Clicking!

 

Comments:

Thanks for that explanation – very nice. Keep the blog posts coming please; they have all been very useful!

Thanks for the explanation of how browsers process a link request.

The setTimeout call fails when the link contains a double quote.

A safer way to implement the doGoal function:

function doGoal(that) {
var navigatePage = function() {document.location=that.href;};
try {
setTimeout(navigatePage, 100);
}catch(err){}
}

Thanks. Especially the link to that GWO test really helped. Just one question: What did you enter as URL of the goal page in GWO? The “goal page” doesn’t really exist, does it? And if it exists, it is TWO pages (the default and the alternative).

You can enter the test page as the goal page in the GWO UI. You are correct in that there really is no goal page. The goal is the click. This technique is used in the absence of a goal page. One will still need to create multiple test pages, either via MVT or A/B.