Some time ago, while performing daily maintenance work for one of our large clients, we noticed that some of their external links had a weird hash in them. Interestingly, all of the affected external links pointed to domains owned by the client. A little digging into the HTML code revealed that the problem was caused by HubSpot.
We reported the issue to HubSpot, and they confirmed that it happened because the client had elected to track outgoing traffic from their main website to sister websites, and that appending these hashes was the only way for HubSpot to do cross-domain tracking. At the time (this was months ago), we told HubSpot that we were concerned that these weird hashes might cause content duplication issues, as they might get indexed by Google. They replied that this shouldn’t be a problem whatsoever: all the pages on the website had canonical URLs, so the hashes would not result in many variations of the same pages, which in turn meant that Google would not index those variations.
Fast forward to a couple of weeks ago (yes – we prepared this post a couple of weeks ago – but we didn’t have the time to publish it until today), when it was a completely different story. The moment we logged in to the Google Search Console of that particular website, we discovered that thousands upon thousands of pages such as http://www.[ourclientjoomlawebsite].com/page-1.html?__hstc=[long-hash]&__hssc=[medium-hash]&__hsfp=[small-hash] (where long hash, medium hash, and small hash were long, medium, and short random strings) were reported as 404 pages. Huh?
After analyzing the issue while in super panic mode, we discovered the sequence of events that was causing this mess:
- The Google bot visits a page on the website.
- The visited page contains a link to a non-existent page on a sister website.
- The link has the HubSpot tracking code appended to it.
- Google tries to visit the link, which results in a 404 error for that link.
- Since the link is a 404 link, the whole canonical concept does not apply to it, which means that Google ends up visiting many variations of that same 404 link, resulting in many 404 errors. This is a huge issue because it highly inflates the 404 count of the website (which has a negative SEO impact).
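To make the amplification concrete, here is a minimal Python sketch (not something we ran on the client's website; the function names and the example.com URLs are hypothetical) that strips the three HubSpot parameters shown above from a list of URLs, such as one exported from a crawl error report, and groups the variations by the underlying link:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The three query parameters that appear in the URLs described above.
HUBSPOT_PARAMS = {"__hstc", "__hssc", "__hsfp"}


def strip_hubspot_params(url):
    """Return the URL without the HubSpot tracking parameters."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in HUBSPOT_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def group_by_underlying_url(urls):
    """Map each cleaned URL to the list of tracked variations seen for it."""
    groups = {}
    for url in urls:
        groups.setdefault(strip_hubspot_params(url), []).append(url)
    return groups


if __name__ == "__main__":
    # Hypothetical sample: two variations of the same broken link.
    reported = [
        "http://www.example.com/page-1.html?__hstc=aaa&__hssc=bb&__hsfp=c",
        "http://www.example.com/page-1.html?__hstc=ddd&__hssc=ee&__hsfp=f",
    ]
    for clean_url, variations in group_by_underlying_url(reported).items():
        print(clean_url, len(variations))
```

Grouping this way shows how many distinct broken links are really behind a long list of 404 entries, which is usually a far smaller number than the raw count suggests.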
Of course, the root of the problem was the 404 link, but the problem was amplified by HubSpot’s tracking code. We brought up this issue with the client, and we presented them with a couple of options:
- We fix the 404 links – which means that we will risk having the same series of events in the case of new 404 links.
- We disable tracking on auxiliary (sister) websites.
The client went with the second option, mainly because they were completely disturbed by all these weird hashes they were seeing on their website. As usual, we happily obliged! Here’s how:
- We logged in to the HubSpot account of the client.
- We clicked on Reports -> Reports Settings.
- Under Domains, we unchecked the Enable checkbox under Automatic cross-domain linking.
- We saved the form and then we checked the website, and hooray, all these annoying hashes were gone!
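If you would rather not eyeball every page after changing the setting, the check can be scripted. The following is a rough Python sketch under a few assumptions: the class and function names are our own invention, and the HTML you feed it must be the rendered page (for example, saved from the browser), since the hashes may be appended by a script in the browser rather than being present in the raw page source.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlsplit

# The three query parameters that appear in the URLs described above.
TRACKING_PARAMS = {"__hstc", "__hssc", "__hsfp"}


class TrackedLinkFinder(HTMLParser):
    """Collect the href of every <a> tag that carries a HubSpot parameter."""

    def __init__(self):
        super().__init__()
        self.tracked_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        query = urlsplit(href).query
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        if names & TRACKING_PARAMS:
            self.tracked_links.append(href)


def find_tracked_links(html):
    """Return all links in the given HTML that still carry tracking hashes."""
    finder = TrackedLinkFinder()
    finder.feed(html)
    return finder.tracked_links
```

An empty list for a page that used to be affected suggests that the setting has taken effect for that page.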
We think that HubSpot’s strategy for tracking outgoing traffic is somewhat flawed: the whole concept of adding weird hashes to links does not fly well on serious websites. Additionally, as we have demonstrated above, this strategy can cause SEO issues.
If you see weird hashes in some of your URLs after adding HubSpot to your website, then you should uncheck the Enable checkbox under Automatic cross-domain linking in HubSpot. If you need help with that, then just contact us and we will gladly do it for you. Our fees are affordable, our Joomla expertise is second to none, and we will strive to be your friends (but not in a cable guy kind of way).