This is a quick follow-up to my guide on how to exclude referral spam from your Google Analytics data. Filters only exclude or modify data from the time you add them, so they have no effect on previous traffic. This is where segments are very useful.
Not only can you use a segment to view a cleaner version of your historical data but you can also test the setup of your filters.
I’ve also found Google’s filter verification option quite unreliable, whereas with a segment you can verify the results yourself and see them straight away.
Here I am going to show how to add segments to include valid hostnames and exclude spam referrals from your data.
Add a segment to include valid hostnames
Creating a filter to include visits from valid hostnames only is the first step you need to take to exclude spam referrals from your Google Analytics data.
Test your valid hostnames regex by first going to Audience > Technology > Network > Hostname.
Create a segment by clicking on ‘Add segment’ and then ‘New Segment’.
Now select the Conditions tab on the left, under Advanced.
Set up your segment with the following condition:
Hostname matches regex (and your regex, e.g. yoursite|googleusercontent; in our case it’s littledata|googleusercontent)
Click on the ‘Preview’ button at the top to check which hostnames you are left with.
Your list should look much cleaner and only display the domains you used in your regex.
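If you want to sanity-check the hostname regex before pasting it into the segment, a rough offline test can help. The sketch below is only an illustration: the sample hostnames are made up, the pattern is the one from the example above, and it assumes the segment condition behaves like a partial match (Python's `re.search`), which you should confirm against the Preview results.

```python
import re

# Pattern from the example above; swap in your own valid hostnames.
VALID_HOSTNAMES = re.compile(r"littledata|googleusercontent")


def split_hostnames(hostnames, pattern=VALID_HOSTNAMES):
    """Return (kept, dropped) lists using partial matching (assumed behaviour)."""
    kept = [h for h in hostnames if pattern.search(h)]
    dropped = [h for h in hostnames if not pattern.search(h)]
    return kept, dropped


# Made-up hostnames, similar to what the Hostname report might list.
sample = [
    "www.littledata.example",
    "translate.googleusercontent.com",
    "spam-site.example",
    "(not set)",
]

kept, dropped = split_hostnames(sample)
print("kept:", kept)
print("dropped:", dropped)
```

Anything that lands in the dropped list here should also disappear from the Hostname report once the segment is applied.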
Add a segment to exclude referral spam
Like before, you want to test this segment while viewing a relevant report, so go to Acquisition > All Traffic > Referrals.
Create a segment with the following details:
Medium exactly matches referral
Source matches regex (and your regex)
Whilst filters have a limit of 255 characters, the advanced segment has much more character space to use. I’ve bundled all spam referrals into one long regex of 900 characters. But as explained in the guide on removing spam traffic, when you set up the filters you might have to break the regex up into multiple expressions or filters to fit them all in.
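Breaking a long regex into filter-sized pieces by hand is fiddly, so here is a minimal sketch of one way to do it. It assumes your regex is a plain list of alternatives joined with `|`; the function name `pack_terms` and the spam domains are hypothetical, and the 255 limit is the filter limit mentioned above.

```python
def pack_terms(terms, limit=255):
    """Greedily pack alternatives into '|'-joined expressions of at most `limit` characters."""
    expressions = []
    current = ""
    for term in terms:
        if len(term) > limit:
            raise ValueError(f"term longer than {limit} characters: {term!r}")
        candidate = term if not current else current + "|" + term
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this term would overflow, so close the current expression.
            expressions.append(current)
            current = term
    if current:
        expressions.append(current)
    return expressions


# Made-up spam sources; a small limit is used here just to show the split.
spam_terms = [r"spam-one\.example", r"spam-two\.example", r"spam-three\.example"]
for expression in pack_terms(spam_terms, limit=40):
    print(len(expression), expression)
```

Each printed expression can then go into its own filter, while the segment can keep the full-length version.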
By adding these two segments you can not only test that your filter setup is accurate but also view your historical data without fake traffic.
If you need help with any of the above, leave a comment below.