Category : Analytics Setup
How to remove referral spam from historical data in Google Analytics
This is a quick follow-up to my guide on how to exclude referral spam from your Google Analytics data. Filters exclude or modify data only from the time you add them and have no effect on previous traffic. This is where segments are very useful. Not only can you use a segment to view a cleaner version of your historical data, but you can also test the setup of your filters. I've also found Google's filter verification option quite unreliable, but with a segment you can verify the results yourself and see them straight away.
Here I am going to show how to add segments to include valid hostnames and exclude spam referrals from your data.
Add a segment to include valid hostnames
Creating a filter to include visits from valid hostnames only is the first step you need to take to exclude spam referrals from your Google Analytics data. Test your valid hostnames regex by first going to Audience > Technology > Network > Hostname. Create a segment by clicking on ‘Add segment’ and then ‘New Segment’. Now select the Conditions tab on the left, under Advanced. Set up your segment with the following conditions:
Sessions, Include, Hostname, Matches regex (and your regex, eg yoursite|googleusercontent; in our case it's littledata|googleusercontent)
Click the ‘Preview’ button at the top to check which hostnames you are left with. Your list should look much cleaner and only display the domains you used in the regex.
Add a segment to exclude referral spam
As before, you want to test this segment while viewing a relevant report, so go to Acquisition > All Traffic > Referrals. Create a segment with the following details:
Sessions, Exclude, Medium exactly matches referral AND Source matches regex (and your regex)
Whilst filters have a limit of 255 characters, the advanced segment has much more character space to use. I've bundled all spam referrals into one long regex of 900 characters. But as explained in the guide on removing spam traffic, you might have to break it up into multiple expressions or filters to fit them all in.
By adding those two segments you can not only test that your filter setup is accurate but also view your historical data without fake traffic.
If you need help with any of the above, leave a comment below or get in touch!
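To make the logic of the two segments concrete, here is a minimal Python sketch that applies the same conditions to a handful of made-up sessions. The session records, the short spam pattern and the function name `keep_session` are all hypothetical, and it assumes the regex matches any part of the value, as the guide on filters describes. It only illustrates the include/exclude logic and is not how Google Analytics evaluates segments internally.

```python
import re

# Hypothetical patterns; substitute your own valid hostnames and spam sources.
VALID_HOSTNAMES = re.compile(r"yoursite|googleusercontent")
SPAM_SOURCES = re.compile(r"guardlink|4webmasters|free-social-buttons")


def keep_session(session):
    """Return True if a session passes both segments."""
    # Segment 1: include only sessions whose hostname matches the regex.
    if not VALID_HOSTNAMES.search(session["hostname"]):
        return False
    # Segment 2: exclude sessions where medium is exactly "referral"
    # AND the source matches the spam regex.
    is_spam = session["medium"] == "referral" and bool(
        SPAM_SOURCES.search(session["source"])
    )
    return not is_spam


# Made-up sessions for illustration only.
sessions = [
    {"hostname": "www.yoursite.com", "medium": "organic", "source": "google"},
    {"hostname": "www.yoursite.com", "medium": "referral", "source": "4webmasters.org"},
    {"hostname": "(not set)", "medium": "referral", "source": "guardlink.org"},
    {"hostname": "translate.googleusercontent.com", "medium": "referral", "source": "partner.example"},
]

clean = [s for s in sessions if keep_session(s)]
for s in clean:
    print(s["hostname"], s["source"])
```

In this sample the crawler hit on the real hostname is dropped by the second segment and the ghost hit with a (not set) hostname is dropped by the first, which is why both segments are needed.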
How to remove referral spam from Google Analytics
The issue of referral spam in Google Analytics exploded in May, when we saw an average of 620 spam sessions per GA property, and just the other week I saw an account where spam accounted for 95% of the traffic! Spam referrals are greatly skewing Google Analytics traffic and becoming a headache for a growing number of people.
Why are these spam sessions appearing in your Google Analytics traffic? To get you to click through to their site and ads (never ever do that, by the way). By targeting thousands of GA accounts like this, you can imagine how much traffic they get from those curious about their new source of visits.
There are two different types of spam referrals you are getting:
Ghost referrals send fake traffic to your GA account by “attacking” random GA property IDs.
Crawler referrals crawl your website to leave a mark in your traffic.
The spam referrals are getting more persistent and clever by targeting other non-referral reports, like www.event-tracking.com appearing in events.
How can you tell it's spam? By seeing unusual activity, odd referral sources, substantial changes in your metrics, and lots of (not set) values in various dimensions, eg hostname and language.
So how do you remove spam referrals from your Google Analytics traffic? There are two filters you need to set up to remove both ghost and crawler spam referrals.
Filters change your traffic permanently, so if you don't have an unfiltered view of your data, create one now. It's good practice to have an unfiltered view that you don't modify, and it allows you to check your filters are working correctly.
We are also working on our own spam filter tool to help people get rid of pesky spam referrals with just a few clicks of a button. We have already released a beta version via our Littledata analytics reporting tool and are developing it further to make it more robust and comprehensive. But if you'd rather do it yourself, keep reading.
Create a filter to include valid hostnames
Since ghost referrals never actually visit the site, the best way to get rid of them is by creating a valid hostname filter. This filter will only allow visits recorded on “approved” hostnames that you consider valid.
First, you will need to identify your valid hostnames by going to the report in Audience > Technology > Network > Hostname. The Hostname report shows the domains where your GA tracking code was fired and helps to troubleshoot unusual traffic sources.
Valid hostnames on the list will be the websites where you inserted the GA tracking code, additional services you use, eg for transactions, or reliable sites used by people to access your site, eg Google Translate. Your reliable hostnames could look like this:
www.yoursite.com
yoursite.com
blog.yoursite.com
translate.googleusercontent.com (user accessing your site via Google Translate)
ecommercepartnersite.com
webcache.googleusercontent.com (user accessing a cached version of your site)
Any other website that you do not recognise, or that looks suspicious, you can safely assume to be a hostname you want to exclude. Beware of any domains that appear as “credible sources”, eg Google, Amazon and HuffingtonPost. They are used to mask the spammers.
If you see a (not set) hostname on your list, this could be because you're sending events to GA that don't have pageviews, for example tracking email opens and clicks. If you are sure you are not sending any such events to GA, you can also exclude any (not set) hostnames.
Now that you have got your valid hostnames, you need a regular expression for a filter that will include your valid hostnames (and thus exclude all the fake ones). It'll look like this:
yoursite|googleusercontent|ecommercepartnersite
In the regex above, the vertical bar | separating each domain means OR. The expression matches any part of the string, so 'yoursite' will match 'blog.yoursite.com' as well as 'www.yoursite.com'.
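The OR and partial-match behaviour described above can be checked quickly in Python. This is only a sketch: the hostname list is made up, and Python's `re.search` stands in for the filter's matching on the assumption that both match anywhere in the string.

```python
import re

# The example expression from the guide: each | means OR.
pattern = re.compile(r"yoursite|googleusercontent|ecommercepartnersite")

# Made-up hostnames, similar to what the Hostname report might list.
hostnames = [
    "www.yoursite.com",
    "blog.yoursite.com",
    "translate.googleusercontent.com",
    "ecommercepartnersite.com",
    "free-social-buttons.com",
    "(not set)",
]


def split_hostnames(names):
    """Separate hostnames the include filter would keep from those it drops."""
    valid = [h for h in names if pattern.search(h)]
    rejected = [h for h in names if not pattern.search(h)]
    return valid, rejected


valid, rejected = split_hostnames(hostnames)
print("kept:", valid)
print("dropped:", rejected)
```

One consequence of partial matching is that 'yoursite' would also match a hostname such as 'notyoursite.com', so pick a fragment that is distinctive enough for your domain.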
You can test your regex at http://regexpal.com/ by inserting your expression at the top and all the URLs at the bottom. All matches will be highlighted, so you can see straight away whether you have included all your valid hostnames correctly.
Before adding the valid hostname filter in the settings, test it with an advanced segment. The results on the screen should now show only your valid hostnames, without the spammers.
If all looks good, create a filter by going to Admin > View > Filters > New Filter. This will add a filter for that specific view only. If you want to add the same filter to more than one view, then check the details below.
Pick a custom filter, select 'Include' and select 'Hostname' from the filter field menu. Now enter your regex into the filter pattern field and click save.
Want to apply a filter to multiple views?
Then go to Admin > Account > All Filters > New Filter. The setup is exactly the same as above, except now you will see a section at the bottom titled 'Apply Filter to Views'. Select the views you want to apply the filter to and move them to the right-hand box by clicking the 'Add' button in the middle. You're all set, so click save.
Add a filter to exclude campaign source
Some of the known spam referrals are free-social-buttons, guardlink.org, 4webmasters.org and, most recently, the ironically named howtostopreferralspam.eu.
Excluding spam referrals with a campaign source filter is one of the most commonly mentioned methods online. This filter will exclude any referrer spam from the moment you add the filter (not from your historical data). The downside is that every time a new spam referral appears in your Google Analytics data you will have to add it to the existing filter, or create a new one if you’ve run out of character space (a filter pattern allows only 255 characters).
You can identify your spam referrals by going to the referrals report found in Acquisition > All Traffic > Referrals.
To save you some time, I have included the regexes we use below so you can copy them. Make sure you double-check your referrals report against our list to see if there are any that haven't appeared in our reports yet. If you find a source not listed below, simply add it to the end and let us know in the comments.
Just as you set up the filter to include valid hostnames only, now you need to add a filter to exclude spam referrals. We use the following regular expressions to filter out spam (yes, that's four filters):
guardlink|event-tracking|vitaly rules|pornhub-forum|youporn-forum|theguardlan|hulfingtonpost|buy-cheap-online|Get-Free-Traffic-Now|adviceforum.com|aliexpress.com|ranksonic
kabbalah-reg-bracelets|webmaster-tools|free-share-buttons|ilovevitaly|cenoval|bestwebsitesawards|o-o-6-o-o|humanorightswatch|best-seo-offer|4webmasters|forum69.info|webmaster-traffic|torture.ml|amanda-porn|generalporn
depositfiles-porn|meendo-free-traffic|googlsucks|o-o-8-o-o|darodar|buttons-for-your-website|resellerclub|blackhatworth|iphone4simulator.com|sashagreyblog|buttons-for-website|best-seo-solution|searchgol|howtostopreferralspam
100dollars-seo|free-social-buttons|success-seo.com|videos-for-your-business.com
The reason the majority of the sites above do not include org/com/etc is that, for these sites, I have concluded there are no genuine sites with similar names (or none that I could find) that would send our site traffic. So it is safe to exclude these sites by name only. By contrast, there are many sites with adviceforum in their name, so to avoid excluding any potentially genuine sites that are called adviceforum, I only exclude the one spam referral I saw in my traffic: adviceforum.com.
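If your own list of spam sources keeps growing, splitting it by hand to stay under the 255-character limit gets tedious. The following Python sketch shows one way to do it: it greedily joins sources with | and starts a new expression whenever the next source would not fit. The function name `pack_patterns` and the short sample list are hypothetical, and the small limit in the usage example is only there to make the split visible.

```python
MAX_PATTERN_LENGTH = 255  # filter pattern limit mentioned in the guide


def pack_patterns(sources, limit=MAX_PATTERN_LENGTH):
    """Greedily join sources with '|' into as few patterns as fit the limit."""
    patterns = []
    current = ""
    for source in sources:
        if len(source) > limit:
            raise ValueError(f"{source!r} alone exceeds {limit} characters")
        candidate = f"{current}|{source}" if current else source
        if len(candidate) <= limit:
            current = candidate
        else:
            # The next source does not fit, so close this pattern and start a new one.
            patterns.append(current)
            current = source
    if current:
        patterns.append(current)
    return patterns


# A few sources from the list above; a tiny limit makes the split easy to see.
sources = ["guardlink", "event-tracking", "4webmasters", "darodar", "free-social-buttons"]
patterns = pack_patterns(sources, limit=40)
for p in patterns:
    print(len(p), p)
```

Each resulting expression would go into its own exclude filter. Bear in mind that an unescaped dot in a regex matches any character, so an entry like forum69.info is slightly broader than the literal domain.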
If you notice that you have referral traffic from addons.mozilla.org but don't actually have an addon on Mozilla, then you should add addons.mozilla.org (more commonly known as ilovevitaly) to the list above in this format: addons.mozilla.org
Select Campaign Source in the filter field menu and enter your regex into the filter pattern field. Repeat the process until you have created all four (or more) filters.
This will help to clean up your Google Analytics data, but you have to keep checking for any new spam referrals to add to the exclude filter. You can use advanced segments to view your historical reports without spam referrals.
If you need help with any of the above or have further questions, don't hesitate to let me know in the comments.
Further reading:
5 common Google Analytics setup problems
How to remove referral spam from historical data
5 myths of Google Analytics Spam
Google Analytics referral spam is a growing problem, and since Littledata has launched a feature to set up spam filters for you with one click, we’d like to correct a few myths circulating.
1. Google has got spam all under control
Our research shows the problem exploded in May, and it is likely to get worse as the tactics get copied. From January to April this year, there were only a handful of spammers, generally sending one or two hits to each web property, just to get on their reports. In May, this stepped up over one thousand-fold, and over a sample of 700 websites, we counted 430,000 spam referrals: an average of 620 sessions per web property, and enough to skew even a higher-traffic website. The number of spammers using this tactic has also multiplied, with sites such as ‘4webmasters.org’ and ‘best-seo-offer.com’ especially prolific. Unfortunately, due to the inherently open nature of Google Analytics, where anyone can start sending tracking events without authentication, this is really hard for Google to fix.
2. Blocking the spam domains from your server will remove them from your reports
A few articles have suggested that changing your server settings to exclude certain referral sources or IP addresses will help clear up the problem. But this misunderstands how many of these ‘ghost referrals’ work: they are not actual hits on your website, but rather tracking events sent directly to Google’s servers via the Measurement Protocol. In this case, blocking the referrer from your own servers won’t do a thing, since the spammers can just go directly to Google Analytics. It's also dangerous to amend the htaccess file (or equivalent on other servers), as it could prevent a whole lot of genuine visitors seeing your site.
3. Adding a filter will remove all historic spam
Filters in Google Analytics are applied at the point that the data is first received, so they only apply to hits received AFTER the filter is added. They are the right solution for preventing future spam, but won’t clean up your historic reports. To do that you also need to set up a custom segment, with the same source exclusions as the filter. You can set up an exclusion segment by clicking 'Add Segment' and then the red 'New Segment' button on the reporting pages, and setting up a list of conditions that mirror your filters.
4. Adding the spammers to the referral exclusion list will remove them from reports
This is especially dangerous, as it will hide the problem without actually removing the spam from your reports. The referral exclusion list was set up to prevent visitors who went to a different domain as part of a normal journey on your website being counted as a new session when they returned. For example, if the visitor is directed to PayPal to pay, and then returns to your site for confirmation, then adding 'paypal.com' to the referral exclusion list would be correct. However, if you add a spam domain to that list, then the visit will disappear from your referral reports but will still be included under Direct traffic.
5. Selecting the exclude known bots and spiders in the view setting will fix it
Google released a feature in 2014 to exclude known bots and spiders from reports. Unfortunately, this is mainly based on IP addresses, and the spammers, in this case, are not using consistent IP addresses, because they don't want to be excluded. So we do recommend opting into the bot exclusion, but you shouldn't rely on it to fix your issue.
Need more help? Comment below or get in touch!
Setting up a destination goal funnel in Google Analytics
Destination goal funnels in Google Analytics track how well certain actions on your website contribute to the success of your business. By setting up a goal for each crucial activity you will get more focused reports on how visitors are using your website, and at what stage they are dropping out of the conversion funnel.
The first time I tried to set up a destination goal was daunting, but after some practice, I am now seeing valuable information on how well visitors are interacting with our clients' websites.
If, like Teachable, you have different subscription packages, then you might want to track how each subscription is converting. For this, set up the purchase confirmation page of each subscription plan as a goal, with a funnel to get additional insight into where people drop off.
Step 1: Create a new goal
To set up a destination goal go to Google Analytics Admin settings > View > Goals. Click ‘new goal.’
Step 2: Fill in destination goal details
Google has some goal templates that provide set-up suggestions. They will only display if you have set your industry category in property settings. Selecting any of the given templates will only populate the name and type of the goal, but not the conversion details, which are the more complicated part for some. This is not very useful for me, so I will ignore the templates: select ‘custom’ and click ‘next.’
Goal name
Give your goal a descriptive name. You will later see it in various reports in Google Analytics, so use whatever makes sense for you. Here I am going to use the name of the subscription plan I am tracking: Basic Subscription.
Goal slot ID
The goal slot ID is set automatically, and you might want to change it if you want to categorise your goals.
Select ‘Destination’ and click ‘next step.’
Step 3: Define your destination goal
Destination type
You have a choice between 3 different match types. If you have an exact URL that does not change for different customers (without '?=XXX'), then use ‘Equals to’ for an exact match. If the beginning of your converting URL is the same, but there are different numbers or characters at the end of the URL for various customers, then choose ‘Begins with.’ Use ‘Regular expression’ to match a block of text within the URL. For example, if all your subscriber URLs have 'subscriber_id=XXX' somewhere, then type 'subscriber_id=' into the text field. You can also use ‘Regular expression’ if you need to match multiple URLs and know how to use special characters to build a regex. One of our favourite tools to test regular expressions is Regex Tester.
The match type you select here will also apply to the URLs in the funnel, if you choose to create one.
Destination page
The destination page is the URL where the conversion occurs. For Teachable, and most other websites that sell something online, the destination is usually a ‘thank you’ page that is displayed after a successful purchase. You might also have a thank you page for contact forms and newsletter signups, which you would track the same way as a payment thank you page.
Here you insert the request URI, which is the part of the URL that comes after the domain address. It would look something like this:
/invoice/paid
/thank you.html
/payment/success
Step 4: Should you set a goal value? (optional)
You can set a monetary value for your goal if you want to track how much it contributes. For example, if the goal is visitors completing a contact form, and you know the average lead generates you £100, then you can put the value at 100. If you are an ecommerce site and want to track exact purchases, then set up enhanced ecommerce tracking instead.
Step 5: Should you set up a funnel? (optional)
If you have several steps leading up to the conversion, you should set up a funnel to see how many people move through each defined step and where they fall out.
If you do not set the first step as 'required', Google Analytics will also track people coming into the funnel halfway through. For example, if the first stage of your funnel is the homepage, then it will still include visitors who land straight on your contact page.
Verify
Now that you have set up your destination goal, click ‘verify the goal’ to check it works. If all is set up correctly, you should see an estimate of the conversion rate your goal would get. If you do not get anything, then check each step carefully.
Once all is well, click ‘create goal’ and check it is working after a few days or a week, depending on how much traffic you get. If you set up a funnel, you will see it in Conversions > Goals > Funnel Visualisation. In my funnel I did not set the first step as 'required', so it shows people entering the funnel at various steps.
Need more help? Get in touch or comment below!
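To see how the three match types from Step 3 differ, here is a small Python sketch that applies each rule to a request URI. The function `goal_matches` and the labels "equals", "begins_with" and "regex" are hypothetical names chosen for illustration, and the sample URIs are made up. Google Analytics may differ in details such as case sensitivity, so treat this as an approximation of the rules as described above.

```python
import re


def goal_matches(match_type, pattern, request_uri):
    """Return True if request_uri satisfies the destination rule."""
    if match_type == "equals":
        # 'Equals to': the whole request URI must be identical.
        return request_uri == pattern
    if match_type == "begins_with":
        # 'Begins with': anything may follow the fixed beginning.
        return request_uri.startswith(pattern)
    if match_type == "regex":
        # 'Regular expression': the pattern may match anywhere in the URI.
        return re.search(pattern, request_uri) is not None
    raise ValueError(f"unknown match type: {match_type}")


checks = [
    ("equals", "/payment/success", "/payment/success"),
    ("equals", "/payment/success", "/payment/success?id=42"),
    ("begins_with", "/payment/success", "/payment/success?id=42"),
    ("regex", "subscriber_id=", "/account?plan=basic&subscriber_id=42"),
]
for match_type, pattern, uri in checks:
    print(match_type, uri, goal_matches(match_type, pattern, uri))
```

The second check shows why an exact match is a poor fit when the URL carries a per-customer query string: the rule only counts the conversion when nothing at all is appended.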
How to audit your Web Analytics Ecommerce tracking
5 common Google Analytics setup problems
Can you rely on the data you are seeing in Google Analytics? If you use it daily in your business you should really give some time to auditing how the data is captured, and what glitches could be lurking unseen. The notifications feature in Google Analytics now alerts you to some common setup problems, but there are more simple ones you could check today.
Here are 5 aspects of your Google Analytics account to check now, plus one advanced check:
Are you running the latest Universal Analytics tracking code?
Is your overall bounce rate below 10%?
Are you getting referrals from your own website?
Are you getting ‘referrals’ from your payment gateway?
Have you got the correct website default URL set in GA?
Are you getting full referring URL in reports?
1. Are you running the latest Universal Analytics tracking code?
You may have clicked upgrade in the Google Analytics admin console, but have your developers successfully transferred over to the new tracker code? Use our handy tool to test for Universal Analytics (make sure you copy your URL as it appears in the browser bar).
2. Is your overall bounce rate below 10%?
The 'bounce rate' is the share of sessions that include only one page view. It’s highly unlikely to be in single digits unless you have an unusually engaged source of traffic. However, it is possible that the tracking code is firing twice on a single page. This double counting would mean Google Analytics sees every single page view as two pages, i.e. not a bounce. This is more common on template-driven sites like WordPress or Joomla, where you may have one tracking script loaded by a plugin and another pasted onto the main template page. You can check if you have multiple pageviews firing by using the Google Tag Assistant plugin for Chrome.
3. Are you getting referrals from your own website?
A self-referral is traffic coming from your own domain, so if you are www.acme.com, then a self-referral would appear as ‘acme.com’. Have a look at the (recently moved) referrals list and see if that is happening for you. This is usually caused by having pages on your website which are missing the GA tracking code, or have it misconfigured. You can see exactly which pages are causing the problem by clicking on your domain name in the list and seeing the referring path. If you are on Universal Analytics (please use our tool to check) you can exclude these referrals in one step with the Referral Exclusion list. For a fuller explanation, see the self-referral guide provided by Google.
4. Are you getting ‘referrals’ from your payment gateway?
Similar to point 3: if you have a 3rd party payment service where customers enter their payment details, then after they are redirected back to your site (if you are on Universal Analytics) they will show up as a new visit, but one originating from ‘paypal.com’ or ‘worldpay.com’. You need to add any payment gateway or similar 3rd party services to that referral exclusion list. Just add the domain name, so PayPal would be 'paypal.com'.
5. Have you got the correct website default URL set in GA?
When Google Analytics was first set up for your website you may have set a different domain name from the one you now use. Or maybe you have switched to run your site on https:// rather than http://. In either case you need to change the default URL as set up in the admin page. For this go to Admin > Property > Property Settings. Once that is set up correctly, the ‘All Pages’ report becomes a lot more useful, because you can click through to view the actual page using the open link icon.
Advanced: Are you getting full referring URL in reports?
If you run your website across different subdomains (e.g. blog.littledata.co.uk and www.littledata.co.uk) then it can be difficult to tell which subdomain a page was on. The solution to this is to add the hostname to the URL using a custom filter. See the guide on how to view full page URLs in reports.
What other setup issues are you experiencing? Let us know in the comments or by tweeting @LittledataUK.
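As a footnote to point 2, the effect of a double-firing tracking code on bounce rate is easy to see with a little arithmetic. The sketch below uses made-up pageview counts for ten sessions and a hypothetical `bounce_rate` helper; it simply treats a bounce as a session with exactly one recorded pageview, as described above.

```python
def bounce_rate(pageviews_per_session):
    """Share of sessions that recorded exactly one pageview."""
    if not pageviews_per_session:
        return 0.0
    bounces = sum(1 for views in pageviews_per_session if views == 1)
    return bounces / len(pageviews_per_session)


# Hypothetical sample: pages actually viewed in each of ten sessions.
true_sessions = [1, 1, 1, 1, 1, 1, 2, 3, 4, 5]

# If the tracking code fires twice per page, every recorded count doubles.
double_counted = [views * 2 for views in true_sessions]

print(bounce_rate(true_sessions))   # 0.6
print(bounce_rate(double_counted))  # 0.0
```

With every pageview recorded twice, no session can have exactly one pageview, so a real 60% bounce rate is reported as zero.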
6 helpful Google Analytics guides
I've been improving my knowledge of Google Analytics this month but found that the documentation provided by Google, and other in-depth research, can be difficult to absorb. So here are 6 guides and tools that I found useful in the last month.
How to set up campaign tracking
Expertise level: Newbie
Social media analytics: How to track your marketing campaigns by Cory Rosenfield.
When you run an ad, email or social promotion, you want to see which channel is most effective in acquiring visitors. By gathering this information through tracking your campaigns you will be able to focus on winning strategies and make adjustments to weaker ones. Cory’s how-to guide takes you through the basics of how to set up campaign tracking with relevant explanations and practical examples. It’s as easy as it gets.
What metadata needs fixing
Expertise level: Beginner
Introducing the Meta and Rich Snippet Tester by Bill Sebald.
This tester from RankTank compares your site’s meta and rich snippet data to what you have in your site’s code. You will be able to see mismatches between how you have set your titles and descriptions and what is actually displayed in search results. Want to make sure rich snippets are working correctly, or that Google doesn’t replace missing meta tags with something unsuitable? Then this tool is for you.
How to do keyword research effectively
Expertise level: Intermediate
Keyword research in 90 minutes by Jeremy Gottlieb.
Keyword research for improved content targeting can take a lot of time, but it doesn’t have to. Jeremy’s plan splits it into a 4-stage process, full of handy tips on how to spend your time effectively. It is especially useful when planning topics for your blog posts and finding the words that are most relevant to include in your product descriptions.
Setting up alerts for site errors
Expertise level: Intermediate
Google Analytics custom alerts which you must always use by Himanshu Sharma.
How can you find errors and problems on your website with minimum manual labour? Set up custom alerts in your Google Analytics account with Himanshu's guide. You can create notifications for tracking and shopping cart issues, and any unusual changes in your bounce rate and traffic.
How to improve multiscreen experience
Expertise level: Advanced
Enabling multiscreen tracking with Google Analytics by James Rosewell.
This step-by-step guide by James shows how to get better data on the use of your site across various mobile devices. You will be able to make informed decisions on optimising your site whilst taking into consideration screen sizes and layouts. This means an improved experience for customers on bigger smartphones and smaller tablets.
Source: Infinium.co
What were the different variables again?
Expertise level: Advanced
Variable guide for Google Tag Manager by Simo Ahava.
Variables in Google Tag Manager can be powerful, once you get to grips with them. Simo's comprehensive guide is a useful reference that covers everything you need to know, from technical details to setups and debugging.
Source: SimoAhava.com
Need some help with Google Analytics? Get in touch with our experts!