A Super Guide To Blocking Referrer Spam In Your Google Analytics
The situation has been getting worse over the years, which suggests that someone somewhere makes a lot of money from creating referral spam.
Ghost and Referral Spam
Spam has now made its way into Google Analytics reports. Spammers look for vulnerabilities in the system so that they can appear in a website's data reports. They do this in the hope of sparking enough curiosity that the webmaster visits their website to see why it is in the report. The problem is that they do not increase traffic. They never even reach the site, since they are bots. They exploit the JavaScript tracking code used by Google Analytics to register a visit that never happened. They end up skewing vital statistics such as bounce rate and other metrics used to analyze engagement. Blocking referral spam is imperative for anyone who needs accurate data, especially when marketing decisions rely on it.
Referral spam is hard to block because spammers work fast, increasing both the rate of spam hits and the number of sources. Webmasters therefore need to put more effort into identifying and blacklisting these sources. The problem is particularly troublesome for new sites that do not yet receive much legitimate traffic. On such sites, spam skews the data more heavily, and spam hits may even outnumber the legitimate daily visits.
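To make the skew concrete, here is a small sketch with made-up numbers. The figures are assumptions for illustration, not measurements: a new site with 40 real sessions a day, 16 of which bounce, that also receives 60 spam hits. The sketch further assumes that every spam hit is recorded as a bounced, single-page session.

```python
def blended_bounce_rate(real_sessions, real_bounces, spam_sessions):
    """Bounce rate as reported when spam hits are mixed in with real sessions.

    Assumes every spam hit counts as a bounced, single-page session.
    """
    total_sessions = real_sessions + spam_sessions
    total_bounces = real_bounces + spam_sessions
    return total_bounces / total_sessions


# Hypothetical daily figures for a new site.
real_rate = blended_bounce_rate(40, 16, 0)       # without spam
reported_rate = blended_bounce_rate(40, 16, 60)  # with 60 spam hits

print(f"real bounce rate:     {real_rate:.0%}")      # 40%
print(f"reported bounce rate: {reported_rate:.0%}")  # 76%
```

With these numbers the spam hits outnumber the real sessions, and the reported bounce rate nearly doubles even though visitor behavior has not changed.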
How Easy Is It?
One page load records as a single visit. Ghost spammers use the Google Analytics tracking code to send traffic data straight to the reports, thereby forging a visit. It may take 0.001 seconds to load a single page on a server somewhere. In that time, however, they may have forced over 100 of these forged visits onto the Google Analytics accounts of many other sites. It is quite easy to buy a single host. As long as the spammers are sure of a return on investment, they can do a lot of damage with it.

Solutions that Come Up Short
Some spam techniques are so advanced that the usual solutions for blocking referral spam do not work. One example is the mysterious online service called Darodar. The following methods did not clear it from Google Analytics (GA).
- The .htaccess file. It does not work, since ghost spam never touches the site.
- The referral exclusion list. It lacks updates.
- Exclusion filters. This is an outdated method, since it only affects future spam and is not retroactive for spam already in past data.
The exclusion filter came close to eliminating the Darodar referral spam. Its only limitation was the lack of a consistently updated referral spammer list.
The Missing Puzzle Piece
An actionable solution for identifying and blocking referral and ghost spam should be kept up to date, draw on a broad database, and apply retroactively to past data. Based on these three requirements, here is one that works.
Step 1: Using Segments to Exclude Spam
It is better to use segments, since they do not alter data permanently. If one accidentally excludes real referrers with a filter, there is no way of getting them back. Segments also work on old data, however long it has been there, because they can be applied retroactively.
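The idea can be sketched outside Google Analytics as follows. This is a minimal illustration, not the GA implementation: the row layout, the dates, and the second spam domain are hypothetical, and the functions `build_exclusion_regex` and `apply_segment` are names made up for this example. It shows a list of spam sources being turned into one regular expression and applied to historical rows without modifying them.

```python
import re

# Hypothetical spam sources. Darodar is the one named in this article;
# the second entry is a made-up placeholder.
SPAM_SOURCES = ["darodar.com", "spamexample.test"]


def build_exclusion_regex(domains):
    """Join domains into one alternation pattern, escaping the dots."""
    return "|".join(re.escape(d) for d in sorted(domains))


def apply_segment(rows, pattern):
    """Return a filtered copy of the rows; the original list is untouched."""
    matcher = re.compile(pattern, re.IGNORECASE)
    return [row for row in rows if not matcher.search(row["source"])]


# Hypothetical historical report rows, as they might look after an export.
history = [
    {"date": "2015-01-03", "source": "google", "sessions": 12},
    {"date": "2015-01-03", "source": "forum.darodar.com", "sessions": 85},
    {"date": "2015-01-04", "source": "twitter.com", "sessions": 4},
]

pattern = build_exclusion_regex(SPAM_SOURCES)
clean = apply_segment(history, pattern)

print(pattern)
print([row["source"] for row in clean])
```

Because `apply_segment` returns a new list, a source excluded by mistake can be recovered simply by correcting the pattern and applying it again. That is the property that makes segments safer than filters.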
Step 2: Maintaining the Exclusion List
Slack is a tool that webmasters can use to monitor referral sources. It notifies the user of any new referrals and prompts them to whitelist or blacklist each suspicious source.
1. Slack receives all referrals.
2. A PHP script sorts the results by count and then loops through the final list to see whether any source looks familiar. If not,
3. It forwards all the suspected spam to a Slack channel, which offers the user a choice between whitelisting and blacklisting. Whichever option they choose leads to step 4.
4. It redirects to a page that confirms the selection.
5. Slack then stores and locks all identified spammers in the database.
6. The final list is displayed in regex format. Copy and paste it into Google Analytics.
Slack allows webmasters to update the exclusion list at least five times a day.
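The core of this workflow can be sketched in a few lines. The original uses PHP and a Slack channel. The version below is a Python stand-in that models only the sorting, the verdicts, and the regex export. The Slack messaging and the database are left out, and all function names and sample referrers are hypothetical.

```python
import re
from collections import Counter


def unknown_referrers(referrals, whitelist, blacklist):
    """Sort referrers by hit count and keep those with no verdict yet."""
    counts = Counter(referrals)
    return [
        (source, hits)
        for source, hits in counts.most_common()
        if source not in whitelist and source not in blacklist
    ]


def record_verdict(source, verdict, whitelist, blacklist):
    """Store the webmaster's choice; verdict is 'whitelist' or 'blacklist'."""
    if verdict == "whitelist":
        whitelist.add(source)
    elif verdict == "blacklist":
        blacklist.add(source)
    else:
        raise ValueError(f"unknown verdict: {verdict!r}")


def blacklist_regex(blacklist):
    """Render the blacklist as a regex that can be pasted into a segment."""
    return "|".join(re.escape(s) for s in sorted(blacklist))


# Hypothetical day of referral hits.
referrals = ["google.com"] * 5 + ["darodar.com"] * 9 + ["newblog.test"] * 2
whitelist, blacklist = {"google.com"}, set()

# Steps 2 and 3: find sources that still need a decision.
pending = unknown_referrers(referrals, whitelist, blacklist)

# Steps 3 to 5: the webmaster's answers, hard-coded here for illustration.
record_verdict("darodar.com", "blacklist", whitelist, blacklist)
record_verdict("newblog.test", "whitelist", whitelist, blacklist)

# Step 6: export the result in regex format.
print(pending)
print(blacklist_regex(blacklist))
```

Keeping a whitelist alongside the blacklist means each source only needs a decision once, so the prompts are limited to sources that are new.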
In Reality, Several Solutions Can Work:
Although this is a proven method, it works even better when the webmaster supplements it with other techniques to cover all bases. In addition to the solution above:
- Click on the checkbox that prompts Google Analytics to exclude known bots and spiders,
- Apply an "include hostname filter,"
- Use cookies
The include hostname filter mentioned above is sometimes effective, but it is not the best solution in the long run because:
- Hostname spoofing is not difficult to do, and analytics spammers are increasingly exploiting it.
- If the setup is wrong, it might end up filtering out real referrers.
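Both weaknesses can be seen in a small sketch of how an include hostname filter behaves. This is an illustration under assumptions, not Google Analytics code: the hostnames, the hit records, and the helper names are hypothetical.

```python
import re

# Hypothetical hostnames that legitimately serve the tracking code.
VALID_HOSTNAMES = ["example.com", "www.example.com", "shop.example.com"]


def build_hostname_filter(hostnames):
    """Anchored pattern that matches only the listed hostnames exactly."""
    alternatives = "|".join(re.escape(h) for h in hostnames)
    return re.compile(rf"^(?:{alternatives})$", re.IGNORECASE)


def keep_hit(hit, hostname_filter):
    """An include filter keeps a hit only when its hostname matches."""
    return bool(hostname_filter.match(hit.get("hostname", "")))


hostname_filter = build_hostname_filter(VALID_HOSTNAMES)

# Hypothetical hits covering the four interesting cases.
hits = [
    {"hostname": "www.example.com", "referrer": "google.com"},    # real visit
    {"hostname": "(not set)", "referrer": "darodar.com"},         # ghost hit
    {"hostname": "www.example.com", "referrer": "darodar.com"},   # spoofed hostname
    {"hostname": "blog.example.com", "referrer": "twitter.com"},  # host missing from the list
]

kept = [hit for hit in hits if keep_hit(hit, hostname_filter)]
print([hit["referrer"] for hit in kept])
```

The ghost hit with no valid hostname is dropped as intended. The spoofed hit passes because it claims a valid hostname, while the real visit to `blog.example.com` is lost because that host was never added to the list.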