Semalt claims to be an SEO and Analytics tool, but word around the web suggests they are, at best, a cause of referral spam.
Referral spam, in general, is when a robot (think Googlebot, the site crawler that helps index your site in Google’s search results) is sent to check out your website but pretends to be a person by sending falsified header data.
This is typically done by falsifying the Referer (the HTTP spelling of “referrer”) and User Agent parts of the request headers.
The User Agent header tells the website and tools like Google Analytics what the visitor used to access the site.
Here’s a User Agent example from Chrome running on Windows 8.1:
Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.93 Safari/537.36
Googlebot, your friendly neighborhood web crawler and search indexer, identifies itself with User Agents that look like this:
- Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
- Googlebot/2.1 (+http://www.googlebot.com/bot.html)
- Googlebot/2.1 (+http://www.google.com/bot.html)
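To make the difference concrete, here is a minimal Python sketch that checks whether a User Agent string announces itself as Googlebot. The helper name `claims_to_be_googlebot` and the pattern are made up for illustration; they are not part of any Google or Analytics tool.

```python
import re

# Hypothetical pattern: matches the "Googlebot/2.1"-style token that appears
# in the example User Agent strings listed above.
GOOGLEBOT_TOKEN = re.compile(r"Googlebot/\d+\.\d+", re.IGNORECASE)


def claims_to_be_googlebot(user_agent: str) -> bool:
    """Return True if the User Agent string announces itself as Googlebot."""
    return bool(GOOGLEBOT_TOKEN.search(user_agent))


chrome_ua = (
    "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/40.0.2214.93 Safari/537.36"
)
googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(claims_to_be_googlebot(chrome_ua))     # False
print(claims_to_be_googlebot(googlebot_ua))  # True
```

Keep in mind the header is self-reported. This only tells you what a visitor claims to be, so a bot that sends the Chrome string above would sail right past a check like this.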
The reason that Semalt appears in your Google Analytics? It pretends to be a human better than most bots.
It’s up in your business, pretending to be a real visitor and possibly perpetuating further spam using your site. At minimum, it’s mucking up your data.
How to handle this?
You can deal with Semalt and bot traffic in one of three ways (or all three, honestly).
Option 1: Use GA’s autobot blocker
No, it’s not going to prevent the Transformers from finally showing up to give you the birthday you’ve wanted since you were five.
Go to the View settings for your website’s GA account. Check the Bot Filtering box. Save settings.
If you decide to use this feature, you should make sure to create a “Raw Data” View without this – or any other – filter in place. This will give you a point of comparison in the event your data looks really unusual after implementing a major filter change.
Option 2: Create a filter
Creating Filters in Google Analytics sounds a little scary, but it’s not.
Step 1: In the Admin area, open Filters for the View you want to clean up and add a new Filter.
Step 2: Name your Filter. Something like “Semalt” will do just fine.
Step 3: Click on Custom and leave Exclude selected
Step 4: Select Referral from the Filter Field list
Step 5: Type semalt\.com into the Filter Pattern box. (The \ is important because this field uses Regular Expressions!)
Step 6: SAVE!
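If you’re wondering what the backslash in Step 5 actually buys you, here is a small Python sketch. It uses Python’s `re` module as a stand-in for the Analytics pattern matcher, which is an assumption, but for a pattern this simple the behavior should be the same. The third referrer is a made-up example.

```python
import re

escaped = re.compile(r"semalt\.com")   # the pattern from Step 5
unescaped = re.compile(r"semalt.com")  # same pattern without the backslash

# The last entry is a hypothetical domain used only to show the difference.
referrers = ["semalt.com", "semalt.semalt.com", "semaltXcom.example"]

for referrer in referrers:
    print(referrer, bool(escaped.search(referrer)), bool(unescaped.search(referrer)))
```

Without the backslash, the dot means “any character”, so the unescaped pattern also matches the unrelated third entry. With it, only referrers that literally contain semalt.com are caught.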
Note: If you’re using Filters, you absolutely should be setting up a Raw Data View! Making mistakes with Filters can throw your data off pretty badly, and having a Raw Data (aka no Filter) View gives you a way to sanity check your data and provides a backup in case data is lost to Filter mishaps.
This filter will not change your historical data, sorry. You’re stuck with your prior Semalt traffic in your reports. But from the time you make the Filter live onwards, Semalt as a Source/Referrer will no longer appear in your reports. This only cleans up your reports; it doesn’t keep Semalt from crawling your site.
Option 3: Knock knock! Who’s there? NOT YOU! (AKA the htaccess option)
So, you’ll need to actually be able to edit your site’s .htaccess file, and it helps to have a basic idea of how to use it (or at least not break it).
But the good news is, blocking them is as simple as adding the following to the bottom of your .htaccess:
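# Flag any request whose Referer contains semalt.com (case-insensitive regex)
SetEnvIfNoCase Referer semalt\.com spammer=yes
# Apache 2.2-style access rules: allow everyone, then deny flagged requests
Order allow,deny
Allow from all
Deny from env=spammer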
(Source: WordPress Codex example)
If that’s not working – and you can check that Raw Data View you set up to figure out if it is or not 😉 – or you’re getting a bunch of new garbage, then it’s time to hit more .htaccess resources. Your web host may be able to help you out, too.
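If you’d rather not wait on Analytics data to find out, one quicker check is to send your own site a request with a spoofed Referer and look at the status code that comes back. Here is a minimal Python sketch using only the standard library. The function names `build_request` and `status_for` are made up for this example, and it assumes your server answers a denied request with a 403.

```python
import urllib.error
import urllib.request


def build_request(url: str, referer: str) -> urllib.request.Request:
    """Build a GET request that carries a spoofed Referer header."""
    return urllib.request.Request(url, headers={"Referer": referer})


def status_for(url: str, referer: str) -> int:
    """Return the HTTP status code the server sends back for this Referer."""
    try:
        with urllib.request.urlopen(build_request(url, referer), timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is what we want to see.
        return error.code
```

Calling `status_for` with your own site’s URL and `"http://semalt.com/"` as the Referer should come back as 403 if the block is working, while the same call with an ordinary Referer should come back as 200. Some hosts also treat scripted requests differently from browsers, so take the result as a hint rather than proof.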
So that’s it!
For another Google Analytics Filter set to get rid of other bots using the “Mozilla Compatible Agent” browser User Agent, check out this Filter from LunaMetrics.