A Tale of Edinburgh University Press and search engines
This weekend I was doing some research for additional content for my Scottish Books site and had occasion to do a Google search for Edinburgh University Press. To my surprise their site didn’t appear in the first page of results, or the second, or the third.
Intrigued, I found a link to it on one of the sites that did rank (it’s http://www.euppublishing.com/) and then viewed the source code (always my first action when I want to check a site’s setup and quality). The first thing I noticed (apart from acres of whitespace) was lang=”en-US” in the html tag – not the best indication, especially for a .com. That gave me an idea, so I went back to Google and clicked on the “web results” link (I had searched UK-only, as I usually do for UK-based queries). Lo and behold, up came the site in the number 1 spot.
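Checking the declared language doesn’t need a manual view-source every time – a minimal sketch of pulling the lang attribute out of a page’s HTML (the sample markup below is illustrative, not quoted from the site):

```python
import re

def html_lang(source: str):
    """Return the lang attribute declared on the <html> tag, or None."""
    match = re.search(r'<html\b[^>]*\blang=["\']([^"\']+)["\']',
                      source, re.IGNORECASE)
    return match.group(1) if match else None

# Illustrative markup of the kind seen on the page in question:
print(html_lang('<html lang="en-US" xmlns="http://www.w3.org/1999/xhtml">'))  # en-US
```

In practice you would feed this the fetched page body; a lang of en-US on a UK publisher’s .com is exactly the sort of weak-but-real signal worth logging across a whole crawl.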
So Google thinks that Edinburgh University Press isn’t a UK site. Could it just be that language setting? Let’s dig a little further. I next activated my Netcraft toolbar – aha, they are on IBM servers in the USA, another poor signal and almost certainly a rather more important one. (I’ve seen many, many .com sites fail to rank in the UK because they are hosted elsewhere.)
Since it doesn’t look as if there has been any SEO done on the site – poor and duplicate title tags and no meta-descriptions – it’s a fair bet that they haven’t got a Webmaster Tools account where they could have told Google the site was UK, although that isn’t the whole solution by any means.
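The on-page problems mentioned above – poor titles and missing meta-descriptions – are the easiest things to check in bulk. A rough sketch of that kind of audit (the helper name and sample markup are mine, not from the site):

```python
import re

def quick_audit(source: str) -> dict:
    """Pull the <title> and meta description out of raw HTML for a quick check."""
    title = re.search(r'<title[^>]*>(.*?)</title>',
                      source, re.IGNORECASE | re.DOTALL)
    desc = re.search(
        r'<meta\s+name=["\']description["\']\s+content=["\']([^"\']*)["\']',
        source, re.IGNORECASE)
    return {
        "title": title.group(1).strip() if title else None,
        "meta_description": desc.group(1) if desc else None,
    }

page = '<html><head><title>Home</title></head><body></body></html>'
print(quick_audit(page))  # {'title': 'Home', 'meta_description': None}
```

Run over every page of a site, duplicate titles show up as repeated values and missing descriptions as None – the two symptoms described here.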
While musing on this situation – one of Scotland’s most important academic publishers not showing up in UK searches – and wondering whether I should try to contact the webmaster about it, I cast around pretty much on SEO autopilot, checking various data. Having seen a robots-noarchive setting on the home page, I checked the robots.txt file:
Oops! It seems they either don’t want to be indexed or are being somewhat badly advised!
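I won’t reproduce their robots.txt here, but a blanket Disallow: / – which is what the “don’t want indexed” reading implies, and is an assumption on my part about the file’s contents – behaves like this under Python’s standard robots.txt parser:

```python
import urllib.robotparser

# Hypothetical robots.txt with a blanket disallow -- assumed for
# illustration, not quoted from the actual file.
rules = """\
User-agent: *
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A compliant crawler should refuse every URL on the host:
print(rp.can_fetch("Googlebot", "http://www.euppublishing.com/"))  # False
```

In other words, any crawler that actually honours the file should fetch nothing at all – which makes what follows interesting.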
Hang on, they were listed in the Google worldwide results…
So how are the search engines handling that? Let’s run a few site: commands:
Bing only lists 1 page with no details (although as usual they can’t count their totals – 2/2 of 150??)
Yahoo only lists 1 page with no details.
Blekko says there are 250 pages but doesn’t list any of them.
Google lists 901!! (and gives another nonsensical total of 45,000) and includes page content in the short descriptions. (At least they aren’t caching it.)
So much for Google obeying robots.txt – seems they make their own minds up (not the first time I’ve seen this)
So the moral of this story is, be careful about your domain name suffix, be careful where you host your site, don’t tell people you speak American when you’re British, and don’t expect Google to follow standards or stay out of your website when you tell it to.