Archive for August, 2007

Paid link controversy and thoughts about ranking algorithms

Thursday, August 30th, 2007

There’s been much reporting of a session at the SES conference in the USA which discussed Google’s stance that paid links should always carry a nofollow tag or risk being penalised. On the expert panel it seems to have been largely a case of Matt Cutts versus the rest, and the audience seems to have been firmly aligned with the anti-Google stance of the rest.

This whole situation not only brings up some questions about how Google can be impartial when they stand to profit from the effects via AdWords, but also raises questions about the whole basis of the main Google algorithm.

A great deal has changed since the original algorithm appeared, based on the conceptually simple but mathematically complex idea of using links as a measure of value. There are now, of course, many factors at work in the algorithm but two very important ones are the true PageRank of sites which link to you, and the anchor text used in those links. In very simplistic terms, the former indicates the strength of the link while the latter indicates the relevance to a particular set of search terms. One well-known SEO commentator, Michael Martinez of SEO Theory, thinks that Google should stop passing the anchor text, and that this would largely solve the paid text link problem. Now Martinez can be rather outspoken and seldom suffers fools (or amateurs pretending to be professionals) but he’s usually worth listening to even if you don’t always end up agreeing with him. So what would the effect be of dropping the link text from the algorithm? Indeed, how easy is it to predict the effects of any change to it?

The algorithm now appears to be so complex that it’s almost like observing a natural world system - and some pretty unpredictable things can happen to those. For instance, if you decide to play god and wipe out an irritating insect then the creatures that feed on it will be affected. Some may themselves be drastically reduced in number while others may be able to switch to another food source, which in turn affects another species. Movements may occur in populations which then allow other movements and changes in other predators and prey. Similar effects can sometimes be seen in search engines - filters designed to get rid of spammy sites can end up affecting perfectly good sites - remember Big Daddy?

So let’s say we remove link text - that will remove a fair degree of relevance, with mainly the subject matter of the two linked pages being left to determine that. If that happens then it will no longer matter as much that links are from related sites, so existing poor links may increase in value and webmasters will be more tempted to follow the “get loads of links from anywhere” route. Another possibility is that PageRank would become relatively more important again so we might see a return to the tedious reciprocal link requests that insist on a minimum PageRank for the link back, as well as those PR calculators for working out how to concentrate PR on pages by manipulating the navigation (usually making the site unusable in the process).
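For anyone who hasn't seen one, the PR calculators mentioned above boil down to something like the sketch below. It uses the simplified textbook form of PageRank with the commonly quoted damping factor of 0.85, not whatever Google actually runs today, and the four-page site in `toy_web` is made up purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank.

    links maps each page name to the list of pages it links to.
    Returns a dict of page name -> score, with scores summing to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its score over every page.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: "orphan" links out but nothing links to it.
toy_web = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home", "about"],
    "orphan": ["home"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:10s} {score:.4f}")
```

Note that anchor text appears nowhere in this calculation: the score depends only on who links to whom, which is why dropping link text would leave raw link strength as the dominant signal.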

This whole area is one that I feel needs discussion and thought by people from different perspectives to have any chance of coming to a sensible conclusion. Anyone else got any thoughts on the likely effects? Do you agree with Martinez or do you think his solution is too simplistic to work? Would Google listen to us anyway or are we at their mercy with their experts playing god with the search results? What sort of search engines do we want to see in the future?

Personal search - what is it for?

Saturday, August 25th, 2007

We’ve had a few weeks now to get used to Universal Search, although the effects still seem minimal on this side of the Atlantic, but for those who stay logged in to their Google accounts (unlike me) what about Personal search? Initially hailed by a host of bandwagon jumpers, there are the beginnings of a suggestion that not everyone is so enamoured anymore. The big question is: what is it for?

Supposedly it learns what you search for and what results you find useful so it can tailor future results to your preferences. But hang on; if you were happy with what you got the first time you searched on a phrase why are you searching again? Surely to try different sites because you didn’t get everything you wanted from the first sites. If you are presented with more of the same you’re likely to find them less useful too. To me it seems as if the only people who will find this useful are those who use search engines as a universal interface - doing the same searches time and again because they can’t remember or don’t bookmark the sites that they like. Yet these could very well be the people who aren’t savvy enough to take out accounts in the first place.

We could end up with a situation where the experts and the net illiterates are the ones who don’t use personal search while the ones in the middle do. But only the experts will get the results they want because they know when to turn off their accounts. All seems a bit Alice Through the Looking Glass, doesn’t it?

In the meantime I’m keeping my account signed out except when I’m using Analytics or Webmaster Tools.

Google privacy concerns go mainstream

Wednesday, August 22nd, 2007

You know that worries about internet security and privacy are becoming widespread when they start appearing in the Metro. For any overseas readers, this is a free UK newspaper given away at rail stations and on buses. While awaiting my cancelled train this morning I read a full-page article which included statements by Chris Hoofnagle of the Electronic Privacy Information Center in the USA and by Kevin Bankston of the Electronic Frontier Foundation. Both highlighted the large amount of personal information held by Google, and that this information gives a substantial picture of your character and beliefs as well as private data. According to the article there are EU suggestions to add privacy warnings to search sites in the style of cigarette health warnings - not something I’d come across before. As well as highlighting Google’s purchase of DoubleClick, it also mentions another purchase, of a genetic profiling company, though whether this is slipped in for dramatic effect is hard to say as no further details are given.

The Google view is presented in such a way that it sounds a bit woolly and unconvincing, and there are numerous mentions of Orwell’s 1984 and Big Brother as well as the KGB and Stasi. (Wonder if they were reading my previous post!) They also mention the recently announced plans to extend search into areas where you can ask very personalised questions such as “what shall I do tomorrow” and the Eric Schmidt comments about being at the very early stages of compiling information.

So is this just a bit of journalistic bandwagon jumping or a sign of the big Google backlash? Regular readers will know that I have reservations on the subject of keeping personal data online and the tracking of online activity. I don’t use Gmail or other online data storage mail systems, and I don’t use online bookmarking. Whether it’s politicians or big business there are too many people I don’t trust to have access to our private lives. Maybe I’m not in such a minority after all.

Boogie on up

Tuesday, August 21st, 2007

Google rankings changed fairly radically again for me on one site this weekend (previously discussed in this post), and this time most of the results went up, with the exception of the one for the most generic keyword phrase, which decided to stay the same for once (fingers crossed for next week). The Edinburgh-based terms all came back too.

Many of the terms are now at or near their highest level so the trends are still upwards, as would be hoped for a site that’s only been around for about a year and a half. It’s just that they have been kinda variable in getting there recently. Is anyone else seeing this sort of major oscillation? Working out whether it’s site specific, datacentre related, or a sign of a more fundamental algorithm tweak is proving difficult on this occasion.

Checking Supplemental pages

Saturday, August 18th, 2007

In an earlier post I spoke about Google’s planned removal of the supplementals tag from their results. They went ahead with it and it caused quite a stir amongst webmasters and SEOs. Various people have looked for alternative ways to discover which of their pages are still in the supplemental index. US SEO firm Bruce Clay have come up with the following query -

-site:www.mysite.com/* site:www.mysite.com/

which so far seems to do the job. Thanks guys, let’s hope Google don’t pull it.

I noticed that the same, or at least very similar, results can be deduced in Webmaster Tools by looking at the Internal Links report. Not all your pages are shown, and if a page isn’t listed in there then it’s a fair bet that it’s a supplemental. On my sites it seems that pages with very few internal links are still ok if they have a link from the home page. If they only have links from subsidiary pages and no external inbound links, though, they won’t have enough PageRank and will fall into the “Dungeons of Doom” (cue maniacal laughter), where their chances of ranking for anything will be poor. That seems to match what we know so far of the reasons for supplementals.

Now why can’t it all just depend on quality?

The Search for Spock

Thursday, August 16th, 2007

If you’re of a paranoid disposition, or maybe even just mildly suspicious of Big Brother tactics, then the news that Spock intends to build a profile of 6 billion earthlings will have set your antennas twitching. No, not the supremely logical Vulcan first officer / ambassador from the 23rd century but a new search engine of the 21st.

Spock is intended to do for people-searching what Google does for general search. Their intention is to trawl the social sites such as MySpace, Facebook, etc. to build up detailed information on people all over the world. CIA eat your heart out! For those of us old enough to remember when privacy, and the ability to walk down the street without being photographed by a battery of CCTV cameras, were things we took for granted and which made us different from the Eastern Bloc with their KGB and Stasi, this has worrying overtones. Even if you don’t keep your email online with the likes of Gmail, or your photos on Flickr, or your bookmarks on Del.icio.us, you’ve probably still joined enough discussion groups, or posted on Usenet, or commented on blogs for a pretty big dossier to be put together by anyone, be they press reporter or politician, looking for an easy story or scapegoat.

Made a left-wing political comment in your teens? A non-pc comment during drunken online banter? Joined a swingers group? It’s all potentially retrievable, and free to be twisted by anyone with an axe to grind. Of course it could be argued that much of this is retrievable already, but such a dedicated search system is bound to take on new and potentially more invasive methods in order to differentiate it from the competition. Will the social media sites be able to opt out of this process if their members demand it? Will the Spock spider pay attention to robots.txt?
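On the robots.txt question, here is a rough sketch of what an opt-out would look like if the spider did play by the rules. It uses Python's standard `urllib.robotparser` to check URLs against a sample robots.txt. The user-agent name `SpockBot`, the `/members/` path and the example.com URLs are all invented for illustration; I have no idea what Spock's crawler actually calls itself or whether it honours any of this.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt a social site might publish. "SpockBot" is a made-up
# user-agent name used only for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: SpockBot
Disallow: /members/

User-agent: *
Disallow:
"""


def build_parser(robots_txt):
    """Parse robots.txt text held in a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A well-behaved crawler would make this check before fetching each URL.
for agent, url in [
    ("SpockBot", "http://example.com/members/alice"),
    ("SpockBot", "http://example.com/about"),
    ("OtherBot", "http://example.com/members/alice"),
]:
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent:9s} {url:35s} {verdict}")
```

The catch is that robots.txt is only a request: nothing in it stops a spider that chooses to ignore the file, which is exactly why the question matters.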

So how happy would you be with total availability of all your online activities? Expect a rise in popularity of anonymous proxy surfing facilities and a renewed use of PGP to encrypt emails. Me, I’ll be polishing up my firewalls and being very careful with the trails I leave.