SpiderFriendly.co.uk

Google Guidelines: What I Would Change If I Could

Irina Ponomareva

Category: Internet and SEO: Rambling Thoughts

With all the latest talk about [1] Google Webmaster Guidelines, their real purposes, and the ways to improve them, I had no option but to re-read them (after a very long break).

I've never felt tied down by any guidelines introduced by the search engines; at the very beginning of my SEO career I read them briefly to get the idea, then forgot about them for months and months. The point is, whenever people mention the guidelines, they most likely mean the "Quality Guidelines" part, not the technical part. Many view the guidelines as the only trusted source of information on what the engines approve of and what they consider search engine spam. I adhere to another source, called "common sense and logic", and have my own personal guidelines, which are strict enough to keep me safe from anything dubious or risky.

After all, those guidelines are for webmasters, and not all webmasters are necessarily SEOs. Those who don't do SEO for a living may need the guidelines to avoid making dangerous mistakes out of ignorance, and those SEOs who like testing the limits might benefit from re-reading them regularly, to stop themselves pushing too far. But why would I need to read them? There is absolutely no reason to.

Or so I thought.

An overlooked part

Yet there was a very good reason to read the guidelines, one I had completely overlooked: the SEO forum circuit, that weird and wonderful phenomenon of our modern Internet-obsessed society. SEOs of all hat colours just love to discuss the Google guidelines, and being an integral part of this restless community, I had no option but to re-read the page causing so much noise, so that I could join all those hot discussions.

And they were indeed hot.

- "When morality is equated to compliance with SE guidelines, common sense and logic are notable absentees."
- "Search engine guidelines are for dorks."
- "Search engine guidelines are for newbies, not SEOs."
- "I'm not an SEO, just a newbie, and I achieved great results by being compliant to Google guidelines."
- "Google guidelines could have been better written. The way they are now, they are not clear."

Etc., etc., etc. ... And unfortunately, as often happens in online communities, the reasonable points and suggestions eventually drowned in drivel and personal insults, and the threads ended up locked. But the question of how we could all improve Google's guidelines (and everyone else's) remained unanswered.

Looking at the same page again

Now, as renewed attempts are being made by webmasters and SEO professionals to sum up a Best Practices approach to the profession, I'm re-reading [2] the same Google page again. Yes, I have no doubt these lines could have been better written, but for the most part I don't think it's critical to change them right now. At least I can understand what the folks at Google are trying to tell me, although I'm not sure all webmasters will (take the "Don't employ cloaking" part, for example). Considering that our industry still can't agree upon an exact definition of cloaking, what should we expect from a newbie site owner who is just taking his or her first steps in learning HTML?

I like the "Technical Guidelines" part, especially the recommendation to support the "If-Modified-Since" HTTP header and the link to the page describing the robots.txt standard - [3] The Web Robots FAQ... I have to laugh at the recommendation to submit your site to Google once it is ready, as we all know it's a totally obsolete measure that does nothing. But I could live with all this.
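In fact, the If-Modified-Since recommendation is one of the few genuinely practical bits on that page, so here is a minimal sketch of how a webmaster might check whether their own server supports it. It is Python, http://example.com/ stands in for your own URL, and it simply repeats a request with the Last-Modified value the server first returned - a well-behaved server answers 304 Not Modified the second time:

    import urllib.request
    import urllib.error

    URL = "http://example.com/"  # placeholder: substitute your own site

    # First request: note the Last-Modified header the server sends back.
    first = urllib.request.urlopen(URL)
    last_modified = first.headers.get("Last-Modified")
    print("First request:", first.status, "| Last-Modified:", last_modified)

    # Second request: send that value back as If-Modified-Since.
    # A server that supports conditional GETs answers 304 Not Modified,
    # sparing crawlers the cost of re-downloading an unchanged page.
    # (urllib reports a 304 as an HTTPError, so we catch it.)
    if last_modified:
        req = urllib.request.Request(
            URL, headers={"If-Modified-Since": last_modified}
        )
        try:
            second = urllib.request.urlopen(req)
            print("Second request:", second.status, "- header apparently ignored")
        except urllib.error.HTTPError as err:
            if err.code == 304:
                print("304 Not Modified - conditional GET supported")
            else:
                raise
    else:
        print("No Last-Modified header - nothing to test against")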

This tiny bit...

But there is one tiny bit that I can't live with. No way.

Part of the "Quality Guidelines - Basic principles" part of the page says:

"Don't participate in link schemes designed to increase your site's ranking or PageRank. In particular, avoid links to web spammers or "bad neighborhoods" on the web, as your own ranking may be affected adversely by those links."

Let's stop here. No, I'm not questioning the validity of the statement. But if there is one thing in the web world that is totally unclear, it's this recommendation, given to webmasters by the most popular search engine in existence today. And I'd like to stress this again: it is unclear to all webmasters, not just to SEO gurus.

"Don't participate in link schemes" is written as if everyone knows what "link schemes" are. Many webmasters are sure to be confused by this advice, though the smartest are sure to figure out quickly that a link farm (where you link to everyone here, and everyone will link to you automatically) or a pyramid (where all sites from group A link to sites from group B, then all sites from group B link to sites from group C, then all sites from C link to D, and site D receives the highest PageRank) are quite likely to fall under "link schemes". But what about a simple link exchange? Is this a scheme?

Worse still is the advice to "avoid links to web spammers". Dear Google, I would gladly avoid them, really! Moreover, the day when each and every spammer is banned from all existing search engines will be the happiest of my life, and if you need my help to bring that day closer, just contact me and tell me what to do. But if you - with all your smartest programmers and bottomless resources - can't neutralise every variation of spam in existence, do you really expect each and every site owner to be able to detect spammy sites? The majority of ordinary webmasters still think the word "spam" means unsolicited email only, and have no idea that search engine spam exists. And now you are telling them to avoid linking to web spammers, adding that "your own ranking may be affected adversely by those links", thus creating more panic.

We all know that even the most experienced SEOs sometimes fail to detect spam on a site, so skilful have certain spammers become at hiding their tricks. Ordinary webmasters are left totally helpless in this situation, sandwiched between Google and the spammers: desperate, lost and abandoned by their friends.

Define it!

First of all, a comprehensive definition of "web spammers" should be given within the guidelines, along with basic advice on how to detect the commonest sorts of spam. For instance:

- "Use Ctrl+A to highlight all content on the page, and certain types of hidden text will become visible. Look at the source code of the page to find the other types of hidden text and links: if you see text in the code that doesn't render in your browser, the chances are it is there to deceive the search engines."
- "Try checking Google's cached version of the page; if it differs significantly from what you see, the site is probably using cloaking."
- "Too many links at the bottom of a page pointing to other domains indicate that the site is heavily cross-linked with other sites to artificially inflate its link popularity."

These are just a few of the explanations Google could be giving to webmasters who are trying to do everything the right way and comply with the guidelines.
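To make the first of those tips concrete, here is a crude sketch of such a hidden-text check. It is Python, the regex patterns and the placeholder URL http://example.com/ are my own illustrative assumptions, and it catches only the clumsiest inline-style tricks - real spam detection is far subtler:

    import re
    import urllib.request

    URL = "http://example.com/"  # placeholder: page you want to inspect

    # Inline styles that commonly hide text from human visitors while
    # leaving it in the source for crawlers to read. A crude, incomplete
    # heuristic - it says nothing about external stylesheets or scripts.
    HIDDEN_PATTERNS = [
        r"display\s*:\s*none",
        r"visibility\s*:\s*hidden",
        r"text-indent\s*:\s*-\d{3,}px",   # text pushed far off-screen
        r"font-size\s*:\s*0",
    ]

    html = urllib.request.urlopen(URL).read().decode("utf-8", "replace")

    for pattern in HIDDEN_PATTERNS:
        for match in re.finditer(pattern, html, re.IGNORECASE):
            start = max(match.start() - 40, 0)
            snippet = html[start:match.end() + 40].replace("\n", " ")
            print("suspicious style: ..." + snippet + "...")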

Give us the tools!

While asking us to avoid linking to spammy sites, Google might have helped us detect them instead of making the task harder. For instance, the link: command is now broken and shows only a tiny fraction of the pages that link to a site; according to Google, this was done to cure us of our backlink obsession, but at the same time it makes it much harder to spot dubious link patterns. We have to use MSN to investigate backlinks instead, but I can't see Google adding that recommendation to their guidelines. After all, the engines compete with each other.

And why not mark banned sites with a special sign in the toolbar, for those who don't know how to detect them? PR0 is certainly not enough, because it can also mean a new site.

How much is too much?

Nearly all sites that link out are in danger of running into at least one bad neighbourhood sooner or later, for the reasons I described above. If the engines considered that sufficient reason for a ranking penalty, the only result would be a Net without links (not really a Net anymore, but a useless bunch of orphaned sites), and nobody would be interested in that. So the cases in which linking to bad neighbourhoods can cause a penalty should be categorised and described within the guidelines, to avoid unnecessary panic. Is a penalty triggered only when more than a certain percentage of all outbound links are spammy? If so, what percentage exactly? Does placing the dubious link site-wide change the threshold? Do sites get penalised for linking to sites containing well-hidden spam? Does a site that is spamming but hasn't yet been caught count as a bad neighbourhood?

I strongly believe that all these questions need to be answered, and soon; otherwise the best part of the Net will collapse.

The responsibilities

While Google and the other engines are responsible for keeping their guidelines clear and concise enough to be easily understood and followed, SEO/SEM practitioners are responsible for the ethics of the industry and for industry guidelines accepted by at least the better part of our professional community. Those industry guidelines have yet to be written, and if and when they are ready, it is very important that the engines approve of them, which means they should run along basically the same lines as the SE guidelines themselves.

Of course, the spammers will refuse to comply with the industry guidelines, just as they now refuse to obey the engines' guidelines. But that is no reason not to have them. The time has come for them to appear.

This article is my attempt to contribute to our common task. Whether the suggestions I have given are accepted or rejected, I hope they will at least spark some good brainstorming and, at the end of the day, result in something useful. Because if we don't do the work, nobody will do it for us.

Links in this article:

[1] Google Webmaster Guidelines: http://www.google.com/intl/en/webmasters/guidelines.html
[2] the same Google page: http://www.google.com/intl/en/webmasters/guidelines.html
[3] The Web Robots FAQ...: http://www.robotstxt.org/wc/faq.html
