“…Google will determine…”? Not on my site!

In the StomperNet forums today I responded to a member who noticed a Google post here. My acidic response is reproduced below.

That was the most useless, vague, non-actionable and *irresponsible* post I have EVER seen from Google. It looks like something from WebmasterWorld or the Warrior Forum. The examples used are just plain stupid, and the sweeping generalization they make about Google somehow figuring out URL parameters is dangerously silly.

  1. No one would consider rewriting a (so-called) dynamic URL into a "static" one while retaining the session id. I mean DUH! If you are smart enough to even be able to enable mod_rewrite, how could you not know to turn off session ids when serving content to bots? It is a ridiculous example that serves to paint all rewriting as somehow dangerous. Worse still, why would anyone rewrite like the example shown? That’s plain stupid.
  2. "… Google will determine which parameters can be removed …" You have got to be Sh*t**g me! Is there anyone who can spell S-E-O who would like to simply trust Google to "determine" which URLs should be the same and which should be different?? Not me, thanks. My site. I’ll decide. If they get it wrong, you get flagged with widespread duplicate content and they don’t tell you about it.
  3. They leave completely unanswered the OBVIOUS (just look at the SERPs) problems they have today with session ids. Not so good at "determining" after all, eh? At every single StomperNet Live event we’ve held, I have reviewed at least one site that had pages indexed at Google showing multiple different session id values. This is a widespread problem for sites that serve session ids to bots. For Google to publicly post about "dynamic" URLs and sweep this under the rug, while vaguely claiming to handle it, borders on misrepresentation.
  4. They also don’t say a damn thing about parameter order, another place they fail COMPLETELY to "determine". Example: p1=v1&p2=v2 leads to the same content as p2=v2&p1=v1, and this is a REQUIREMENT of the HTTP spec (named parameters are NOT positional, so they may appear in any order), but Google treats these as different URLs and will ignorantly and incorrectly index both as different pages. This problem appears in several CMSs today; Endeca in particular has it bad.
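
If you would rather decide for yourself, the two fixes above (drop session ids, pin down parameter order) can be sketched in a few lines. This is only a rough illustration, not anything Google or any particular CMS provides: the function name `canonical_url` and the list of session parameter names are my own assumptions, and it presumes your application really does treat parameter order as insignificant.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names; replace with whatever your platform emits.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def canonical_url(url):
    """Return one canonical form of a URL: no session id, parameters sorted by name."""
    parts = urlsplit(url)
    # Split the query string into (name, value) pairs, keeping empty values.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Drop anything that looks like a session id (names compared case-insensitively).
    kept = [(name, value) for name, value in pairs
            if name.lower() not in SESSION_PARAMS]
    # Stable sort by name only, so repeated parameters keep their relative order.
    kept.sort(key=lambda pair: pair[0])
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))


print(canonical_url("http://example.com/item?p2=v2&p1=v1&sid=abc123"))
# → http://example.com/item?p1=v1&p2=v2
```

One way to use something like this is to compare each incoming request URL with its canonical form and 301-redirect when they differ, so only one version of each page is ever available to be indexed.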
