It was Google that told me how to build OptiLink

In our last episode, John Heard told me to go read the original Google paper. I did. Over and over. And “the secret” (to ranking, not the movie!) was totally plain and obvious, well, at least to me.

These guys are programmers.

So am I. We literally speak in code, and what they said in their technical description of Google told me precisely what to build! [Thanks guys!]

In fact, OptiLink is not all I discovered that day, but we’ll get to that later. For now, let me show you by example the thought process that launched my career in SEO and has sustained me ever since.

Programming is a practice of linguistic precision. Words must mean very specific things to us because that is what they mean to the very fast, but equally stupid, machines that we code.

In almost all cases, people view us as “pedantic” – I mean what really is the difference if you say domain, URL, link, anchor, link text, back-link, whatever … humans will know what you mean from context, right?

Not so the machine.

So now let’s read that paper those grad students wrote with a programmer’s precision.

The phrase “anchor text” is used 13 times and “link text” is mentioned 5 times. Here are some of the best bits from my original frantic markup:

  • “This idea of propagating anchor text to the page it refers to…”
  • “…anchor text can help provide better quality results.”
  • “Aside from PageRank and the use of anchor text, Google has several other features.” [other means minor IMHO]
  • “There are two types of hits … Fancy hits include … anchor text … Plain hits include everything else”

I could go on, but this snippet is the real crusher:

“Google considers each hit to be one of several different types (title, anchor, URL, plain text large font, plain text small font, …), each of which has its own type-weight. The type-weights make up a vector indexed by type. Google counts the number of hits of each type in the hit list. Then every count is converted into a count-weight. Count-weights increase linearly with counts at first but quickly taper off so that more than a certain count will not help. We take the dot product of the vector of count-weights with the vector of type-weights to compute an IR score for the document. Finally, the IR score is combined with PageRank to give a final rank to the document.”

This was their entire algorithm. If you don’t understand all of it, that’s okay, but aren’t the key factors absolutely clear? They were to me, and they fueled everything I did in SEO for about 7 years.
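For the programmers in the audience, here is roughly how I read that passage, as a minimal Python sketch. The paper never publishes its numbers, so every weight below, the cap in count_weight, and the weighted sum in final_rank are my own made-up placeholders, not Google's values or method. The hard cap is also just the simplest stand-in for "linear at first, then tapers off".

```python
# Hypothetical type-weights: the paper says each hit type has one,
# but does not give the values. These are illustrative only.
TYPE_WEIGHTS = {
    "title": 5.0,
    "anchor": 4.0,
    "url": 3.0,
    "large_font": 2.0,
    "small_font": 1.0,
}

COUNT_CAP = 8  # hypothetical point past which more hits stop helping


def count_weight(count, cap=COUNT_CAP):
    """Grow linearly with the hit count, then go flat at the cap."""
    return float(min(count, cap))


def ir_score(hit_counts):
    """Dot product of the count-weight vector and the type-weight vector."""
    return sum(
        TYPE_WEIGHTS[hit_type] * count_weight(hit_counts.get(hit_type, 0))
        for hit_type in TYPE_WEIGHTS
    )


def final_rank(hit_counts, pagerank, alpha=0.5):
    """Combine the IR score with PageRank.

    The paper only says the two are "combined"; a weighted sum with a
    hypothetical mixing factor alpha is an assumption made here.
    """
    return alpha * ir_score(hit_counts) + (1 - alpha) * pagerank


# A page with nothing but three anchor hits still gets a real score.
print(ir_score({"anchor": 3}))  # 12.0
```

Notice two things in the sketch: a page with zero on-page hits still scores if it has anchor hits, and piling on more of the same hit type stops paying off once the count passes the cap.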

More than that, they gave a real example of how to rank with ONLY link text. It’s in section 5:

“Notice that there is no title for the first result. This is because it was not crawled. Instead, Google relied on anchor text to determine this was a good answer to the query. Similarly, the fifth result is an email address which, of course, is not crawlable. It is also a result of anchor text.”

 OMG! I can rank with anchor text alone!
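To make that concrete, here is a small Python sketch of the idea of "propagating anchor text to the page it refers to". It is my illustration, not code from the paper: the function names, the tuple layout, and the example URLs are all hypothetical, and the scoring is a bare hit count with none of the weighting a real engine would apply.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Build word -> {target_url: hit count} from link data.

    links is an iterable of (source_url, target_url, anchor_text).
    Each anchor word is credited to the TARGET of the link, so a page
    (or an email address) can be indexed without ever being crawled.
    """
    index = defaultdict(lambda: defaultdict(int))
    for _source, target, text in links:
        for word in text.lower().split():
            index[word][target] += 1
    return index


def search(index, query):
    """Return targets ordered by total anchor hits for the query words."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for target, count in index.get(word, {}).items():
            scores[target] += count
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical link data: neither target has any crawled content.
links = [
    ("http://a.example/", "http://never-crawled.example/", "blue widgets"),
    ("http://b.example/", "http://never-crawled.example/", "cheap blue widgets"),
    ("http://c.example/", "mailto:sales@widgets.example", "widgets sales"),
]
index = build_anchor_index(links)
print(search(index, "blue widgets"))
```

Both results come purely from what other pages say in their links, which is exactly the situation the paper describes for its first and fifth results.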

It was right there, plain as day, but no one was teaching this and there were no tools to analyze anchor text.

I couldn’t believe it. What a huge opportunity. But I was scared. What if someone else got to market first? How fast could I get a product to market? And what if I got it all done and it didn’t work the way the guys said it did? It was a risk, but I had to try, and fast!

From start to finish it took me a brutal 6 months. Tomorrow I’ll tell you the lessons I learned in the process and what I’d change if I did it again.
