So, a friend pointed out today that when you search for “Rolling Stones” on Google, you may not get the results you were expecting:
Hmm, I wonder what the top result is for “Ninja Turtles”. Let’s see:
Yup, that Garfield cat sure is popular. For completeness’ sake, I’ll point out now that searches for both “Teenage Mutant Ninja Turtles” and even “Ninja Stones” also show Garfield.com as the top hit. If you find more, let me know. In the meantime, I’m really curious as to why this is happening, so let’s try to figure that out.
So, why does Google think Garfield.com and NinjaTurtles.com are the same site? Well, might they be hosted on the same machine?
$ host www.ninjaturtles.com
www.ninjaturtles.com has address 220.127.116.11
$ host www.garfield.com
www.garfield.com is an alias for ucomics.com.
ucomics.com has address 18.104.22.168
ucomics.com has address 22.214.171.124
ucomics.com has address 126.96.36.199
ucomics.com has address 188.8.131.52
ucomics.com has address 184.108.40.206
ucomics.com has address 220.127.116.11
ucomics.com has address 18.104.22.168
ucomics.com has address 22.214.171.124
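Eyeballing address lists like these is error-prone, so here is a minimal sketch of the same check in Python. The function names resolve and shared_addresses are my own, made up for illustration, and resolve only reports whatever IPv4 addresses your local resolver hands back at that moment, so results can differ from the host output above.

```python
import socket


def resolve(hostname):
    """Return the set of IPv4 addresses the local resolver reports for hostname.

    This needs network access and follows CNAME aliases the same way
    the resolver does.
    """
    _canonical, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return set(addresses)


def shared_addresses(ips_a, ips_b):
    """Return the addresses present in both collections, sorted as text."""
    return sorted(set(ips_a) & set(ips_b))
```

Calling shared_addresses(resolve("www.ninjaturtles.com"), resolve("www.garfield.com")) returns the list of addresses both names resolve to. An empty list suggests the sites are not sharing a machine, at least not at the DNS level.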
Nope, they don’t appear to be on the same machine, and neither is anything related to the Rolling Stones. Okay, just for fun, let’s look at the robots.txt file on Garfield.com:
# robots.txt for http://garftest.uclick.com/
User-agent: *
Dissallow: /
That’s weird. First of all, they misspelled “disallow” as “dissallow”. A quick check of other sites with the same misspelling in their robots.txt files shows that Google ignores the misspelled directive. So, the file should have no effect on indexing.
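To see how a parser can end up ignoring that line, here is a small sketch using Python’s standard urllib.robotparser module. This only shows how that one parser treats an unrecognized directive; it says nothing definitive about Googlebot’s own parser. The helper name allowed_by and the test URL are my own choices for illustration.

```python
from urllib.robotparser import RobotFileParser

# The file as served, with the directive misspelled.
MISSPELLED = """\
# robots.txt for http://garftest.uclick.com/
User-agent: *
Dissallow: /
"""

# The same file with the directive spelled correctly, for comparison.
CORRECTED = MISSPELLED.replace("Dissallow", "Disallow")


def allowed_by(robots_txt, url="http://www.garfield.com/", agent="*"):
    """Return True if this robots.txt text lets the given agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(allowed_by(MISSPELLED))  # unknown directive is skipped, so no rule applies
print(allowed_by(CORRECTED))   # "Disallow: /" blocks the whole site
```

With the misspelling, the parser never sees a rule for the wildcard user agent, so everything stays fetchable. Fix the spelling and the whole site is blocked.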
Maybe Google is using a hash of the domain name to identify identical sites? Trying 16 common hashing algorithms (MD5, RIPEMD-160, Tiger, SHA-1, SHA-256, Adler32, CRC32, etc.) on various permutations of the domain names, with and without punctuation, I didn’t find any collisions. That doesn’t rule out hash collisions using different (say, homegrown) hashing algorithms, but I’ll move on for now.
Next, scanning the results returned for Garfield on various search engines, I don’t see any evidence of Googlebombing the terms “Garfield”, “Rolling Stones”, or “Teenage Mutant Ninja Turtles”.
Okay, that’s a stretch, but I’m stumped. Anyone else have any ideas?