electro acoustic expressionism
nodepet
January 7th, 2008

Yahoo loves kurzfilm and can’t let it go…

Filed under: Web — olliver @ 22:01 h

This romance started back in December:
One fine day, shortly before Christmas, I noticed some weird requests from Yahoo’s crawler:

74.6.28.164 - - [23/Dec/2007:08:25:16 +0100] "GET /kurzfilm/ HTTP/1.0" 301 307 "-"
"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
74.6.28.164 - - [23/Dec/2007:08:25:17 +0100] "GET /kurzfilm/ HTTP/1.0" 404 275 "-"
"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

Of course there was no directory called “kurzfilm” on my server, but apparently some stale link was pointing to it, and Yahoo was checking whether there was something new to discover. If you look closely, you will spot a 301 redirect before the actual 404 response. That is because I use mod_rewrite to redirect any request that does not use “www.mydomain.com” as its host to that host first, to ensure that only this version of my domains appears in search results. After some research I was even able to locate the origin: a Dutch website used to link to the directory using the server’s IP address. Unfortunately, this was a fatal mistake, as Yahoo is now requesting this nonexistent “kurzfilm” directory over and over again.
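The two-step pattern in the log (301 first, then 404) can be sketched in a few lines of Python. This is only an illustration of the logic, not the actual mod_rewrite rule, which is not shown here; the function name `respond`, the `existing_paths` set and the example IP address are assumptions made up for the sketch.

```python
CANONICAL_HOST = "www.mydomain.com"  # placeholder host name, as used in the post


def respond(host, path, existing_paths):
    """Return (status, location) for a request, mimicking the two-step behaviour.

    Hypothetical helper: the real server does this with mod_rewrite.
    """
    if host != CANONICAL_HOST:
        # Any other host (for example the bare IP address) is redirected first.
        return 301, f"http://{CANONICAL_HOST}{path}"
    if path not in existing_paths:
        # On the canonical host, a missing directory is a plain 404.
        return 404, None
    return 200, None


if __name__ == "__main__":
    pages = {"/"}  # assumed: only the front page exists
    # First request, addressed to the IP (documentation address used here).
    print(respond("192.0.2.1", "/kurzfilm/", pages))
    # Follow-up request to the canonical host.
    print(respond(CANONICAL_HOST, "/kurzfilm/", pages))
```

A crawler that honours the 301 would remember the canonical host and skip the first step on later visits.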

Google behaves differently in that regard: once it cannot find anything at a URL, it soon discards that URL and moves on. It also obeys 301 (Moved Permanently) redirects and drops the previous address after a while. But Yahoo?

Yahoo loves “kurzfilm” in the morning:

74.6.28.164 - - [06/Jan/2008:09:08:59 +0100] "GET /kurzfilm/ HTTP/1.0" 301 236 "-"
"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
74.6.28.164 - - [06/Jan/2008:09:09:01 +0100] "GET /kurzfilm/ HTTP/1.0" 404 5883 "-"
"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

And “kurzfilm” in the evening:

74.6.28.164 - - [06/Jan/2008:20:02:50 +0100] "GET /kurzfilm/ HTTP/1.0" 301 236 "-"
"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
74.6.28.164 - - [06/Jan/2008:20:02:52 +0100] "GET /kurzfilm/ HTTP/1.0" 404 5883 "-"
"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

Did you notice the 301 redirect? It means Yahoo is still using the server’s IP address for its requests, even though a 301 redirect should normally signal that future requests go permanently to the new destination. But then it would not be Yahoo, so I expect to find “kurzfilm” in my log files around this time next year, too. Maybe I should create this “kurzfilm” directory already, just for Yahoo: the “kurzfilm” search engine #1 :-).
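To keep an eye on this without reading the logs by hand, a small script can count the crawler’s requests per day and status code. The following is a minimal sketch, assuming Apache’s combined log format with each entry on a single line; the regular expression and the function name `count_hits` are illustrative choices, and the sample lines are the ones quoted above.

```python
import re
from collections import Counter

# Matches the leading fields of an Apache common/combined log line.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

UA = '"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"'

# The four entries from 6 January, joined back into single lines.
SAMPLE = [
    '74.6.28.164 - - [06/Jan/2008:09:08:59 +0100] "GET /kurzfilm/ HTTP/1.0" 301 236 "-" ' + UA,
    '74.6.28.164 - - [06/Jan/2008:09:09:01 +0100] "GET /kurzfilm/ HTTP/1.0" 404 5883 "-" ' + UA,
    '74.6.28.164 - - [06/Jan/2008:20:02:50 +0100] "GET /kurzfilm/ HTTP/1.0" 301 236 "-" ' + UA,
    '74.6.28.164 - - [06/Jan/2008:20:02:52 +0100] "GET /kurzfilm/ HTTP/1.0" 404 5883 "-" ' + UA,
]


def count_hits(lines, path, agent="Yahoo! Slurp"):
    """Count requests for `path` by the given user agent, keyed by (day, status)."""
    hits = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("path") == path and agent in line:
            hits[(m.group("day"), int(m.group("status")))] += 1
    return hits


if __name__ == "__main__":
    for (day, status), n in sorted(count_hits(SAMPLE, "/kurzfilm/").items()):
        print(day, status, n)
```

As long as the 301 count for a day stays above zero, the crawler is still asking for the old IP-based address instead of the canonical host.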
