electro acoustic expressionism
nodepet
January 23rd, 2008

Harvester of Sorrow from Quebec

Filed under: Spam — olliver @ 13:13 h

Last night someone (ab)using a cable connection in Quebec left a long trail of entries in my web server logs, because the harvester’s link extraction mechanism was broken and produced lots of erroneous requests. The user agents were variants of Internet Explorer strings, some of them apparently truncated (maybe they exceeded the maximum length allowed in the spamware) and thus easy to distinguish from genuine visitors. François had encountered these visits before me, as you can read in this forum thread.

Here is a sample from my logs to give an idea of what these requests look like:

24.200.17.2 - - [23/Jan/2008:03:08:35 +0100]
"GET / HTTP/1.1" 200 6547 "-"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
24.200.17.2 - - [23/Jan/2008:03:08:37 +0100]
"GET / HTTP/1.1" 200 37342 "-"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET"
24.200.17.2 - - [23/Jan/2008:03:08:39 +0100]
"GET /category/misc/ HTTP/1.1" 200 12797 "-"
"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1"
24.200.17.2 - - [23/Jan/2008:03:08:40 +0100]
"GET /feed/ HTTP/1.1" 200 33547 "-"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET"
24.200.17.2 - - [23/Jan/2008:03:08:42 +0100]
"GET /2007/12/ HTTP/1.1" 200 36766 "-"
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
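
The pattern in this sample can also be checked mechanically. What follows is a minimal Python sketch, assuming log entries in Apache's combined format with one entry per line (the sample above is wrapped for display). The helper names `summarize_agents` and `looks_truncated` are made up for illustration, and the "missing closing parenthesis" test is only a heuristic.

```python
import re
from collections import defaultdict

# Combined log format: host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def summarize_agents(lines):
    """Map each client address to the set of user agents it sent."""
    agents = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # lines that are not in combined format are skipped
            agents[match.group("host")].add(match.group("agent"))
    return agents

def looks_truncated(agent):
    """Heuristic: a complete IE string ends with a closing parenthesis."""
    return agent.startswith("Mozilla/4.0 (") and not agent.endswith(")")

# Three entries from the sample above, rejoined into single lines.
sample = [
    '24.200.17.2 - - [23/Jan/2008:03:08:35 +0100] "GET / HTTP/1.1" 200 6547 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"',
    '24.200.17.2 - - [23/Jan/2008:03:08:37 +0100] "GET / HTTP/1.1" 200 37342 "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET"',
    '24.200.17.2 - - [23/Jan/2008:03:08:39 +0100] "GET /category/misc/ HTTP/1.1" 200 12797 "-" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1"',
]

for host, seen in summarize_agents(sample).items():
    truncated = [a for a in seen if looks_truncated(a)]
    print(f"{host}: {len(seen)} distinct agents, {len(truncated)} truncated")
```

A single address sending several different agent strings within a few seconds, some of them cut off, is the signature described here.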

Note the randomly changing user agent. Whois reveals the following about 24.200.17.2 (modemcable002.17-200-24.mc.videotron.ca):

CustName:   Videotron Ltee
Address:    300 Viger Est
City:       Montreal
StateProv:  QC
PostalCode: H2X-3W4
Country:    CA
RegDate:    2006-06-28
Updated:    2006-06-28

NetRange:   24.200.17.0 - 24.200.17.255
CIDR:       24.200.17.0/24
NetName:    VL-D-MM-18C81100
NetHandle:  NET-24-200-17-0-1
Parent:     NET-24-200-0-0-1
NetType:    Reassigned
Comment:
RegDate:    2006-06-28
Updated:    2006-06-28

I am not certain about the origin, though. It is possible that these requests came from the actual spammer. But this might just as well be some zombified Windows machine on autopilot, which is now part of a botnet and serves as a SOCKS proxy for those who have good reasons to conceal their identity. Neither the IP addresses in François’ weblog nor the one here return any search results in Google, which one would expect for known sources of abuse. Often this indicates an individual or outfit that uses a botnet with “fresh” proxies. Sometimes this is done automatically by malware that includes a scan and spam engine as payload.

As a workaround, I added the following SetEnvIf rule to my list of fake IE user agents:

SetEnvIfNoCase User-Agent "\.NET( CLR( [0-9.]{1,9})?)?$" block
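
To see what this pattern catches, it can be tried against the user agents from the sample. The Python snippet below is a rough check, assuming that Python's `re` module with `re.IGNORECASE` treats this particular expression the same way Apache's regex engine does under SetEnvIfNoCase. The last agent string is a typical complete IE 6 string added for comparison; it does not come from the log, and the function name `agent_is_blocked` is made up.

```python
import re

# Same expression as in the SetEnvIfNoCase rule; IGNORECASE mirrors "NoCase".
TRUNCATED_DOTNET = re.compile(r"\.NET( CLR( [0-9.]{1,9})?)?$", re.IGNORECASE)

def agent_is_blocked(agent):
    """True if the agent string ends in a cut-off .NET token."""
    return TRUNCATED_DOTNET.search(agent) is not None

agents = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET",
    "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1",
    # Assumed example of a complete string, not taken from the log:
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)",
]

for agent in agents:
    print("block" if agent_is_blocked(agent) else "pass ", agent)
```

Only the two truncated strings match, because a complete string ends with a closing parenthesis after the version number. The untruncated agent from the sample passes this rule as well, so it has to be caught by other means.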

The spamware’s broken link extraction is good for another rule:

SetEnvIfNoCase Request_Uri "//|%20title=%22" block

The “//” check on request URIs also eliminates a lot of probes for vulnerable scripts that originate from compromised servers. In case you are worried about ordinary HTTP requests being blocked as a result of this rule, rest assured that the protocol string http:// is not part of the Request_URI attribute, which holds only the path portion of the URL.
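
The request rule can be tried out in the same way. In this Python sketch, `example.org` is a placeholder host, the two blocked paths are made-up examples of broken links (only `/category/misc/` comes from the log sample), and the URI is assumed to be compared in its raw, still percent-encoded form.

```python
import re
from urllib.parse import urlsplit

# Same expression as in the Request_URI rule; IGNORECASE mirrors "NoCase".
BROKEN_LINK = re.compile(r"//|%20title=%22", re.IGNORECASE)

def uri_is_blocked(path):
    """True if the path contains a double slash or a stray title attribute."""
    return BROKEN_LINK.search(path) is not None

# Only the path part of a URL is tested, so the "//" after "http:" never shows up.
path_only = urlsplit("http://example.org/category/misc/").path

for path in (
    path_only,                          # "/category/misc/"
    "/2007/12//feed/",                  # hypothetical broken link with a double slash
    "/2007/12/%20title=%22Permanent",   # hypothetical link with a swallowed title attribute
):
    print("block" if uri_is_blocked(path) else "pass ", path)
```

The regular path passes, while both malformed paths are flagged.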

1 Comment »

  1. Hello Olliver,
    Back on this old post about the Google “injections” in my input fields: I found http://googlewebmastercentral.blogspot.com/2008/04/crawling-through-html-forms.html and think that you’ll like it.
    I will try to add something to my robots.txt.

    Comment by françois — October 20th, 2008 @ 23:50 h
