
Hide in plain sight:
Tricks for protecting spam websites.

Note: I’m not an expert in DNS, so most of the info on this page represents conjecture on my part (backed up with some web research). Any corrections or improvements are welcome.

Resources for sending spam mail are nearly limitless, it seems (thanks largely to fleets of vulnerable home computers that spammers can turn into open proxies). When one or another of these mail-sending hosts is busted and closed off, the spammer is not terribly inconvenienced since he can usually find plenty of others to take its place.

Spam websites, however, are not quite as cheap or plentiful or expendable as spam mail hosts. Therefore, these must be more carefully protected.

For these reasons, spammers invest considerable effort in making their websites difficult for investigators to trace and report. Most of this effort involves various manipulations of DNS, or else liberal use of redirection from expendable “portal sites.” On this page, we will look at some of these tricks.

DNS tricks

As I describe elsewhere, DNS (the domain name service) is the worldwide service that keeps track of the IP addresses associated with all the host names on the internet. DNS allows us to enter memorable domain names (like “rickconner.net”) in our browsers or e-mail programs, which are then translated (“resolved”) into IP addresses that the computer can understand.

In earlier times, spammers did not use domain names for their websites, instead calling them out by their raw IP addresses (like http://12.34.56.78/growdicks.html for example), so they had little or nothing to do with DNS. Nowadays, most spam websites are called out by domain names; spammers have now learned how to take advantage of certain peculiarities of DNS to confuse and stymie spam investigators, while at the same time allowing the suckers to continue visiting their sites.

Such tricks require access to internet resources around the world, and a great deal of practical expertise in IP networking and DNS, so you can see that the folks who use these techniques are a very sophisticated and well-connected crowd, a breed apart from the chickenboners who used to account for most spam.

We now turn our attention to a few of these DNS tricks.

Spam websites located at multiple IP addresses simultaneously

Suppose we’ve gotten a spam message advertising a website called http://www.spamgoods.foo/. We want to trace this site to an IP address for spam reporting, and to do this properly we need to find the authoritative DNS lookup for the site (and not just the one given us by our local name server). Our first step is to find out which are the authoritative name servers for the spamgoods.foo domain, in this case using the nslookup -type=ns command (you can also use host and dig for this, as well as web-based tools such as http://www.dnsstuff.com/):

rconner$ nslookup -type=ns spamgoods.foo
Server: bandit.prismnet.com
Address: 209.198.128.11

spamgoods.foo nameserver = ns1.spamgoods.foo
spamgoods.foo nameserver = ns2.spamgoods.foo
spamgoods.foo nameserver = ns3.spamgoods.foo
spamgoods.foo nameserver = ns1.monkeybrain.foo
ns1.spamgoods.foo internet address = 10.103.14.92
ns2.spamgoods.foo internet address = 172.181.19.242
ns3.spamgoods.foo internet address = 192.168.13.99
ns1.monkeybrain.foo internet address = 10.78.133.68

At this point, then, we have what seem to be four different name servers (highlighted in orange) reporting as authoritative for the domain (and three of them happen to have names within the spamgoods.foo domain itself). The nslookup command also gave us the IP addresses for these four name servers (in so-called “glue records,” since the name servers share the same domain name and would otherwise be unreachable to us).

Now, suppose we were to query each of these name servers individually for the address of www.spamgoods.foo, and were to get back the following info:

ns1.spamgoods.foo reports 10.103.14.92
ns2.spamgoods.foo reports 172.181.19.242
ns3.spamgoods.foo reports 192.168.13.99
ns1.monkeybrain.foo reports 192.168.13.99

Note that not only do we have three different addresses for the web host, but we find that each of these is the same address as one of the name servers. What is going on?
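The per-server queries described above can be scripted. Here is a minimal sketch using dig; the function name ask_each_ns is made up for illustration, and the sketch assumes that the name servers returned by the NS lookup can themselves be resolved from where you are sitting:

```shell
# ask_each_ns DOMAIN HOST   (hypothetical helper, not a standard tool)
# Ask every name server listed for DOMAIN which address it reports for HOST.
ask_each_ns() {
    domain=$1
    host=$2
    # "dig +short NS" prints one name server per line, with a trailing dot.
    dig +short NS "$domain" | while read -r ns; do
        ns=${ns%.}    # strip the trailing dot
        # Query that server directly, recursion off, so the answer comes
        # from the server itself rather than from some cache.
        dig +short +norecurse "@$ns" "$host" A </dev/null |
            while read -r addr; do
                printf '%s reports %s\n' "$ns" "$addr"
            done
    done
}
```

Calling ask_each_ns spamgoods.foo www.spamgoods.foo would print one "reports" line per address, in the same form as the list above. Different answers from different servers are the warning sign.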

When you find all these circumstances together in one case, you have a very suspicious website indeed.

In all likelihood, the spammer has simply set up website hosts or proxies at various different IP blocks (usually with offshore providers in places where anti-spam policing is lagging, or perhaps using hijacked “zombie” computers belonging to innocent home users); these website hosts also contain (or proxy for) basic name server applications that can be programmed to serve as authoritative DNS hosts for themselves, as it were.

What this buys for the spammer is simply redundancy — what signal engineers might call “space diversity” (if you imagine the set of all IP addresses as being a “space”) — so that a problem affecting any one of the hosts (e.g., a hosting provider shutting it down) will not affect the others.

When you try to load the spam URL with your browser, you will get exactly one of these hosts — that is, the one at the address returned by your local name server. Can you verify whether the others are operating as well? Possibly not.

Spam website IP addresses that change rapidly over time
(“rotating IPs”)

In addition to distributing their sites “in space” (by using multiple IP addresses at the same time), spammers can also distribute their sites “in time” by frequently changing the IP address pointed to by their authoritative servers. For example, you may look up www.swisswatchs.foo right now and get the address 10.99.29.187; if you look it up again in a couple of hours (or even a couple of minutes), you may get a completely different address. This provides evidence that the spammer is quickly “rotating” his site through a succession of IP addresses.

How does this work? Basically, the spammer continually reprograms his authoritative name servers to report new IP addresses for his site, and then sets a low cache TTL (“time-to-live”) value for these lookups. The TTL tells local name servers how long they should hang on to this record (in their caches) before refreshing it with another authoritative lookup. You can find out the TTL for a DNS lookup yourself by doing a bit of digging with dig. For example:

alu-g4pb:~ rconner$ dig rickconner.net

;; ANSWER SECTION:
rickconner.net. 14400 IN A 209.198.131.19

Here, in the answer section, we see that lookups for rickconner.net have a TTL of 14,400 seconds (4 hours); if someone looks up my site more than 4 hours after the last such request, the local name server will have to refresh its cache. This isn’t a big problem for anyone, since my site has a very low level of traffic compared to others. Heavily-used and stable sites like Google may have a TTL of several days, which saves the load on their name servers (from constant refreshing by local name servers).
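If you want just the TTLs and addresses without reading through the whole dig printout, a small filter will do. This is only a sketch; the name a_record_ttls is my own invention, and it assumes the usual five-column record layout (name, TTL, class, type, address) that dig prints:

```shell
# a_record_ttls   (hypothetical helper)
# Read dig output on standard input and print "TTL address" for every
# A record line found in it (answer section or otherwise).
a_record_ttls() {
    awk '$3 == "IN" && $4 == "A" { print $2, $5 }'
}
```

For example, dig rickconner.net | a_record_ttls would reduce the printout above to the single line "14400 209.198.131.19".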

Spammers who use rotating IPs usually set the TTL for their domains to be extremely short (e.g., as little as a couple of minutes), because this forces local name servers to refresh their caches almost every time the website is requested. This enables the spammers to alter their authoritative DNS info extremely frequently, thereby rapidly shuffling their sites among a large number of physical hosts. Here’s a typical example captured “in the wild” from an actual spam for a site named http://nertuil.com:

alu-g4pb:~ rconner$ dig nertuil.com

;; ANSWER SECTION:
nertuil.com. 180 IN A 82.53.44.121
nertuil.com. 180 IN A 68.41.72.126
nertuil.com. 180 IN A 138.238.177.25
nertuil.com. 180 IN A 69.245.255.124
nertuil.com. 180 IN A 82.245.80.89

Note the very short (180-second) TTL values for these lookups. Running the same command about five minutes later yields a (nearly) completely different set of addresses for the site:

alu-g4pb:~ rconner$ dig nertuil.com

;; ANSWER SECTION:
nertuil.com. 180 IN A 210.6.120.124
nertuil.com. 180 IN A 66.30.63.120
nertuil.com. 180 IN A 66.216.159.90
nertuil.com. 180 IN A 82.245.80.89
nertuil.com. 180 IN A 210.134.93.98

I tracked this particular lookup once every three minutes for two hours (using a simple shell script to repeatedly call dig), and found that the website was listed at 76 different IP addresses (possibly I might have found more if I’d let the script go on). A random check revealed that these addresses belonged to a wide variety of IP blocks, including one university and several large retail internet providers; this suggests that this site may be protected by reverse web proxies on “zombie” hosts. A more careful inspection of these addresses revealed that they were deployed more-or-less randomly over time.
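The polling script itself is not shown on this page. A minimal sketch of the same idea (not the original script) might look like the following; the function name poll_addresses and its three arguments are assumptions made for illustration:

```shell
# poll_addresses HOST ROUNDS INTERVAL   (hypothetical helper)
# Look HOST up ROUNDS times, INTERVAL seconds apart, and print every
# distinct address seen, one per line.
poll_addresses() {
    host=$1
    rounds=$2
    interval=$3
    i=0
    while [ "$i" -lt "$rounds" ]; do
        dig +short "$host" A
        i=$((i + 1))
        # No need to wait after the final round.
        if [ "$i" -lt "$rounds" ]; then
            sleep "$interval"
        fi
    done | sort -u    # collapse repeats so each address appears once
}
```

Forty rounds at 180 seconds apart covers about two hours, so poll_addresses nertuil.com 40 180 | wc -l would give a count of distinct addresses comparable to the one reported above.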

Spam website IP addresses that “disappear” from DNS

Suppose you get no result (i.e., no IP address) when you try to do an authoritative DNS lookup on a spam website. This means that the site is no longer running or reachable, right? Well, maybe not.

Some spammers will “stuff the cache” by making requests for their site from name servers at various large ISPs in order to get these lookups into the local name server caches for these ISPs. Then, they simply turn off authoritative name service for the domain. In this way, suckers can still reach the site (at least for the limited duration of the cache records) while spam investigators will be unable to do the authoritative lookups they require in order to make credible reports (in such cases, the spammer will probably use a very long TTL so that his data stay in the cache for a good long while).

If you suspect that a particular spam site may be using this trick (i.e., you can get a local cached DNS lookup for the site, but not an authoritative DNS lookup), then you have one trick of your own to play: go to http://www.dnsstuff.com/ and use the “ISP cached DNS lookup tool” (this tool may now require a premium DNSStuff membership); this tool will look up the host name you provide at dozens of name servers operated by various ISPs around the world, and will tell you whether these ISPs have cached data for the host (and how much longer these data will be available before they are purged from the cache).

Don’t be surprised if this tool finds several different IP addresses among the cached data. However, as with the other ruses described above, it probably doesn’t pay you to try to load, confirm, and report all of the addresses found. You can really only deal with the address (if any) returned by your local name server, and you should leave the others to someone else.
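A rough do-it-yourself version of the cached lookup can be tried with dig by sending non-recursive queries to name servers you are allowed to use. This is only a sketch: the function name cached_lookup is made up, you must supply the resolver addresses yourself, and many ISP name servers refuse queries from outside their own networks or ignore non-recursive requests altogether.

```shell
# cached_lookup HOST RESOLVER...   (hypothetical helper)
# Ask each RESOLVER, without recursion, for whatever A records it already
# holds for HOST. Prints "resolver address remaining-TTL" per record.
cached_lookup() {
    host=$1
    shift
    for resolver in "$@"; do
        # +norecurse: answer only from data already on hand (the cache).
        # +noall +answer: print nothing but the answer section.
        dig +norecurse +noall +answer "@$resolver" "$host" A </dev/null |
            awk -v r="$resolver" '$4 == "A" { print r, $5, $2 }'
    done
}
```

The TTL on a cached record counts down, so the last column is roughly how many seconds remain before that resolver purges the record.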

Spam website cloaking

Another effective way to protect a complex and valuable spam web host installation is to hide it behind other simpler and more expendable web hosts. We might call this practice “website cloaking.”

This cloaking helps the spammer protect his primary website behind a phalanx of “portal” sites, often set up on “zombie” computers, web proxies, or free web hosting services like Geocities or Blogger or Googlepages. These portal sites provide three important benefits to the spammer:

Generally, spam investigators might be able to track down the portal sites and report them, but they may not know of the websites that sit behind them, or may not have the time, expertise, or resources to track and report these sites.

There are many means that spammers can use to accomplish this cloaking, such as:

Of these, the first two are by far the most common these days.

We’ll look at some examples below, but for the moment I note that the best way to spot cloaking activity is to use a web fetch tool (such as curl or wget) to get the HTML for the spam web page, and then examine the markup for signs of redirection, linking-out in frames, suspicious scripts, or other signs of cloaking.

HTTP-level redirection (via 301 status code)

Here’s a curl -i fetch (showing just the HTTP header) for the website named in a diploma-mill spam:

alu-g4pb:~ rconner$ curl -i http://e8.se/4bw
HTTP/1.1 301 Moved Permanently
Date: Thu, 17 Aug 2006 04:06:49 GMT
Server: Apache
X-Powered-By: PHP/5.1.1
Set-Cookie: PHPSESSID=cb4f68bd4d13108027b0d3294108c695; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Location: http://uk.geocities.com/fsbajdbskdw378e7/
Content-Length: 0
Content-Type: text/html; charset=iso-8859-1

Thanks to the curl -i option (which prints the HTTP header), we see that we got a 301 code from the server at e8.se (highlighted in orange), telling us that the site has moved elsewhere (highlighted in blue). The Geocities link, when visited, showed a typical diploma-mill website. One additional bit of info comes to light when you load the URL http://e8.se/ (without the trailing info): it leads you to a URL-shortening proxy service at tny.se (similar to the tinyurl.com service). This redirection appears to have been set up to protect the spammer’s Geocities website (since few investigators will go so far as to follow the redirection and report this site).

The URL given in the redirect message can also contain affiliate ID info, so that the spam website proprietor knows whom to reward for any sales generated from your visit to the site.
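To pull just the redirect target out of a curl -i printout, something like the following sketch would work. The function name redirect_target is my own, and the sketch only looks at the first Location line in the header block:

```shell
# redirect_target   (hypothetical helper)
# Read an HTTP response (headers first) on standard input and print the
# value of its Location header, if there is one.
redirect_target() {
    tr -d '\r' |                     # HTTP header lines end in CR LF
        awk '/^$/ { exit }           # a blank line ends the headers
             tolower($1) == "location:" { print $2; exit }'
}
```

For the example above, curl -s -i http://e8.se/4bw | redirect_target would print the Geocities URL and nothing else. No output means the response carried no redirect.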

HTML redirect (using META tag)

Now, let’s look at a curl fetch of a spam pharmacy website:

alu-g4pb:~ rconner$ curl http://respicko.com

<html>
<head>
<meta http-equiv="refresh" content="0;URL=http://gropedit.com/">
</head>
<body>
<a href="http://www.variousus.com/cgi-bin/whole.cgi?podstavos=">_
</a>
</body>
</html>

We see that this HTML markup contains a META refresh tag that sends us immediately to another site (highlighted in purple); this represents an example of HTML or META redirection. Another use of curl at the target (redirected-to) website yields:

alu-g4pb:~ rconner$ curl http://gropedit.com/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML><HEAD><TITLE>HealthSuite</TITLE>

[ snipped ...]

<H3>Allergies</H3><P><A href="index.asp?p=241">Clarinex</A></P><P><A href="index.asp?p=248">Promethazine</A></P><P><A href="index.asp?p=250">Zyrtec</A></P><H3>Anti-Inflammatory</H3><P><A href="index.asp?p=264">Bextra</A></P><P><A href="index.asp?p=265">Diclofenac</A></P>
<H3>Antibiotics</H3><P><A href="index.asp?p=266">Amoxicillin</A></P><P><A href="index.asp?p=267">Amoxil</A></P><P><A href="index.asp?p=268">Biaxin</A></P><P><A

[ snipped ...]

I included only a small portion of the printout, but it clearly shows that there’s a website running, and it appears to be selling pharmaceutical drugs. So, in this case, we can conclude that this site is reportable for spam.

The HTML redirect described here is easy to do, since it does not require access to the configuration of the host web server; all the spammer has to do is simply to drop a META tag into his HTML markup. Therefore, this trick is most often used in the portal pages set up by spammers on free-hosting services like Geocities, where the users often cannot directly modify the behavior of the host web server. As with the HTTP redirect, the URLs in the META tags can include affiliate ID info.
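Spotting a META refresh can be automated in a crude way. The sketch below assumes the whole META tag sits on a single line, as it does in the example above; the function name meta_refresh_target is made up for illustration, and a real HTML parser would be more reliable:

```shell
# meta_refresh_target   (hypothetical helper)
# Read HTML on standard input and print the URL named in the first
# META refresh tag found. Assumes the tag is not split across lines.
meta_refresh_target() {
    grep -i 'http-equiv=.*refresh' |
        sed -n "s/.*[Uu][Rr][Ll]=[\"']*\([^\"'> ]*\).*/\1/p" |
        head -n 1
}
```

Running curl -s http://respicko.com | meta_refresh_target against the page shown above would print http://gropedit.com/, which you can then fetch in turn.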

Using <FRAME> tags to pull info into an expendable site

Here’s a live example (from another diploma-mill spam) that uses HTML frames to pull content from a protected spam website into a disposable portal site; I captured the markup for this website using curl.

alu-g4pb:~ rconner$ curl -i http://1slc.com/1/?r=1j
HTTP/1.1 200 OK
Date: Mon, 18 Sep 2006 23:17:30 GMT
Server: Apache/1.3.33 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.3.11 FrontPage/5.0.2.2635 mod_ssl/2.8.22 OpenSSL/0.9.7a
X-Powered-By: PHP/4.3.11
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

<html>
<head>
<title>1SafelistCentral Short URL</title>
</head>
<frameset rows="*,80" frameborder="0">
<frame name="page" src="http://uk.geocities.com/leaveyourworriesbehind654/">
<frame name="ad" src="ad.php" scrolling="no">
</frameset>
<noframes></noframes>

This vestigial bit of markup (in green) contains little more than an HTML frameset; into the top frame is loaded the actual spam payload, taken from the UK Geocities site highlighted in blue. If you were to view this page, however, you’d probably have no idea that you were dealing with a site from Geocities unless you dumped the markup to check for yourself. More than likely, the site at 1slc.com is going to absorb all the spam complaints, while the Geocities site will remain relatively shielded from the onslaught (interestingly, perhaps, this spammer appears to be the same party who used the HTTP redirection in the example above).
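Dumping the markup and picking out the frame sources can be done in one step. This sketch assumes one frame tag per line with a quoted src attribute, as in the markup above; the function name frame_sources is my own:

```shell
# frame_sources   (hypothetical helper)
# Read HTML on standard input and print the src of each <frame> or
# <iframe> tag. Assumes one tag per line and a quoted src attribute.
frame_sources() {
    grep -i '<i\{0,1\}frame[ >]' |
        sed -n "s/.*[Ss][Rr][Cc]=[\"']\([^\"']*\)[\"'].*/\1/p"
}
```

For the page above, curl -s 'http://1slc.com/1/?r=1j' | frame_sources would list the Geocities URL followed by ad.php. Any source on a different host from the page itself deserves a closer look.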

Also worth noting here is that although 1slc.com claims to be a link-shortening service (like TinyURL), it seems to act more like a hosting service:





(c) 2003-2008, Richard C. Conner


Updated: Thu, 26 Jun 2008
