
Finding the mail host(s) that sent the spam

What we’re doing on this page: Using the information in the exposed mail header to find which host(s) sent the spam.

Once you’ve exposed the header of your spam message, you can then read it to identify the host (or hosts) that handled this mail on its way from the spammer to you.

To do this, we will focus on the lines in the header that begin with “Received:”; there should be at least one of these lines, one for each SMTP handoff between mail hosts, and together as a set, they describe the path taken by the message.
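If you would rather pull these lines out with a script than by eye, here is a minimal sketch using Python's standard `email` package. The sample message, its host names, and the helper name `received_lines` are all made up for illustration; nothing here comes from a real mail.

```python
from email import message_from_string

# Hypothetical raw message; host names and addresses are placeholders.
RAW_MESSAGE = """\
Received: from relay.example.net (relay.example.net [192.0.2.10])
   by mx.example.org with ESMTP id abc123;
   Sat, 22 Oct 2005 17:33:21 -0500
Received: from sender.example.com ([198.51.100.7])
   by relay.example.net with SMTP id def456;
   Sat, 22 Oct 2005 17:33:19 -0500
Subject: sample

body text
"""

def received_lines(raw):
    """Return the Received: header values, top (newest) first, unfolded."""
    msg = message_from_string(raw)
    values = msg.get_all("Received") or []
    # Collapse folded continuation lines into single-spaced text.
    return [" ".join(value.split()) for value in values]

for number, line in enumerate(received_lines(RAW_MESSAGE), start=1):
    print(number, line)
```

The list comes back in the order the lines appear in the header, so the first entry is the most recent handoff.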

You’ll usually have either one or two hosts to find, depending upon the method used to transmit the spam:

Typical direct-to-MX spam

Let’s start with an easy example, a spam in which the spammer forged the header information and included a bogus routing line. As we will shortly see, this spam represents an example of “direct-to-MX” spam, the most prevalent type these days. Our goal here is to find the IP address of the machine that delivered the mail to my ISP's domain.

Here are all three of the Received: lines as they appeared in the header of this message:

Received: from defender.io.com (defender.io.com [209.198.128.79])
   by postoffice.prismnet.com (8.13.4/8.13.3) with ESMTP id
   j9MMXLnh051636 for address hidden;
   Sat, 22 Oct 2005 17:33:21 -0500 (CDT)
   (envelope-from address hidden)

Received: from -1214250648 ([210.116.242.207])
   by defender.io.com (8.13.4/8.13.3) with SMTP id j9MMWT4q007152
   for address hidden; Sat, 22 Oct 2005 17:33:19 -0500 (CDT)
   (envelope-from address hidden)

Received: from mcimail.com (-1215863112 [-1215979528])
   by derechoshumanos.com (Qmailv1) with ESMTP id 678940028C
   for address hidden; Sat, 22 Oct 2005 18:33:45 -0400

There really are just three lines here — I “folded” (indented) them and colorized them for clarity. I also defaced any e-mail addresses where they appeared (to keep spambots from re-harvesting them).

Together, these lines give us a history of the progress of the message from when it was first sent (the bottom line) to the point where it was received by a mail host within my domain (the top line). Each line indicates a from-host (the computer that handed off the message, or the “pitcher”) and a by-host (the computer that received the message, or the “catcher,” also the machine that was responsible for writing the particular header line into the message).

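In a properly-formed mail header, these records should form an unbroken chain, with the “by-host” on one line being the same as the “from-host” on the very next line above it. In other words, the Reader’s Digest Condensed Version of a typical honest (unforged) mail header would look something like this:

Received:
   from host host-C.somewhere.foo   by host host-D.here.fum
Received:
   from host host-B.elsewhere.bar   by host host-C.somewhere.foo
Received:
   from host host-A.origin.goo      by host host-B.elsewhere.bar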

We see that this header describes the transfer of this imaginary message from host-A.origin.goo (where it starts) to host-B.elsewhere.bar, then on to host-C.somewhere.foo, and finally on to host-D.here.fum (whence it was picked up by the recipient’s own computer). Each relaying host uses the same name for both its roles (as a by-host and as a from-host), so it is easy to trace the message.

Stripping out the bric-a-brac from the header of our spam specimen to match the diagram above, we find:

Received:
   from host defender.io.com  by host postoffice.prismnet.com
Received:
   from host “-1214250648”    by host defender.io.com
Received:
   from host mcimail.com      by host derechoshumanos.com

… which, if you are observant, should already clue you in to some problems with this message.
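The “unbroken chain” rule is mechanical enough to sketch in a few lines of Python. This is only an illustration: the `HOPS` list is the condensed listing above typed in by hand, and `chain_breaks` is a made-up helper name. It compares each line's by-host with the from-host on the line above it and reports the lines that fail to connect.

```python
# (from_host, by_host) pairs, top line first, copied from the condensed listing.
HOPS = [
    ("defender.io.com", "postoffice.prismnet.com"),
    ("-1214250648", "defender.io.com"),
    ("mcimail.com", "derechoshumanos.com"),
]

def chain_breaks(hops):
    """Return 1-based numbers of lines whose by-host differs from the
    from-host on the line directly above."""
    breaks = []
    for index in range(1, len(hops)):
        from_host_above = hops[index - 1][0]
        by_host_here = hops[index][1]
        if by_host_here.lower() != from_host_above.lower():
            breaks.append(index + 1)
    return breaks

print(chain_breaks(HOPS))  # prints [3]
```

A name comparison like this only tells you where the chain stops linking up. It cannot tell you which of the names involved is the forged one, which is why the lookups that follow are still needed.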

So much for the condensed version. Now, let’s take a closer look at these lines. I find that the best plan is to start from the top (i.e., the latest line in time sequence) and work backwards in sequence, because the top lines are more likely to be trustworthy and thus won’t be a waste of your time.

  1. In the top line, a machine named postoffice.prismnet.com received the mail from another host at 209.198.128.79 that identified itself with the name “defender.io.com.”

Ask DNS for the IP address for this host name...
[G4733:~] rconner% host defender.io.com
defender.io.com has address 209.198.128.79

Ask DNS for the host name associated with this IP address...
[G4733:~] rconner% host 209.198.128.79
79.128.198.209.in-addr.arpa domain name pointer defender.io.com.
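The same pair of lookups (name to address, then address back to name) can be done from Python with the standard `socket` module. The sketch below is one possible way to wrap them; `names_match` and the stub resolvers are hypothetical names, and the stubs simply replay the answers from the transcript above so that the example runs without touching the network.

```python
import socket

def names_match(host_name, ip_address,
                forward=socket.gethostbyname,
                reverse=socket.gethostbyaddr):
    """True if the name resolves to the address AND the address maps
    back to the same name."""
    try:
        forward_ip = forward(host_name)
        reverse_name = reverse(ip_address)[0]
    except OSError:
        # socket.gaierror and socket.herror (lookup failures) land here.
        return False
    same_name = reverse_name.rstrip(".").lower() == host_name.rstrip(".").lower()
    return forward_ip == ip_address and same_name

# Stub resolvers replaying the transcript above; omit them to query live DNS.
def stub_forward(name):
    return "209.198.128.79"

def stub_reverse(address):
    return ("defender.io.com.", [], [address])

print(names_match("defender.io.com", "209.198.128.79",
                  forward=stub_forward, reverse=stub_reverse))  # prints True
```

Called without the `forward` and `reverse` arguments, the function asks the system resolver, so the answer then depends on whatever DNS says today rather than in 2005.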

  2. In the second line, the machine defender.io.com received the message from a host at 210.116.242.207, identifying itself with the “HELO” host name “-1214250648.”

Get the address from the name...
[G4733:~] rconner% host -1214250648
host: illegal option -- 1
<<rest of error message snipped>>

Get the name from the address...
[G4733:~] rconner% host 210.116.242.207
Host 207.242.116.210.in-addr.arpa not found: 2(SERVFAIL)
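Notice that the reverse lookups query an odd-looking name ending in “in-addr.arpa,” which is just the address with its four numbers reversed. As a small illustration, Python's standard `ipaddress` module will build that name for you (`ptr_name` is a made-up helper name):

```python
import ipaddress

def ptr_name(address_text):
    """Return the in-addr.arpa name that a reverse DNS lookup queries."""
    return ipaddress.ip_address(address_text).reverse_pointer

for address in ("209.198.128.79", "210.116.242.207"):
    print(ptr_name(address))
# prints:
# 79.128.198.209.in-addr.arpa
# 207.242.116.210.in-addr.arpa
```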

  3. In the last line, a machine identified as “derechoshumanos.com” received the mail from a machine at the address “-1215979528” which identified itself as “mcimail.com.”

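We are now finished with our header analysis work. Here’s what we learned: the top line records a trustworthy handoff between two hosts inside my ISP’s domain; the second line is the last valid line, and shows the spam arriving at defender.io.com from 210.116.242.207 under a forged HELO name; and the last line is a forgery that we can ignore.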

At this point, we are now ready to find out what internet service controls 210.116.242.207, and therefore to whom we can send a spam report. You can go straightaway to the next step in the process to learn about this, or you can keep reading to look at these headers a bit more closely (or else you can skip down and have a look at an open-relay spam message).

The grubby details

Normally, the procedure above is all you need in order to identify the mail hosts responsible for sending spam. There is other information to be gleaned from headers, however, if you’re curious. Let’s back up now and take a closer look at these three lines to see what else we can learn that might be of interest.

The top line

Here’s the top line of the header again. As you may recall, this line showed an internal handoff of the message within my ISP’s domain:

Received: from defender.io.com (defender.io.com [209.198.128.79])
   by postoffice.prismnet.com (8.13.4/8.13.3) with ESMTP id
   j9MMXLnh051636 for address hidden;
   Sat, 22 Oct 2005 17:33:21 -0500 (CDT)
   (envelope-from address hidden)

We can break this line up into individual pieces to see exactly what each tells us:

from defender.io.com (defender.io.com [209.198.128.79])

by postoffice.prismnet.com (8.13.4/8.13.3)

with ESMTP

id j9MMXLnh051636

for address hidden;

Sat, 22 Oct 2005 17:33:21 -0500 (CDT)

(envelope-from address hidden)
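The same breakdown can be roughed out in code. The sketch below is a simplification, not a full parser for the Received: syntax: it assumes the clause keywords (from, by, with, id, for) appear only as standalone words before the semicolon, and that everything after the semicolon is the date plus optional parenthesized comments. `split_received` is a hypothetical helper name.

```python
import re

# A clause keyword at the start of the text or after whitespace.
CLAUSE = re.compile(r"(?:^|\s)(from|by|with|id|for)\s+")
COMMENT = re.compile(r"\s*\([^)]*\)")

def split_received(value):
    """Split one unfolded Received: value into its clauses."""
    text = " ".join(value.split())
    clause_part, _, date_part = text.partition(";")
    pieces = CLAUSE.split(clause_part)
    # pieces looks like ["", "from", "...", "by", "...", ...]
    fields = {}
    for keyword, content in zip(pieces[1::2], pieces[2::2]):
        fields.setdefault(keyword, content.strip())
    # Drop trailing comments such as "(CDT)" from the timestamp.
    fields["date"] = COMMENT.sub("", date_part).strip()
    return fields

TOP_LINE = """from defender.io.com (defender.io.com [209.198.128.79])
   by postoffice.prismnet.com (8.13.4/8.13.3) with ESMTP id
   j9MMXLnh051636 for address hidden;
   Sat, 22 Oct 2005 17:33:21 -0500 (CDT)
   (envelope-from address hidden)"""

for keyword, content in split_received(TOP_LINE).items():
    print(keyword, "=>", content)
```

A header whose comments happen to contain one of the keywords as a separate word would confuse this, so treat it as a reading aid and not as something to build a filter on.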

The second line

Received: from -1214250648 ([210.116.242.207])
   by defender.io.com (8.13.4/8.13.3) with SMTP id j9MMWT4q007152
   for address hidden; Sat, 22 Oct 2005 17:33:19 -0500 (CDT)
   (envelope-from address hidden)

Here’s the second line, which we identified above as being the last valid line in the stack, the line showing the spammer leaving the mail on my domain’s MX host. I’ll leave the full parsing of this line to you as an exercise, but let’s take a look at the from-host info:

from -1214250648 ([210.116.242.207])

According to this, the from-host gave its host name in the HELO command as “-1214250648,” which is an obvious forgery since it does not fit the required patterns either for host names or IP addresses. The by-host, however, collected the “TCP-Info,” the actual IP address of the from-host (in the parentheses). We don’t see any confirmation of the host name for this address, but this is probably because no host name is registered for it within DNS (as we saw above when we used the host command for a reverse DNS lookup on the address). As I noted, this is an example of a “forged HELO,” which is pretty much universal in spam mail.
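To make “does not fit the required patterns” concrete, here is a rough syntax check. It assumes the usual letters-digits-hyphens rule for host name labels (no label may begin or end with a hyphen) and uses Python's `ipaddress` module for the address case; the function names are made up for this example. Passing the check says nothing about whether a HELO name is truthful, only that it is well-formed.

```python
import ipaddress
import re

# One host name label: letters, digits, hyphens; no leading or trailing hyphen.
LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")

def looks_like_host_name(text):
    name = text.rstrip(".")
    if not name or len(name) > 253:
        return False
    return all(LABEL.match(label) for label in name.split("."))

def looks_like_ip_address(text):
    try:
        ipaddress.ip_address(text.strip("[]"))
    except ValueError:
        return False
    return True

def plausible_helo(text):
    """True if the HELO string is at least syntactically a name or an address."""
    return looks_like_host_name(text) or looks_like_ip_address(text)

print(plausible_helo("-1214250648"))      # prints False
print(plausible_helo("defender.io.com"))  # prints True
```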

The last line

As we noted, the last of the routing lines was a complete lie:

Received: from mcimail.com (-1215863112 [-1215979528])
   by derechoshumanos.com (Qmailv1) with ESMTP id 678940028C
   for address hidden; Sat, 22 Oct 2005 18:33:45 -0400

It has a false HELO (a plain domain name and not a fully-qualified mail host name), and even false host-lookup info in the from-clause (both of these are the same kind of big negative integers we saw in the HELO on the previous line). You’re actually likely to see most anything in these forged headers, but none of it should be believed. Possibly the from-host and by-host names exist, and may even map to the addresses shown for them, but there’s no point in testing them because this line has been proven to be bogus.
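One quick way to see that the host-lookup info in this from-clause is nonsense is to pull out whatever sits between the square brackets and ask whether it parses as an IP address at all. The sketch below does that with Python's `ipaddress` module; `recorded_address` is a hypothetical helper name.

```python
import ipaddress
import re

BRACKETED = re.compile(r"\[([^\]]*)\]")

def recorded_address(from_clause):
    """Return the bracketed address as an ipaddress object, or None if
    there is no bracketed text or it is not a valid IP address."""
    match = BRACKETED.search(from_clause)
    if match is None:
        return None
    try:
        return ipaddress.ip_address(match.group(1))
    except ValueError:
        return None

print(recorded_address("mcimail.com (-1215863112 [-1215979528])"))  # prints None
print(recorded_address("-1214250648 ([210.116.242.207])"))  # prints 210.116.242.207
```

A valid-looking address in a forged line proves nothing, of course; this check only catches forgeries as clumsy as this one.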

Because forged header lines are so easy to detect, and because spammers can spam away with impunity behind open-proxy machines, you don’t see quite so many elaborately-phony header lines (like this one) as you used to.

Open relay spam

The sample spam above represented direct-to-MX spamming, which is by far the most common spam transmission technique these days. We should also look at an example of open-relay spamming, even though it is less often used today. In this case you’ll need to recognize that you will have two IP addresses to report: the originating address, and the address of the open relay mail host that forwarded the mail to you. The basic technique (of “walking the chain” of addresses and host names) is the same here as for direct-to-MX.

Here’s a phonied-up example I put together (mainly because I can’t find any recent examples of open-relay spam in my inbox):

Received: from mta5.some-isp.bar (mta5.some-isp.bar [10.9.8.7]) 
   by deliverer.some-isp.bar (8.13.4/8.13.3)
   with SMTP id <sdofijsdoijfs>
   for somebody@some-isp.bar; Sun, 30 Oct 2005 01:51:12 -0500 (CDT)
Received: from mailrelay.clueless.foo
   (mailrelay.clueless.foo [12.34.56.78])
   by mta5.some-isp.bar (8.13.4/8.13.3)
   with SMTP id <sdofijsdoijfs>
   for somebody@some-isp.bar; Sun, 30 Oct 2005 01:51:11 -0500 (CDT)
Received: from WHITEHOUSE.GOV (dyn-pool-43-21.zombies.foo [87.65.43.21])
   by mailrelay.clueless.foo (Brand X MTA)
   with SMTP id <98734092834>; Sun, 30 Oct 2005 01:42:32 -0500
Received: from MICROSOFT.COM [98.76.54.32]
   by KREMLIN.RU with smtp id 1YeemX-186Bwp-99;
   Sat, 31 Apr 2023 19:21:58 +1300

So, what do we conclude? We know for sure that the mail came to us from mailrelay.clueless.foo; since this host’s identity has been pretty well proven, we can assume that it is an “honest” mail host (or at least that it writes believable header lines), and we can therefore believe it when it says it got the mail from 87.65.43.21. So, this means that we need to report 87.65.43.21 as a spam source, and mailrelay.clueless.foo as a possible open relay.
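The reasoning above can be sketched as a small “walk the chain” routine. This is a simplified model under stated assumptions: each line has been reduced by hand to a from-name, a from-address, and a by-host; a from-name is filled in only where DNS lookups confirmed it (so the forged bottom line gets `None`); and lines written by hosts in your own ISP's domain are trusted outright. The names `HOPS`, `believed_hops`, and `report_targets` are invented for this example.

```python
OWN_DOMAIN = "some-isp.bar"

# Top line first. "from_name" holds a DNS-confirmed name, or None if unconfirmed.
HOPS = [
    {"from_name": "mta5.some-isp.bar", "from_ip": "10.9.8.7",
     "by": "deliverer.some-isp.bar"},
    {"from_name": "mailrelay.clueless.foo", "from_ip": "12.34.56.78",
     "by": "mta5.some-isp.bar"},
    {"from_name": "dyn-pool-43-21.zombies.foo", "from_ip": "87.65.43.21",
     "by": "mailrelay.clueless.foo"},
    {"from_name": None, "from_ip": "98.76.54.32", "by": "KREMLIN.RU"},
]

def in_domain(name, domain):
    name = name.lower()
    return name == domain or name.endswith("." + domain)

def believed_hops(hops, own_domain):
    """Walk down from the top, keeping lines written by hosts we can trust."""
    trusted = set()
    believed = []
    for hop in hops:
        by_host = hop["by"].lower()
        if not (in_domain(by_host, own_domain) or by_host in trusted):
            break  # nothing from here down can be believed
        believed.append(hop)
        if hop["from_name"]:
            trusted.add(hop["from_name"].lower())
    return believed

def report_targets(hops, own_domain):
    """Return (origin address, list of possible open-relay addresses)."""
    believed = believed_hops(hops, own_domain)
    if not believed:
        return None, []
    origin = believed[-1]["from_ip"]
    relays = [hop["from_ip"] for hop in believed[:-1]
              if not (hop["from_name"] and in_domain(hop["from_name"], own_domain))]
    return origin, relays

print(report_targets(HOPS, OWN_DOMAIN))  # prints ('87.65.43.21', ['12.34.56.78'])
```

For a direct-to-MX spam the relay list simply comes back empty, leaving only the origin address to report.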

“Nonconforming” internal relay header lines

Of late, many ISPs have begun doing odd things to their mail transfer setups, and these can make matters somewhat confusing if you’re trying to trace headers. For example, a few years back my home ISP started producing mail headers that looked like this one (from another actual spam mail):

Received: from 24-107-21-223.dhcp.stls.mo.charter.com
   ([172.18.12.131]) by vms039.mailsrvcs.net
   (Sun Java System Messaging Server 6.2-2.05 (built Apr 28 2005))
   with ESMTP id <0IP500KI2V19KHB0@vms039.mailsrvcs.net>
   for address hidden; Sun, 30 Oct 2005 01:51:11 -0500 (CDT)
Received: from 24-107-21-223.dhcp.stls.mo.charter.com (24.107.21.223)
   by sv11pub.verizon.net (MailPass SMTP server v1.2.0 -
   080905135255JY+PrW)
   with SMTP id <2-22412-121-22412-24190-3-1130655048> for
   vms039pub.verizon.net; Sun, 30 Oct 2005 01:51:11 -0500
Received: from [192.168.96.64] (port=30744 helo=LAPFBY)
   by mx2.cheneybrothers.com with smtp id 1YeemX-186Bwp-99 for
   address hidden; Sat, 29 Oct 2005 23:51:55 -0800

Can you spot the confusing parts? They’re in the first and second lines (the third line is a complete forgery).

If you tried to trace a chain through this header, you’d be immediately stopped at the second line (since both of the first two lines have the same from-host and different by-hosts). What’s going on? I really have no idea, and I suspect I’ll win the Boston Marathon before I manage to tease out the answer from Verizon’s tech support minions. However, I’m guessing that the first line represents some sort of internal handoff within private Verizon network space, possibly to a spam-detection host or just to a normal mail delivery host (MDA). I’ve therefore decided that the first header line can simply be ignored for purposes of spam tracing, since I consider Verizon mail hosts to be trustworthy.

Try tracing this header yourself (ignoring the first line), and you should find that the mail originated from 24.107.21.223 (within a net block for Charter Communications in the St. Louis area) and appears to be a case of direct-to-MX spam (no intermediate MTAs).
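One clue that a line describes an internal handoff is a from-address in private network space, which cannot be a source out on the public internet. Python's `ipaddress` module knows the reserved private ranges, so a quick check over the three addresses in this header looks like this (`is_private` is just a made-up wrapper name):

```python
import ipaddress

def is_private(address_text):
    """True if the address lies in a range reserved for private networks."""
    return ipaddress.ip_address(address_text).is_private

for address in ("172.18.12.131", "24.107.21.223", "192.168.96.64"):
    print(address, is_private(address))
# prints:
# 172.18.12.131 True
# 24.107.21.223 False
# 192.168.96.64 True
```

Only 24.107.21.223 is a public address, which agrees with the result of tracing the header by hand.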





(c) 2003-2008, Richard C. Conner


Updated: Sat, 14 Jun 2008