So now you know how to find that one Received: line documenting how the spam arrived at your ISP from some other place on the InterNet. Here are two samples of that one line, one for spam that intruded into my Yahoo Mail account, and one that intruded into my ISP account:
Received: from 203.90.87.75 (EHLO bg.mx2.e-tapaal.com) (203.90.87.75) by mta555.mail.yahoo.com with SMTP; 21 Apr 2002 15:22:43 -0700 (PDT)
Received: from workfromhomenewsletter.com ([63.230.24.146]) by mail.netmagic.net (8.11.6/8.11.6) with SMTP id g37JUXx09596 for ; Sun, 7 Apr 2002 12:30:40 -0700
The highlighted IP number (the one inside the parentheses) is the *only* part of that line that can be trusted to tell you where the spam came from, either where it directly came from, or where the last relay was before it arrived at your own ISP. Anything to the left of that number in that same Received line shows either what the spammer claimed as host name, which may or may not be true, or reverse-DNS information, which while slightly true may change from day to day as the spammer tries to evade getting caught. Anything to the right of that number is just local information about your own host, time of day, etc., all of which you can trust but none of which gives you any useful info about the spammer's location.
Notice how there are two different formats, where the IP number is within round-parentheses (format used by Yahoo) or within square-brackets within round-parentheses (format used by NetMagic). Your own ISP will use one of these two formats consistently, so you just need to look at your headers once to see what format your ISP uses for incoming e-mail from outside sources, then you know better what to look for the next time you get spam.
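If you'd rather not pick the number out by eye, here is a minimal Python sketch that handles both formats shown above. The function name `source_ip` and the pattern are my own illustration, not part of any mail tool, and the sketch assumes the trusted IP number is the first parenthesized IPv4 number before the word "by", which holds for the two samples but may not hold for every ISP's format.

```python
import ipaddress
import re

# An IPv4 number in round parentheses, optionally also in square brackets:
# matches "(203.90.87.75)" and "([63.230.24.146])".
_PAREN_IP = re.compile(r"\(\[?(\d{1,3}(?:\.\d{1,3}){3})\]?\)")

YAHOO_SAMPLE = (
    "Received: from 203.90.87.75 (EHLO bg.mx2.e-tapaal.com) (203.90.87.75) "
    "by mta555.mail.yahoo.com with SMTP; 21 Apr 2002 15:22:43 -0700 (PDT)"
)
NETMAGIC_SAMPLE = (
    "Received: from workfromhomenewsletter.com ([63.230.24.146]) "
    "by mail.netmagic.net (8.11.6/8.11.6) with SMTP id g37JUXx09596 "
    "for ; Sun, 7 Apr 2002 12:30:40 -0700"
)


def source_ip(received_line):
    """Return the parenthesized IP number from the 'from' part, or None."""
    # Only look left of " by ": everything after it is your own host's info.
    from_part = received_line.split(" by ", 1)[0]
    match = _PAREN_IP.search(from_part)
    if match is None:
        return None
    try:
        # Reject things that look like an IP number but aren't (e.g. 999.1.1.1).
        return str(ipaddress.IPv4Address(match.group(1)))
    except ValueError:
        return None


if __name__ == "__main__":
    print(source_ip(YAHOO_SAMPLE))
    print(source_ip(NETMAGIC_SAMPLE))
```

Note that the claimed host name and the EHLO string are deliberately ignored, since only the parenthesized number was recorded by your own ISP's server.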
So far you've gotten spam, decided to complain about it, looked in the header to find the key Received line added by your own ISP upon receipt of the e-mail, and copied down the IP number in that key Received line, so you know the IP number of the host which trespassed in your own ISP's SMTP server to litter your inbox with the spam, which in these two examples were 203.90.87.75 and 63.230.24.146 respectively. So next you need to find out, for those given IP numbers, where to complain. As the old pirate would say "Aye, there's the rub!!"
At present there is no database or lookup engine anywhere on the net that will reliably map from those IP numbers to a complaint address at the ISP that owns that IP number. SpamCop has a lookup engine that tries to provide that information, but most of the addresses it gives out haven't been checked by anyone, and many of them don't work at all, so if you try to send e-mail there your e-mail bounces back undelivered.
Perhaps the worst case is when the host named in that complaint address does exist, so your own ISP's mailer daemon tries to connect to an SMTP server there, but in fact there's no SMTP server on that host. Your own mail system then tries for five days to connect there, never succeeds, sends you a "transient non-fatal errors" message after anywhere from a few hours to two days, then sends a final non-delivery notice after the full five days. So all that time you may have thought you'd successfully complained, but in fact nobody has heard you, and now you have to try again to find some valid complaint address for that particular IP number where you are sure the spam came from. Meanwhile the spammer has had five days to spam without the spamming host's admin getting any complaints.
To alleviate this problem of bogus (non-working) complaint addresses given by SpamCop and other sources, I've tried to collect complaint addresses that actually work, so whenever I get a new spam from an address block I've already researched, my new complaint will go immediately to the fully working complaint address I found before. But there are many hundreds of IP address blocks that have spammed, and it's just too much work for one person to track down all that information and make sure it's correct. Hence my new project: the Distributed CTW database, whereby each person takes responsibility for maintaining mappings from IP number to complaint address for a range of numbers, typically all numbers whose high-order byte is the same as the volunteer's own ISP's IP number.
For example, my own ISP's IP number starts with 198, so I plan to personally maintain the data for IP numbers that start with 198, here. Notice how it has DownLinks that tell where to complain about various 198.nnn/16 blocks, and in one case there's a DownLink pointing to a sample second-level node which I also plan to maintain. In this case the second-level node isn't very useful because it has entries for only one ISP, but it illustrates how second-level nodes fit into the overall system. (Note, for such cases where only a very few different IP address blocks within a single /16 block have spammed, it seems a bit of a waste to dedicate an entire file to the second-level node, as I did there, so perhaps collecting several micro-nodes into a single file, with an abbreviated format for each, just the DownLinks section for each, would be a better approach. I'm planning to set up a demo of that format sometime soon.)
My CTW-198 toplevel node also has SideLinks which list other /8 blocks for which toplevel nodes currently exist, namely just one that I set up myself to provide an example of how all the toplevel nodes have SideLinks pointing at each other.
Each toplevel node's SideLinks section also has direct CTW addresses for all /8 blocks that have single complaint addresses for the whole block hence don't need any toplevel CTW node to break down their address space into /16 blocks.
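To make the lookup order concrete, here is a rough Python sketch of how a reader might walk these nodes for a given IP number. Everything in it is hypothetical: the dictionaries stand in for the HTML node files, the complaint addresses are placeholder example.* addresses rather than real CTW data, and keying a second-level node by third byte is only my assumption for the sketch, since a second-level node could just as well list arbitrary sub-blocks.

```python
import ipaddress

# Hypothetical entries for illustration only; these are NOT real CTW data.
# Direct entries from a SideLinks section: first byte -> the single
# complaint address that covers the whole /8 block.
WHOLE_BLOCK_CTW = {192: "abuse@example.org"}

# Toplevel nodes: first byte -> DownLinks keyed by second byte (one per /16).
# A DownLink is either a complaint address or a second-level node, which is
# assumed here to be keyed by third byte.
TOPLEVEL_NODES = {
    198: {
        51: "abuse@example.net",
        18: {100: "abuse@example.com"},
    },
}


def complain_to_whom(ip):
    """Return the complaint address for an IP number, or None if unlisted."""
    first, second, third, _ = ipaddress.IPv4Address(ip).packed
    # Whole /8 blocks with one complaint address need no toplevel node.
    if first in WHOLE_BLOCK_CTW:
        return WHOLE_BLOCK_CTW[first]
    downlinks = TOPLEVEL_NODES.get(first)
    if downlinks is None:
        return None  # nobody maintains a node for this first byte yet
    entry = downlinks.get(second)
    if isinstance(entry, dict):
        return entry.get(third)  # follow the DownLink to the second-level node
    return entry


if __name__ == "__main__":
    for ip in ("192.0.2.1", "198.51.100.7", "198.18.100.9", "63.230.24.146"):
        print(ip, "->", complain_to_whom(ip))
```

A result of None corresponds to the case where no volunteer has that block listed yet, which is exactly when you'd e-mail the maintainer of that first byte.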
What am I asking you to do? You volunteer as follows: You send me e-mail wherein you tell me which toplevel node you'd like to maintain, i.e. you tell me the first byte of your IP number (or if that first byte is already taken, another first byte where you have inside contacts) which will be the first byte of the IP numbers for which you maintain the CTW (Complain To Whom) data. After I create a first draft of the HTML file for your node, you copy it to your own Web space, correct any mistakes I made that you can spot quickly, and tell me the URL where you have it online. I'll verify it's accessible from here and that the format hasn't been messed up, and if it's OK then I'll link to it from my CTW-198 node so that others can find your node via mine. After that you just maintain it: add new sources of spam within your first-byte range, replace CTW addresses that were good but stop working (you gotta track down a new working address, upstream if necessary), and accept e-mail from me and from other volunteers asking about any new sources of spam within your range that you don't have listed yet.
What services do I plan to provide to volunteers?