Callback verification

Callback verification, also known as callout verification or Sender Address Verification, is a technique used by SMTP software in order to validate e-mail addresses. The most common target of verification is the sender address from the message envelope (the address specified during the SMTP dialogue as "MAIL FROM"). It is mostly used as an anti-spam measure.

The three hosts involved in an SMTP callout verification. If the address is not forged, the sender and the MX server may coincide.

Purpose

Because a large percentage of e-mail spam is sent with forged sender ("mfrom") addresses[citation needed], some spam can be detected by using this method to check whether the forgery produced an invalid address.

A related technique is "call forwards", in which a secondary or firewall mail exchanger can verify recipients at the primary mail exchanger for the domain in order to decide whether the address is deliverable.

Process

The receiving mail server verifies both parts of the sender address: the domain name (the part after the @ character) and the local part (the part before the @ character). The first step is to establish an SMTP connection to the mail exchanger for the sender address, which is found by looking up the MX records in the domain's DNS zone. The second step is to query the exchanger and make sure that it accepts the address as valid. This is done in the same way as sending an e-mail to the address, except that the process stops once the mail exchanger has accepted or rejected the recipient address. These are the same steps the receiving mail server would take to bounce mail back to the sender, but in this case no mail is sent. The SMTP commands sent out are:

HELO verifier host name
MAIL FROM:<>
RCPT TO:<the address to be tested>
QUIT
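
The dialogue above can be sketched in Python with the standard library's smtplib module. This is a minimal illustration under stated assumptions, not the way any particular MTA implements callouts: the names callout and classify, the connect parameter, and the result labels are hypothetical, and the MX host is assumed to have been resolved already (the Python standard library has no MX lookup).

```python
import smtplib


def classify(code):
    """Map an SMTP reply code to a coarse verdict (labels are illustrative)."""
    if 200 <= code < 300:
        return "accepted"
    if 400 <= code < 500:
        return "deferred"   # temporary failure
    return "rejected"       # 5xx, or anything unexpected


def callout(address, mx_host, verifier_host, connect=smtplib.SMTP, timeout=30):
    """Probe mx_host for address without ever sending message data.

    Returns (stage, verdict), where stage is the command that produced the
    final reply. mx_host is assumed to come from an MX lookup done elsewhere;
    connect is a hypothetical hook so a fake client can be substituted.
    """
    smtp = connect(mx_host, 25, timeout=timeout)
    try:
        code, _ = smtp.helo(verifier_host)   # HELO verifier host name
        if classify(code) != "accepted":
            return ("HELO", classify(code))
        code, _ = smtp.mail("")              # MAIL FROM:<> (null sender)
        if classify(code) != "accepted":
            return ("MAIL", classify(code))
        code, _ = smtp.rcpt(address)         # RCPT TO:<address under test>
        return ("RCPT", classify(code))
    finally:
        try:
            smtp.quit()                      # QUIT; DATA is never issued
        except smtplib.SMTPException:
            pass                             # connection already gone
```

Returning the stage alongside the verdict keeps a rejection of the null sender at MAIL FROM distinguishable from a rejection of the address itself at RCPT TO.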

Alternatively, the MAIL FROM and RCPT TO commands can be replaced by the VRFY command. However, servers are not required to support VRFY, and it is usually disabled in modern MTAs.

Both of these techniques are technically compliant with the relevant SMTP RFC (RFC 5321). However, RFC 2505 (a Best Current Practice) recommends disabling the VRFY command by default to prevent directory harvest attacks. (One widespread interpretation holds that the MAIL FROM/RCPT TO pair of commands should therefore respond in the same way, but the RFCs do not state this.)
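
For comparison, a VRFY-based check could be sketched as follows, using the verify() method of Python's smtplib.SMTP, which issues the VRFY command. The function names, the connect parameter, and the result labels are illustrative assumptions; the reply codes are those RFC 5321 defines for VRFY. A server with VRFY disabled will typically answer with a code such as 252 or 502, which carries no information about the address.

```python
import smtplib


def interpret_vrfy(code):
    """Interpret a VRFY reply code (labels are illustrative)."""
    if code in (250, 251):
        return "valid"
    if code == 252:
        # Server cannot verify the user but would attempt delivery.
        return "unknown"
    if code in (500, 502):
        # Command not recognized or not implemented.
        return "unsupported"
    if 400 <= code < 500:
        return "deferred"
    return "rejected"


def vrfy_check(address, mx_host, verifier_host, connect=smtplib.SMTP, timeout=30):
    """Ask mx_host about address with VRFY instead of MAIL FROM/RCPT TO.

    connect is a hypothetical hook so a fake client can be substituted.
    """
    smtp = connect(mx_host, 25, timeout=timeout)
    try:
        smtp.helo(verifier_host)          # open the session first
        code, _ = smtp.verify(address)    # sends VRFY
        return interpret_vrfy(code)
    finally:
        try:
            smtp.quit()
        except smtplib.SMTPException:
            pass                          # connection already gone
```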

Disadvantages

Callback verification is abusive[1] and can lead to blocklisting of the server, the domain or the IP block.

If the administrator of the mail server has disabled VRFY, then using RCPT TO to circumvent that restriction is unauthorized access and an attempt to evade access controls. As such, the operator performing the callbacks may be exposed to civil or criminal prosecution, depending on the jurisdiction.

Limitations

The documentation for both Postfix and Exim cautions against the use[2][3] of this technique and mentions many limitations of SMTP callbacks. In particular, there are many situations in which it is either ineffective or causes problems for the systems that receive the callbacks.

  • Some regular mail exchangers do not give useful results to callbacks:
    • Servers that reject all bounce mails (contrary to RFC 1123, a part of STD 3[4]). To work around this problem, Postfix, for example, uses either the local postmaster address or an address of "double-bounce" in the MAIL FROM part of the callout. This workaround, however, has two problems: first, it can cause a verification loop; second, it will fail if Bounce Address Tag Validation is used to reduce backscatter.[1] The workaround should therefore not be used. Callback verification can still work if all bounces are rejected at the DATA stage instead of the earlier MAIL FROM stage, while invalid e-mail addresses are still rejected at the RCPT TO stage rather than also being moved to the DATA stage.[2][3]
    • Servers that accept all e-mail addresses at the RCPT TO stage but reject invalid ones at the DATA stage. This is commonly done to prevent directory harvest attacks and will, by design, give no information about whether an e-mail address is valid, which prevents callback verification from working.[2][1]
    • Servers that accept all mail during the SMTP dialogue (and generate their own bounces later).[2] This problem can be alleviated by testing a random non-existent address as well as the desired address (if the non-existent address is accepted, further verification is useless).
    • Servers that implement catch-all e-mail will, by definition, consider all e-mail addresses to be valid and accept them. As with accept-then-bounce systems, testing a random non-existent address can detect this.
  • The callback process can cause delays in delivery because the mail server where an address is verified may use slow anti-spam techniques, including "greet delays" (causing a connection delay) and greylisting (causing a verification deferral).[2]
  • If the system being called back to uses greylisting, the callback may return no useful information until the greylisting time has expired. Greylisting works by returning a "temporary failure" (a 4xx response code) when it sees an unfamiliar MAIL FROM/RCPT TO pair of e-mail addresses. A greylisting system may not give a "permanent failure" (a 5xx response code) when given an invalid e-mail address for the RCPT TO, and may instead continue to return a 4xx response code.[5]
  • Some e-mail may be legitimate but not have a valid "envelope from" address due to user error or just misconfiguration. The positive aspect is that the verification process will usually cause an outright rejection, so if the sender was not a spammer but a real user, they will be notified of the problem.
  • If a server receives a lot of spam, it may do a lot of callbacks. If those addresses are invalid or are spamtraps, the server will look very similar to a spammer performing a dictionary attack to harvest addresses. This in turn might get the server blacklisted elsewhere.[2][1][6]
  • Some administrators consider any callback verification to be unsolicited bulk e-mail (UBE), and may block the originating SMTP client, report it as spam, or add it to DNSBLs, even if the backscatter is of low volume.
  • Every callback places an unasked-for burden on the system being called back to, with very few effective ways for that system to avoid the burden. In extreme cases, if a spammer abuses the same sender address and uses it at a sufficiently diverse set of receiving MXs, all of which use this method, they might all try the callback, overloading the MX for the forged address with requests (effectively a distributed denial-of-service attack).[1]
  • Callback verification has no effect if spammers spoof real email addresses[1][7] or use the null bounce address.
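
Several of the limitations listed above (accept-all servers, catch-all domains, greylisting) come down to how the RCPT TO reply codes are interpreted. One possible decision rule, combining the reply for the address under test with the reply for a random non-existent address at the same domain, is sketched below. The function names, the probe address format, and the result labels are illustrative assumptions and are not taken from any MTA.

```python
import secrets


def random_probe_address(domain):
    """Build an address that is very unlikely to exist at domain."""
    return f"callout-probe-{secrets.token_hex(8)}@{domain}"


def decide(target_code, random_code):
    """Combine two RCPT TO reply codes into a verdict.

    target_code: reply for the address being verified.
    random_code: reply for a random, presumably non-existent address.
    """
    if 500 <= target_code < 600:
        # Permanent failure for the real address is informative on its own.
        return "invalid"
    if 400 <= target_code < 500:
        # Temporary failure, possibly greylisting; retry later.
        return "deferred"
    if 200 <= target_code < 300:
        if 200 <= random_code < 300:
            # The server also accepts a non-existent address (catch-all or
            # accept-then-bounce), so acceptance tells us nothing.
            return "inconclusive"
        return "valid"
    return "inconclusive"  # unexpected reply class
```

For example, decide(250, 250) yields "inconclusive", whereas decide(250, 550) yields "valid".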

Some of these problems are caused by originating systems violating or stretching the limits of the RFCs; verification merely reflects these problems back to the senders. Examples include unintentionally used invalid addresses, rejection of the null sender, and greylisting (where, for example, the delay caused by the verifying recipient is closely related to the delay caused by the originator). In many cases this in turn helps the originating system to detect and fix such problems (for example, unintentionally being unable to receive valid bounces).

Several of the above problems are reduced by caching verification results. In particular, systems that give no useful information (because they do not reject at RCPT TO time, have catch-all e-mail, etc.) can be remembered so that no future callbacks to those systems need to be made. Results (positive or negative) for specific e-mail addresses can also be remembered. MTAs like Exim have caching built in.[3]
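
A result cache of this kind can be sketched in a few lines of Python. This is only an illustration of the idea and does not reflect how Exim or any other MTA stores its cache: the class name, method names, result labels, and the one-day default lifetime are all assumptions.

```python
import time


class CalloutCache:
    """Remember per-address verdicts and domains that give no useful answer."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.monotonic):
        self._ttl = ttl_seconds       # assumed lifetime of an entry
        self._clock = clock           # injectable clock, for testing
        self._addresses = {}          # address -> (verdict, expiry)
        self._uninformative = {}      # domain -> expiry

    @staticmethod
    def _domain(address):
        return address.rpartition("@")[2].lower()

    def store(self, address, verdict):
        """Remember a positive or negative result for one address."""
        self._addresses[address] = (verdict, self._clock() + self._ttl)

    def mark_uninformative(self, domain):
        """Remember that callouts to this domain give no useful information."""
        self._uninformative[domain.lower()] = self._clock() + self._ttl

    def lookup(self, address):
        """Return a cached verdict, or None if a fresh callout is needed."""
        now = self._clock()
        expiry = self._uninformative.get(self._domain(address))
        if expiry is not None and expiry > now:
            return "inconclusive"
        entry = self._addresses.get(address)
        if entry is not None and entry[1] > now:
            return entry[0]
        return None
```

A caller would consult lookup() before opening any SMTP connection and perform a callout only when it returns None.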

References