A quick overview of SMTP
SMTP is the Internet protocol used to transfer electronic mail between computers, much like HTTP is the Internet protocol used to transfer web pages between computers. Like HTTP, there has been more than one generation of SMTP; the second generation is called ESMTP (for Extended SMTP), but the differences are not important for this introduction.
This attempts to be a quick overview of SMTP and related concepts, explaining enough of how it works so that the reader can follow reasonable technical discussions.
In SMTP (and in the rest of this discussion), the client is the computer that is sending email, and the server is the computer that is receiving it. Thus we say SMTP clients (or SMTP senders) send email to SMTP servers, although the machines involved may both be servers in the general sense, as when an ISP's mail server sends email to the SMTP servers for utoronto.ca.
The envelope versus the letter
Just like physical letters, SMTP email has two different sets of address information. The envelope headers are like the addresses on the outside of an envelope, and are used by mail transport software to route and deliver the email. The normal headers are part of the mail message and are read and interpreted only by the user and the user's software, just like the address attached to a salutation at the start of a physical letter. Unlike the post office, SMTP usually throws away most of the envelope before it hands the message to the user, so many users are not aware of the envelope headers.
In fact, SMTP never looks at the message headers at all; as far as SMTP is concerned, the email message (headers and all) is just one big blob that it shuttles around. Many SMTP clients are perfectly happy to deliver email with badly broken or entirely nonexistent message headers.
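To make the split concrete, here is a minimal sketch in Python. The `Envelope` class and all of the addresses are made up for illustration; the point is only that the envelope lives outside the message blob and does not have to agree with the headers inside it.

```python
from dataclasses import dataclass
from email import message_from_string


@dataclass
class Envelope:
    """Addresses used by mail transport; not part of the message itself."""
    sender: str            # what the transport treats as the sender
    recipients: list[str]  # where the transport actually delivers


# To SMTP, the message (headers and all) is just one blob of text.
message_blob = (
    "From: alice@example.com\r\n"
    "To: some-list@example.org\r\n"
    "Subject: hello\r\n"
    "\r\n"
    "Body text.\r\n"
)

# The envelope can name entirely different addresses than the headers do.
envelope = Envelope(
    sender="list-bounces@example.org",
    recipients=["bob@example.net", "carol@example.net"],
)

# Only the user's software bothers to parse headers out of the blob.
headers = message_from_string(message_blob)
print("Header To:          ", headers["To"])
print("Envelope recipients:", envelope.recipients)
```

Nothing in the blob mentions bob@example.net or carol@example.net, yet those are the addresses the mail would be delivered to.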
The SMTP protocol
Like many Internet protocols, SMTP operates by sending lines of text back and forth between the client and the server. The client sends commands and eventually the email message, and the server sends back responses to tell the client if the server accepted the command or if something went wrong.
Server responses always come in a special format: three digits, a space (or a dash), and then some free-format text (in error messages, this is usually intended for users to read; otherwise it is generally just noise). If there is a dash after the third digit instead of a space, further lines of response follow; otherwise, this is the last line. The only really important thing about the response is the first digit, like so:
|Code ||Meaning |
|2xx ||everything is fine, go on |
|3xx ||fine so far, send the rest (seen after DATA) |
|4xx ||temporary problem, try again later |
|5xx ||permanent error, give up |
Errors can happen at any time, so at any response a server can send a temporary or a permanent error instead of the go-ahead the client was expecting. A proper client must be able to cope with this, retrying temporary failures (but not too soon or too often) and giving up gracefully on permanent failures. (Tragically, there are improper clients out there in the world.)
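The response format is simple enough to parse in a few lines. The following is a sketch rather than a complete client: `parse_reply` and `next_action` are names invented here, and treating 3xx codes (such as the 354 sent after DATA) as a go-ahead is an assumption added on top of the table above.

```python
def parse_reply(lines):
    """Parse one (possibly multi-line) SMTP reply into (code, text_lines)."""
    code = None
    texts = []
    for line in lines:
        line = line.rstrip("\r\n")
        code = int(line[:3])       # the three digits
        texts.append(line[4:])     # free-format text after the separator
        # A dash after the digits means more lines follow;
        # anything else marks the last line of the reply.
        if line[3:4] != "-":
            break
    return code, texts


def next_action(code):
    """Decide what the client should do, based only on the first digit."""
    first_digit = code // 100
    if first_digit in (2, 3):      # 3xx handling is an assumption, see above
        return "proceed"
    if first_digit == 4:
        return "retry_later"       # temporary problem
    return "give_up"               # 5xx, or anything unexpected


code, texts = parse_reply(["250-mail.example.com\r\n", "250 OK\r\n"])
print(code, texts, next_action(code))
print(next_action(451), next_action(550))
```

A real client would also have to decide how long to wait before each retry and when to stop retrying altogether; that policy is left out here.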
An SMTP conversation between the client and the server goes in stages, each one initiated by the client doing something. A typical conversation will look like:
|Client does: ||Server normally responds with: |
|Connects to the server ||220 Hello there |
|HELO client-hostname ||250 Pleased to meet you |
|MAIL FROM:<Sender address> ||250 OK |
|RCPT TO:<Recipient address> (may be repeated) ||250 OK |
|DATA ||354 Start mail input; end with <CRLF>.<CRLF> |
|Sends the actual email message ||(Nothing, it's waiting for the . that ends the message) |
|. ||250 OK, accepted for delivery |
At this point the email message has been sent. The client can now disconnect with a QUIT command, or it can send another email message by starting with the MAIL FROM step again (optionally sending an RSET command first).
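As a rough illustration of the client's side of that conversation, the sketch below just builds the lines a client would send for one message, in order. It does not open a network connection, and the function name `smtp_commands` is invented for this example. A real client must wait for, and check, the server's response after every step before sending the next one.

```python
def smtp_commands(client_hostname, sender, recipients, message):
    """Return what a client sends, in order, to deliver one message."""
    if not recipients:
        raise ValueError("there has to be at least one recipient")
    commands = [
        f"HELO {client_hostname}",
        f"MAIL FROM:<{sender}>",
    ]
    # One RCPT TO per envelope recipient.
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    commands.append("DATA")
    commands.append(message)   # the message blob, headers and all
    commands.append(".")       # a lone dot ends the message
    commands.append("QUIT")
    return commands


for line in smtp_commands(
    "client.example.com",
    "alice@example.com",
    ["bob@example.net", "carol@example.net"],
    "Subject: hello\r\n\r\nBody text.",
):
    print(line)
```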
The sender address is the email address that will receive email about delivery problems. (Mailing lists change this, but not the From: email header, so that they, and not the people sending to them, get messages about delivery problems.) A special null sender address (MAIL FROM:<>) is used to signal that no one cares and no bounce notifications should be sent. Null senders are used when sending bounce messages themselves, and occasionally in other situations.
There can be multiple recipients of the same message on the same computer. So that the actual email message only has to be transferred once (saving bandwidth), there can be several RCPT TOs for a message. (There has to be at least one, just like there has to be a MAIL FROM.) The client has to keep track of which recipient addresses have problems, if any, and retry them later if necessary.
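The bookkeeping this requires can be sketched as follows. The function name and the dictionary of replies are made up for illustration; the input is assumed to map each recipient to the three-digit code the server gave for its RCPT TO.

```python
def sort_recipients(rcpt_replies):
    """Split recipients by the server's response to each RCPT TO.

    rcpt_replies maps a recipient address to a three-digit reply code.
    Returns (accepted, retry_later, failed) as three lists.
    """
    accepted, retry_later, failed = [], [], []
    for address, code in rcpt_replies.items():
        if 200 <= code < 300:
            accepted.append(address)
        elif 400 <= code < 500:
            retry_later.append(address)   # temporary problem
        else:
            # 5xx; treating any other unexpected code as a failure
            # is an assumption of this sketch.
            failed.append(address)
    return accepted, retry_later, failed


accepted, retry_later, failed = sort_recipients({
    "bob@example.net": 250,
    "carol@example.net": 451,
    "dave@example.net": 550,
})
print("delivered now:", accepted)
print("try again:    ", retry_later)
print("bounce:       ", failed)
```

The message itself is still sent only once, for the accepted recipients; the ones in the retry list need a later conversation of their own.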
The envelope headers are the MAIL FROM and RCPT TO parts of the SMTP conversation. The envelope sender is the MAIL FROM address, and the envelope recipients are the RCPT TO addresses.
The client-hostname, the sender address, and the recipient addresses should all be fully qualified. A fully qualified host or domain name is one that anyone on the Internet could use to look up information, not a shortened name useful only on machines inside an organization; for example, server.example.com instead of just server. A fully qualified email address is an email address with a fully qualified host or domain name, not just an email login; for example, MAIL FROM:<postmaster@server.example.com> instead of MAIL FROM:<postmaster> or MAIL FROM:<postmaster@server>. If the host or domain name is left off an email address, the SMTP server usually has no choice but to interpret it as an address on itself.
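A client can catch the most obvious mistakes before it ever connects. The check below is only a rough heuristic invented for this overview: it assumes a fully qualified name contains at least one dot, which says nothing about whether the name actually exists, and it lets the null sender through as a special case.

```python
def is_fully_qualified(address):
    """Rough check that an envelope address has a fully qualified domain."""
    if address == "":
        return True                    # the null sender, MAIL FROM:<>
    local, at, domain = address.rpartition("@")
    if not at or not local:
        return False                   # no @, or nothing before it
    labels = domain.split(".")
    # Heuristic: at least two non-empty, dot-separated parts.
    return len(labels) >= 2 and all(labels)


for candidate in ["postmaster@server.example.com", "postmaster@server", "postmaster"]:
    print(candidate, is_fully_qualified(candidate))
```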
Email routing, or welcome to DNS
All of this is very well and good, but it doesn't tell us how a client machine with email to send to email@example.com decides which SMTP server to deliver it to. That is decided by looking various pieces of information up in the Domain Name System, DNS, which is another Internet protocol and system.
DNS exists to give out various sorts of information about names; you give it a name and what type of information you want, and it tries to give you back an answer. For our purposes (and simplifying a bit), there are three interesting types of information, conventionally called record types:
- NS records for a domain, which tell you what hostnames can give you further information about that domain and names inside that domain, such as MX records or A records.
- MX records for a name, which tell you what hostnames should accept SMTP email for user@name, and which order you should try them in.
- A records for a hostname, which give you the IP addresses associated with the host.
Reduced to its simplest form, an SMTP client with email to send to user@example.com looks up NS records until it finds the nameservers for example.com, then asks them for the MX record for example.com, and finally asks for A records to determine the IP addresses of the names in the MX record. If a name has no MX record but does have an A record, email is delivered straight to the IP addresses listed.
The phrase host or domain name means a name that has at least one of an A record (an IP address) or an MX record (a place to deliver mail to). Such names are valid as the name to the right of the @ in email addresses. Names with just NS records are not; postmaster@ca is a nonexistent email address, although .ca certainly has NS records.
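The routing rules above can be sketched without touching the network by using a small in-memory table in place of real DNS. Every record in `FAKE_DNS` is hypothetical, as are the function names. The sketch skips the NS step entirely and assumes each MX record carries a preference number, with lower numbers tried first.

```python
# Hypothetical records standing in for real DNS answers.
FAKE_DNS = {
    ("example.com", "MX"): [(20, "mail2.example.com"), (10, "mail1.example.com")],
    ("mail1.example.com", "A"): ["192.0.2.10"],
    ("mail2.example.com", "A"): ["192.0.2.20"],
    ("host.example.net", "A"): ["198.51.100.5"],
    ("ca", "NS"): ["ns1.example.ca"],
}


def lookup(name, record_type):
    """Return the records of one type for a name, or an empty list."""
    return FAKE_DNS.get((name, record_type), [])


def delivery_addresses(domain):
    """Return the IP addresses to try, in order, for mail to user@domain."""
    mx_records = lookup(domain, "MX")
    if mx_records:
        # Try MX hosts in order of preference (lowest number first).
        hosts = [host for _preference, host in sorted(mx_records)]
    else:
        # No MX record: fall back to the name's own A records.
        hosts = [domain]
    addresses = []
    for host in hosts:
        addresses.extend(lookup(host, "A"))
    return addresses


def valid_mail_domain(domain):
    """A name is usable after the @ only if mail can be delivered somewhere."""
    return bool(lookup(domain, "MX") or lookup(domain, "A"))


print(delivery_addresses("example.com"))
print(delivery_addresses("host.example.net"))
print(valid_mail_domain("ca"))
```

In this table, ca has only an NS record, so it fails the check, matching the postmaster@ca example.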