Email

From Citizendium
This article is developing and not approved.

Electronic mail (abbreviated "e-mail", "E-mail", or "email") is a method of composing, sending, storing, and receiving messages over electronic communication systems. The term "email" (as a noun or verb) applies both to the Internet email system based on the Simple Mail Transfer Protocol (SMTP) and to intranet systems on an internal network allowing users within one organization to email each other.

How Internet e-mail works

The process begins when the user composes a message using her mail user agent (MUA), either by initiating a new message or replying to one she has received from another user.

  1. She types in the e-mail address of the recipient(s). The format used for e-mail addresses is known as a Fully Qualified Domain Address (FQDA) and is structured as user@domain.ext, where "user" is the username of the sender or recipient, "domain" is the domain name where the user's mailbox is hosted, and "ext" is that domain's extension (e.g., .com or .org). For the purpose of this example, assume that the sender's e-mail address is user1@domain1.ext and the recipient's e-mail address is user2@domain2.ext.
  2. She then types in an optional subject line, which is a summary or headline of the contents of the e-mail.
  3. Upon writing the main body of the e-mail and adding optional attachments, the user clicks "send".
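
The composition steps above can be sketched with Python's standard email package. This is only an illustration of what an MUA might assemble before sending: the helper names compose and split_address, the subject, and the body text are invented for the example, while the addresses are the ones assumed in the text.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def compose(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text message roughly as an MUA would (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject   # optional summary of the contents
    msg.set_content(body)      # main body of the message
    return msg


def split_address(address: str) -> tuple[str, str]:
    """Split user@domain.ext into its user part and its domain part."""
    _name, addr = parseaddr(address)
    user, _at, domain = addr.rpartition("@")
    return user, domain


msg = compose("user1@domain1.ext", "user2@domain2.ext",
              "Lunch?", "Are you free at noon?")
print(split_address(msg["To"]))  # ('user2', 'domain2.ext')
```

The domain part returned by split_address is the piece a mail server would later look up to decide where the message goes.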

Afterwards, the MUA processes the information without user intervention as follows:

  1. Her MUA formats the message in Internet e-mail format and uses the Simple Mail Transfer Protocol (SMTP) to send the message to the local mail transfer agent (MTA), in this example smtp.domain1.ext, run by the user's Internet Service Provider (ISP).
  2. The MTA looks at the destination address provided in the SMTP exchange (not the one in the message header), here user2@domain2.ext. It looks up this domain name in the Domain Name System (DNS) to find the mail exchange servers accepting messages for that domain.
  3. The sender's DNS resolver issues a query, and the DNS server for domain2.ext responds with an MX record listing the mail exchange servers for that domain, in this case mx.domain2.ext, a server run by the recipient's ISP.
  4. smtp.domain1.ext sends the message to mx.domain2.ext using SMTP, and mx.domain2.ext delivers it to the recipient's mailbox.
  5. The recipient presses the "get mail" button in his MUA, which picks up the message using the Post Office Protocol (POP3).
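
The delivery steps above can be mimicked with a toy model in which plain dictionaries stand in for DNS and for the receiving server's storage. Nothing here touches the network; MX_TABLE, MAILBOXES, and the three function names are invented for this sketch, and real MTAs do considerably more (queuing, retries, error replies).

```python
# Toy model only: dictionaries stand in for DNS and for server-side mailboxes.
MX_TABLE = {"domain2.ext": "mx.domain2.ext"}      # domain -> mail exchange host
MAILBOXES = {"mx.domain2.ext": {"user2": []}}     # host -> user -> stored messages


def lookup_mx(domain):
    """Steps 2 and 3: ask the stand-in DNS which host accepts mail for a domain."""
    return MX_TABLE[domain]


def relay(envelope_to, message):
    """Step 4: route by the envelope address, not by anything in the header."""
    user, _at, domain = envelope_to.rpartition("@")
    host = lookup_mx(domain)
    MAILBOXES[host][user].append(message)
    return host


def fetch(host, user):
    """Step 5: the recipient's MUA collects and removes the waiting messages."""
    waiting = MAILBOXES[host][user]
    MAILBOXES[host][user] = []
    return waiting


host = relay("user2@domain2.ext", "Subject: Lunch?\n\nAre you free at noon?")
print(host)                       # mx.domain2.ext
print(len(fetch(host, "user2")))  # 1
```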

Other usage and variants

The above sequence of events applies to the majority of e-mail users. However, there are many alternative possibilities and complications to the e-mail system:

  • One of the users may use a client connected to a corporate e-mail system, such as IBM's Lotus Notes or Microsoft's Exchange. These systems often have their own internal e-mail format and their clients typically communicate with the e-mail server using a vendor-specific, proprietary protocol. The server sends or receives e-mail via the Internet through the product's Internet mail gateway which also does any necessary reformatting.
  • In the above example, if both users work for the same company, the entire transaction may happen completely within a single corporate e-mail system. This eliminates or simplifies steps 2 and 3 of the above example.
  • In some proprietary e-mail systems, such as America Online's (AOL) e-mail system, users who both use the same e-mail system do not need to enter the full address, only the other user's user name.
  • One or both users may have no MUA on their own computer and may instead connect to a webmail service.
  • The sender's computer may run its own MTA, thereby avoiding the SMTP transfer at step 1.
  • The recipient may pick up his e-mail in many ways, for example using the Internet Message Access Protocol, by logging into mx.domain2.ext and reading it directly, or by using a webmail service.
  • Domains usually have several mail exchange servers so that they can continue to accept mail when the main mail exchange server is not available.
  • E-mails are not secure if e-mail encryption is not used correctly.
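
The point about several mail exchange servers can be illustrated with a small selection routine. It assumes the usual DNS convention that each MX record carries a preference number and that lower numbers are tried first; the host mx2.domain2.ext, the preference values, and the is_reachable callback are hypothetical.

```python
def choose_exchange(mx_records, is_reachable):
    """Return the first reachable exchange, trying the lowest preference first.

    mx_records is a list of (preference, host) pairs; is_reachable is any
    callable that reports whether a host currently accepts connections.
    """
    for _preference, host in sorted(mx_records):
        if is_reachable(host):
            return host
    return None  # nothing reachable: a real MTA would queue the message and retry


records = [(20, "mx2.domain2.ext"), (10, "mx.domain2.ext")]
down = {"mx.domain2.ext"}  # pretend the main exchange is unavailable
print(choose_exchange(records, lambda h: h not in down))  # mx2.domain2.ext
```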

Open mail relays

It used to be the case that many MTAs would accept messages for any recipient on the Internet and do their best to deliver them. Such MTAs are called open mail relays. This was important in the early days of the Internet when network connections were unreliable. If an MTA couldn't reach the destination, it could at least deliver it to a relay that was closer to the destination. The relay would have a better chance of delivering the message at a later time. However, this mechanism proved to be exploitable by people sending unsolicited bulk e-mail and as a consequence very few modern MTAs are open mail relays, and many MTAs will not accept messages from open mail relays because such messages are very likely to be spam.
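
The difference between an open relay and a typical modern MTA comes down to an acceptance rule, which might be sketched as follows. The LOCAL_DOMAINS set and the accepts function are invented for illustration, and real servers apply many more checks than this.

```python
LOCAL_DOMAINS = {"domain1.ext"}  # domains this MTA is responsible for (hypothetical)


def accepts(recipient, client_is_authenticated, open_relay=False):
    """Decide whether to accept a message for the given recipient address."""
    if open_relay:
        return True  # an open relay forwards mail for anyone, to anyone
    domain = recipient.rpartition("@")[2].lower()
    # A closed MTA accepts mail addressed to its own domains, or mail
    # submitted by a client it has authenticated.
    return domain in LOCAL_DOMAINS or client_is_authenticated


print(accepts("user2@domain2.ext", client_is_authenticated=False))                   # False
print(accepts("user2@domain2.ext", client_is_authenticated=False, open_relay=True))  # True
```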

Plain Text/HTML MUAs

Both plain text and HTML are used by MUAs to compose and display e-mail. While text is certain to be read by all users without problems, many users feel that HTML-based e-mail has a higher aesthetic value, due to allowing users more freedom to format their e-mail for appearance. Advantages of HTML include the ability to implement inline links and images, set apart previous messages in block quotes, wrap naturally on any display, use emphasis such as underlines and italics, and change font styles. HTML e-mails often include an automatically-generated plain text copy as well, for compatibility reasons.
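
As a rough illustration of an HTML message that carries a plain text copy, Python's standard email package can build both versions into one message. The subject and body text are invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Weekly update"

# Plain text version, readable by any MUA.
msg.set_content("Status: all tasks are on schedule.")

# HTML version of the same content, offered as an alternative.
msg.add_alternative(
    "<p>Status: all tasks are <em>on schedule</em>.</p>", subtype="html"
)

print(msg.get_content_type())                                  # multipart/alternative
print([part.get_content_type() for part in msg.iter_parts()])  # ['text/plain', 'text/html']
```

An MUA that cannot or will not render HTML can display the text/plain part and ignore the other.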

Internet e-mail format

The format of Internet e-mail messages is defined in RFC 2822 and a series of RFCs, RFC 2045 through RFC 2049, collectively called Multipurpose Internet Mail Extensions (MIME). Although as of 13 July 2005 (see [1]) RFC 2822 is technically a proposed IETF standard and the MIME RFCs are draft IETF standards, these documents are the de facto standards for the format of Internet e-mail. Prior to the introduction of RFC 2822 in 2001 the format described by RFC 822 was the de facto standard for Internet e-mail for nearly two decades; it is still the official IETF standard. The IETF reserved the numbers 2821 and 2822 for the updated versions of RFC 821 (SMTP) and RFC 822, honoring the extreme importance of these two RFCs. RFC 822 was published in 1982 and based on the earlier RFC 733.

Anatomy of an e-mail message

E-mail messages contain several parts, indicated by separate text boxes when composing an e-mail, or by line breaks in the contents of the e-mail.

Each header field has a name and a value. RFC 2822 specifies the precise syntax. Informally, the field name starts in the first character of a line, followed by a ":", followed by the value which is continued on non-null subsequent lines that have a space or tab as their first character. Field names and values are restricted to 7-bit ASCII characters. Non-ASCII values may be represented using MIME encoded words.
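
The informal header syntax described above can be sketched as a small parser. This is a simplification rather than a full RFC 2822 implementation: it joins each continuation line to the previous field with a single space, and it relies on Python's email.header module to decode MIME encoded words. The sample header text and the address user3@domain2.ext are invented.

```python
from email.header import decode_header, make_header


def parse_header_block(raw: str) -> list[tuple[str, str]]:
    """Split a header block into (name, value) pairs, joining continuation lines."""
    fields = []
    for line in raw.splitlines():
        if not line:
            break  # a blank line ends the header section
        if line[0] in " \t" and fields:
            # Continuation line: it belongs to the previous field's value.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            name, _colon, value = line.partition(":")
            fields.append((name, value.strip()))
    return fields


def decode_words(value: str) -> str:
    """Turn MIME encoded words back into readable text."""
    return str(make_header(decode_header(value)))


raw = (
    "Subject: =?utf-8?q?Gr=C3=BC=C3=9Fe?=\n"
    "To: user2@domain2.ext,\n"
    "\tuser3@domain2.ext\n"
    "\n"
    "Body starts here.\n"
)
fields = parse_header_block(raw)
print(fields[1])                   # ('To', 'user2@domain2.ext, user3@domain2.ext')
print(decode_words(fields[0][1]))  # Grüße
```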

Note that the "To" field in the header is not necessarily related to the addresses to which the message is delivered. The actual delivery list is supplied in the SMTP protocol, not extracted from the header content. The "To" field is similar to the greeting at the top of a conventional letter, which is delivered according to the address on the outer envelope. Also note that the "From" field does not have to name the real sender of the e-mail message. It is very easy to fake the "From" field and make a message seem to come from any mail address. It is possible to digitally sign e-mail, and a signature is much harder to fake. Some Internet service providers do not relay e-mail claiming to come from a domain not hosted by them, but very few (if any) check that the person or even the e-mail address named in the "From" field is the one associated with the connection. Some Internet service providers apply e-mail authentication systems to e-mail sent through their MTA so that other MTAs can detect forged spam that appears to come from them.

IANA also maintains a list of standard header fields.

To/From Lines

Includes the e-mail addresses, and optionally the registered names, of both the sender and recipient of the message, separated by a single line break.

Subject Line

A brief summary of the contents of the message. The Subject line appears next to the name of the sender in the recipient's inbox, prior to opening the message, and is written in plaintext.

Date

The local time and date when the message was originally sent.
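
As an illustration, Python's email.utils module can produce and parse the date format used in this field. The timestamp and the UTC offset of minus four hours are arbitrary values chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Sender's local time, with an explicit offset from UTC (arbitrary example values).
sent = datetime(2005, 7, 13, 9, 30, tzinfo=timezone(timedelta(hours=-4)))

date_field = format_datetime(sent)  # text suitable for a Date header field
print(date_field)

# A receiving MUA can recover the same instant and show it in another time zone.
parsed = parsedate_to_datetime(date_field)
print(parsed.astimezone(timezone.utc).isoformat())
```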

Carbon copy (CC)

When the e-mail is intended for a specific recipient or recipients, but the sender wishes that the contents of the e-mail be made known to another user (e.g., the sender's boss in a work environment), the e-mail can be "CCed" to the user[s].

Blind carbon copy (BCC)

Similar to carbon copy above, but where the sender does not wish the recipient to know the identity of user[s] whose addresses are included in the BCC text box.

Many e-mail clients present "Bcc" (blind carbon copy, recipients not visible in the "To" field) as a header field. Implementations handle the "Bcc" field differently: at times the entire field is removed, and at times the field remains but the addresses in it are removed. Addresses added as "Bcc" are only added to the SMTP delivery list and are not included in the message data. Interpretations of RFC 2822 differ on this point.
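
One of the Bcc behaviours described above, removing the whole field, might look like the following sketch: the delivery list is gathered from the To, Cc, and Bcc fields, and the Bcc field is then deleted before the message data is serialized. The function name split_envelope and the addresses boss@domain1.ext and auditor@domain1.ext are invented for the example.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def split_envelope(msg):
    """Return (delivery list, message text), dropping Bcc from the message text."""
    fields = []
    for name in ("To", "Cc", "Bcc"):
        fields.extend(str(value) for value in msg.get_all(name, []))
    recipients = [addr for _name, addr in getaddresses(fields)]
    del msg["Bcc"]  # the blind recipients stay on the delivery list only
    return recipients, msg.as_string()


msg = EmailMessage()
msg["From"] = "user1@domain1.ext"
msg["To"] = "user2@domain2.ext"
msg["Cc"] = "boss@domain1.ext"
msg["Bcc"] = "auditor@domain1.ext"
msg["Subject"] = "Quarterly report"
msg.set_content("Report attached.")

recipients, wire_text = split_envelope(msg)
print(recipients)
print("auditor@domain1.ext" in wire_text)  # False
```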

Received

Tracking information generated by mail servers that have previously handled a message.

Content-Type

Information about how the message has to be displayed, usually a MIME type.
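
A receiving program might read this field as in the short sketch below, which uses Python's standard email package; the sample message is invented.

```python
from email import message_from_string

raw = 'Content-Type: text/html; charset="utf-8"\n\n<p>hello</p>\n'
msg = message_from_string(raw)

print(msg.get_content_type())      # text/html
print(msg.get_content_maintype())  # text
print(msg.get_content_charset())   # utf-8
```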

Body

The main bulk of the e-mail message, either displayed as unstructured text or formatted by the user.

Signature

A block of text or image that is included in all e-mails from the sender. Many MUAs allow for a signature to be included in the user's preferences, while other users manually input their signature into the body text of each e-mail.

Saved message filename extension

Most, but not all, e-mail clients save individual messages as separate files, or allow users to do so. Different applications save e-mail files with different filename extensions.

.eml
This is the default e-mail extension for Mozilla Thunderbird and is used by Microsoft Outlook Express.
.emlx
Used by Apple Mail.
.msg
Used by Microsoft Office Outlook.


Origins of e-mail

E-mail predates the inception of the Internet, and was in fact a crucial tool in creating the Internet. MIT first demonstrated the Compatible Time-Sharing System (CTSS) in 1961.[1] It allowed multiple users to log into the IBM 7094[2] from remote dial-up terminals, and to store files online on disk. This new ability encouraged users to share information in new ways. E-mail started in 1965 as a way for multiple users of a time-sharing mainframe computer to communicate. Although the exact history is murky, among the first systems to have such a facility were SDC's Q32 and MIT's CTSS.

E-mail was quickly extended to become network e-mail, allowing users to pass messages between different computers. The messages could be transferred between users on different computers by at least 1966 (it is possible the SAGE system had something similar some time before).

The ARPANET computer network made a large contribution to the development of e-mail. There is one report [2] which indicates experimental inter-system e-mail transfers on it shortly after its creation, in 1969. Ray Tomlinson initiated the use of the @ sign to separate the names of the user and their machine in 1971 [3]. The ARPANET significantly increased the popularity of e-mail, and it became the killer app of the ARPANET.

E-mail content encoding

E-mail was originally designed for 7-bit ASCII. Much e-mail software is 8-bit clean but must assume it will be communicating with 7-bit servers and mail readers. The MIME standard introduced charset specifiers and two content transfer encodings to encode 8-bit data for transmission: quoted-printable for mostly 7-bit content with a few characters outside that range, and base64 for arbitrary binary data. The 8BITMIME extension was introduced to allow transmission of mail without the need for these encodings, but many mail transport agents still don't support it fully. For international character sets, Unicode is growing in popularity.
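
The two content transfer encodings can be tried out with Python's standard quopri and base64 modules. The sample text and the binary data are arbitrary.

```python
import base64
import quopri

# Mostly ASCII text with one non-ASCII character: suited to quoted-printable.
text = "Café au lait".encode("utf-8")
qp = quopri.encodestring(text)
print(qp)  # the two bytes of "é" appear as =C3=A9, the rest stays readable

# Arbitrary binary data: suited to base64.
binary = bytes(range(256))
b64 = base64.b64encode(binary)
print(b64[:16])

# Both encodings produce 7-bit output and are reversible.
print(quopri.decodestring(qp) == text, base64.b64decode(b64) == binary)  # True True
```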

Common Problems in Modern E-mail

Despite its usefulness, many problems exist in modern applications of e-mail. Hackers and network engineers endeavor to stay abreast of security flaws and bugs in the different protocols and software that govern the exchange of e-mail.

Spam

E-mail spam is unsolicited commercial e-mail advertisement. Because of the very low cost of sending e-mail, spammers can send hundreds of millions of e-mail messages each day over an inexpensive Internet connection. Hundreds of active spammers sending this volume of mail results in information overload for many computer users who receive tens or even hundreds of junk messages each day.

A number of anti-spam techniques mitigate the impact of spam. In the United States, Congress has passed a law, the CAN-SPAM Act of 2003, attempting to regulate such e-mail. Australia also has very strict laws restricting the sending of spam from an Australian ISP (http://www.aph.gov.au/library/pubs/bd/2003-04/04bd045.pdf), but their impact has been minimal since most spam comes from regimes that seem reluctant to regulate the sending of spam.

E-mail worms

E-mail worms use e-mail as a way of replicating themselves into vulnerable computers. Although the first e-mail worm affected UNIX computers, the problem is most common today on the more popular Microsoft Windows operating system.

Phishing

Phishing is an attempt by a sender (a "phisher") to impersonate an official figure via e-mail in order to trick the reader into providing sensitive information, such as bank account details or passwords.

E-mail scams

E-mail scams are messages designed to trick the reader into believing they will legally receive a large sum of money in exchange for their financial participation in some simple exercise. The most common form is known as the Nigerian e-mail scam, since most of the writers claim to be important figures in Nigeria.

Privacy problems regarding e-mail

E-mail privacy, without some security precautions, can be compromised because:

  • E-mail messages are generally not encrypted.
  • E-mail messages have to go through intermediate computers before reaching their destination, meaning it is relatively easy for others to intercept and read messages.
  • Many Internet Service Providers (ISP) store copies of e-mail messages on their mail servers before they are delivered. The backups of these can remain up to several months on their server, even if the user deletes them from his mailbox.
  • The headers and other information in the e-mail can often identify the sender and/or recipient, preventing anonymous communication.
  • E-mail passwords might be intercepted during sign-in.

There are cryptography applications that can serve as a remedy to one or more of the above. Some such applications can be used to encrypt traffic from the user's machine to a safer network, while other applications (notably PGP) can be used for end-to-end message encryption, and still others can be used to encrypt communications for a single mail hop between the SMTP client and the SMTP server. Encrypted authentication schemes such as SASL can help prevent password interception.
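
Encryption of a single hop between the SMTP client and the SMTP server might be sketched with Python's smtplib as follows. The function name submit_over_tls, the default port 587, and the use of a username and password are assumptions made for this example; the function is only defined here, not called, since calling it needs a real server and account.

```python
import smtplib
import ssl


def submit_over_tls(msg, host, user, password, port=587):
    """Send msg to host, encrypting this one hop (hypothetical helper)."""
    context = ssl.create_default_context()  # verifies the server's certificate
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls(context=context)  # upgrade the connection before anything sensitive
        smtp.login(user, password)      # credentials travel inside the encrypted channel
        smtp.send_message(msg)
```

This protects the message and the password only between the client and that one server. Later hops, and the copy stored on the server, are not covered, which is the gap that end-to-end tools such as PGP address.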

Flaming

Many observers bemoan the rise of flaming in written communications. Flaming occurs when one user, usually upset at another user, e-mails the second user an angry or antagonistic message. Flaming is assumed to be more common today because of the ease and impersonality of e-mail communications: confrontations in person or over the phone are often personal enough to encourage the participants to "hold their tongue," and typing an unhappy message to another person is far easier than seeking that person out and confronting them directly.



References

Notes

  1. "CTSS, Compatible Time-Sharing System" (September 4, 2006), University of South Alabama, web: USA-CTSS.
  2. Tom Van Vleck, "The IBM 7094 and CTSS" (September 10, 2004), Multicians.org (Multics), web: Multicians-7094.
