Trust in Email Begins with Authentication
Issued by the
Messaging Anti-Abuse Working Group (MAAWG)
June 2008
Edited by Dave Crocker, Brandenburg ...
The Internet’s growth allows us to interact with people all over the world. Unfortunately, some of
those people do not make good neighbors. Along with the effort to detect and filter the problematic
traffic they generate, there is a complementary effort to identify trustworthy participants. In security
technology parlance, the first seeks to identify Bad Actors, whereas the second creates ways of distinguishing Good Actors. At its simplest, identifying Good Actors can be divided into two activities: a safe means of identifying a participant, such as an author or an operator of an email service, and then a useful means of assessing their trustworthiness. The first activity is called authentication and the second is usually called reputation assessment. This white paper considers the first step: authenticating the identity that asserts responsibility for an email. It reviews recent developments in standardized authentication mechanisms that have been tailored for use in email anti-abuse efforts.
This white paper provides background on authentication as a foundation for understanding current efforts to protect Internet mail. It then looks at the most popular mechanisms currently in use. The paper is intended for a general readership that has basic familiarity with Internet mail service. While this single document is unlikely to be the final word on the topic, MAAWG has striven to capture the current best practices and leading theories regarding email authentication. As a complement to enabling identification of Good Actors, authentication is expected to help protect business brands from forgery and phishing attacks. The Executive Summary provides a one-page overview that can be used independently.
The Promise of Authentication
Sender Policy Framework (SPF)
Sender IDentification Framework (Sender ID)
DomainKeys Identified Mail (DKIM)
MAAWG
Messaging Anti-Abuse Working Group
P.O. Box 29920, San Francisco, CA 94129-0920
www.MAAWG.org info@MAAWG.org
Executive Summary
The disruptive impact of spam and other email abuse has generated two lines of response by the email services industry. One focuses on detecting and filtering problem messages. A complementary, but quite different, response seeks a basis for trusting a message rather than for mistrusting it. Is someone trustworthy responsible for the message? This approach has three steps: 1) assign a person or organization’s name to the message – identification, 2) then verify that the use of this identifier is authorized and correct – authentication, and 3) finally determine the trustworthiness of the identity – reputation assessment. This paper discusses current industry efforts to satisfy the first two requirements.
Internet mail is extremely flexible in the types of identities that can be referenced. Most recipients only know about the From: header field, which contains a displayable name of the message author and their email address. However, email can also separately name the agent that posted the message for sending, the agent that receives return handling notices, or agents that handle the message at different stages of transmission.
Each of these is often referred to as a sender of the message, so the term sender can be ambiguous.
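The ambiguity of the term sender can be illustrated with a small sketch using Python's standard email package. The sample message, its addresses and the role labels are hypothetical and chosen only for illustration. The return address is shown here as a Return-Path: header field, which is where it is typically recorded at delivery.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message; every address below is an illustrative assumption.
RAW_MESSAGE = """\
Return-Path: <bounces@lists.example.net>
Received: from relay.example.net by mx.example.org; Mon, 2 Jun 2008 10:00:00 +0000
From: Alice Author <alice@example.com>
Sender: assistant@example.com
To: bob@example.org
Subject: Quarterly report

Hello Bob.
"""

# Role labels are ours; the header field names are standard.
ROLE_HEADERS = {
    "author": "From",
    "posting agent": "Sender",
    "return address": "Return-Path",
}


def sender_identities(raw):
    """Return the address found for each role that could be called 'the sender'."""
    msg = message_from_string(raw)
    found = {}
    for role, header in ROLE_HEADERS.items():
        value = msg.get(header)
        if value is not None:
            # parseaddr splits a display name from the address itself.
            found[role] = parseaddr(value)[1]
    return found


if __name__ == "__main__":
    for role, address in sender_identities(RAW_MESSAGE).items():
        print(f"{role}: {address}")
```

In this sample, three different addresses in three different domains could each be described as the sender, which is why an authentication mechanism has to state which identifier it validates.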
When authentication mechanisms are applied, both the originating and receiving systems are able to correctly and reliably validate who is accountable for the message. This is generally described as “knowing who the message came from.” When an authentication mechanism becomes widely popular, it opens the door to a variety of assessment products and services that can rely on it.
Three candidates have emerged for authenticating who is accountable for a message while it is in transit.
They cover two technical paradigms, use different identifiers, and have different limitations and flexibilities:
Sender Policy Framework (SPF) and Sender Identification Framework (Sender ID) use a path registration approach, whereas DomainKeys Identified Mail (DKIM) uses digital, cryptographic signing. A simple way to distinguish these two paradigms is that path registration builds authentication into the email-handling infrastructure along the path that a message travels, whereas cryptographic signing wraps authentication information into the message itself and is independent of the infrastructure. Other authentication technologies are used for email, but they do not deal with email transit accountability.
SPF uses the underlying network address (IP Address) of the email-sending neighbor that is closest to the validating server and the machine name (Domain Name) in the message’s return address or the Domain Name in the protocol transfer greeting between email handling hosts. It queries the Domain Name System (DNS) to perform a mapping between the name and the address. Hence, the IP Address of any email relay that might be a neighbor, for SPF evaluation, must be known beforehand and pre-registered in the DNS by the sender. SPF validation does not function as desired when a message is sent through a forwarding service or independent posting service. An enhancement, called Sender Rewriting Scheme (SRS), recruits intermediaries to modify the problem address and make it acceptable for SPF validation. SPF also permits the DNS mapping entry to contain assertions about mail that comes under SPF registration.
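The path registration idea can be sketched in a few lines of Python. This is a minimal illustration, not an SPF implementation: the dictionary REGISTERED_SENDERS is a hypothetical stand-in for the DNS lookup, the function name path_check is ours, and real SPF records have a richer syntax and more result values than the three used here.

```python
import ipaddress

# Hypothetical stand-in for DNS: each Domain Name maps to the networks
# its owner has pre-registered as permitted sending neighbors.
REGISTERED_SENDERS = {
    "example.com": ["192.0.2.0/24", "198.51.100.17/32"],
}


def path_check(return_address, neighbor_ip, registry=REGISTERED_SENDERS):
    """Check whether the neighbor's IP Address is registered for the
    Domain Name in the message's return address."""
    domain = return_address.rpartition("@")[2].lower()
    networks = registry.get(domain)
    if networks is None:
        return "none"  # the domain has registered nothing
    ip = ipaddress.ip_address(neighbor_ip)
    if any(ip in ipaddress.ip_network(net) for net in networks):
        return "pass"
    return "fail"


if __name__ == "__main__":
    # Sent directly from a registered host.
    print(path_check("alice@example.com", "192.0.2.25"))
    # Same return address, but relayed by an unregistered forwarding service.
    print(path_check("alice@example.com", "203.0.113.5"))
```

The second call shows the forwarding limitation described above: the message is unchanged, but the neighbor closest to the validating server is now a forwarder whose address the sender never registered.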
Sender ID is the same mechanism as SPF, except that it chooses from a different array of Domain Names, specifically the Purported Responsible Address (PRA) domain in a message’s From: or Sender: header fields, or its return address’ Domain.
DKIM uses digital, cryptographic signatures, attaching information to a new header field in the message. A DKIM signature can withstand minor message modifications without becoming invalid, including some that are made by forwarding services and mailing list software. Any Domain Name can be used for signing the message. As with SPF and Sender ID, the queried entry in the DNS self-validates the name’s use.
Associating the signing Domain Name with an existing identifier, such as the one in the From: or Sender: header field, is a separate step. An enhancement to DKIM will permit DNS publication of a Domain’s Signing Practices (SP), which is intended to aid assessment of the possible legitimacy of unsigned messages.
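How a signature can survive minor modifications can be sketched as follows. The example normalizes the message body before hashing it, in the spirit of DKIM's "relaxed" body canonicalization, so that whitespace changes made in transit do not alter the hash. It is a simplified sketch under stated assumptions: the function names are ours, the sample bodies are hypothetical, and the public-key signature over the header fields, which is what actually binds the hash to the signing Domain Name, is omitted.

```python
import base64
import hashlib
import re


def relaxed_body(body):
    """Simplified body normalization: collapse runs of spaces and tabs,
    drop whitespace at line ends, and drop empty lines at the end."""
    lines = body.replace("\r\n", "\n").split("\n")
    lines = [re.sub(r"[ \t]+", " ", line).rstrip(" ") for line in lines]
    while lines and lines[-1] == "":
        lines.pop()
    return "".join(line + "\r\n" for line in lines)


def body_hash(body):
    """Base64-encoded SHA-256 hash of the normalized body."""
    digest = hashlib.sha256(relaxed_body(body).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    original = "Hello  Bob,\r\nsee you soon.\r\n"
    relayed = "Hello Bob,   \r\nsee you soon.\r\n\r\n\r\n"  # whitespace changed in transit
    tampered = "Hello Bob,\r\nsend money soon.\r\n"         # content changed

    print(body_hash(original) == body_hash(relayed))
    print(body_hash(original) == body_hash(tampered))
```

Only whitespace differences are tolerated in this sketch. A change to the words of the message produces a different hash, so a signature computed over the original hash would no longer validate.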
Introduction
There are two lines of industry response to protecting users against abusive email. One focuses on deciding that a message is from a Bad Actor; it is based on detecting and filtering out problem messages. It analyzes email sources, traffic patterns and content. Mail cannot cause problems for recipients if it is not delivered to them. Although essential as a first line of defense, this approach is at best approximate. Filtering junk email can be prone to error, with false positives – legitimate email classified as junk – and false negatives – junk email classified as legitimate. The distressing yet inevitable result is that this approach produces an ever-escalating arms race of counter-techniques, locking the abuse and anti-abuse communities in a constant struggle. For example, as the anti-abuse community has gotten better at analyzing textual content, image spam has emerged as a creative vector of attack to defeat such analysis. Some service providers have become extremely proficient in protecting users, but the cost is very high and the protection is very fragile. Few providers can ensure this level of protection. With email abuse estimated to be as high as 90 percent of Internet messaging traffic, it has become critical to find ways of restoring user confidence in email.
The second approach, which complements problem email detection, is to find a basis for trusting the message rather than a basis for mistrusting it. A message determined to be from a Good Actor does not need to be subject to the stringent analysis that would be applied to mail from an unknown source. The usual method of accomplishing this is to associate a confirmable identity with the message and to obtain an assessment about that identity. In other words, provide the recipient with a means of deciding that someone trustworthy is responsible for the message. For example, provide information answering the questions: who is the author and are they known to write legitimate messages?
Figure 1: The Assessment Framework
As shown in Figure 1, if we are to trust the claimed identity of a message sender, then we first need a mechanism that validates the identity’s use. This is called authentication. Only when we know that the identity is valid can we assess its trustworthiness. Without authentication, we could know a person’s or an organization’s reputation information, but could not be certain that they are indeed responsible for this message.
Authentication-based assessment follows these three steps:
1. Assign an identifier to a message, which refers to an identity – the name of a person or organization.
2. Provide a means of validating the use of that identifier – an authentic identity that is authorized for this use.
3. Assess the reputation or trustworthiness of that identity.
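The ordering of the three steps can be sketched as a small decision function. Everything in it is hypothetical: the REPUTATION table, the result labels and the function name assess are illustrative assumptions, and the authentication outcome is simply passed in rather than computed.

```python
# Hypothetical reputation data, keyed by identifier.
REPUTATION = {
    "example.com": "trusted",
    "bulk.example": "untrusted",
}


def assess(identifier, is_authenticated, reputation=REPUTATION):
    """Apply the three steps in order; reputation is consulted only
    after the use of the identifier has been validated."""
    # Step 1: an identifier must have been assigned to the message.
    if identifier is None:
        return "unidentified"
    # Step 2: the use of that identifier must have been validated.
    if not is_authenticated:
        return "unauthenticated"
    # Step 3: only now is the identity's reputation meaningful.
    return reputation.get(identifier, "unknown")


if __name__ == "__main__":
    print(assess("example.com", True))
    print(assess("example.com", False))
    print(assess("new.example", True))
```

The second call is the case that matters most: a good reputation is on file for the identifier, but without authentication there is no certainty that the named identity is responsible for this message, so the reputation is not used.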
This paper focuses on the first two steps, which produce an authenticated reference to an identity. The technologies discussed here describe how to assign an identifier to the message. The second step checks whether this use of the identifier is permitted, and the third step assesses the trustworthiness (reputation) of the responsible person or organization (identity) for that use.
An analogy from outside Internet technology illustrates the distinction between identifier verification, described in the first two steps above, and reputation assessment in the third step: the role a driver’s license plays in commercial transactions. A license contains a name that refers to a person; the name is an identifier. A license is generally accepted as validating a person’s identity, but it does not indicate the person is approved for a loan. Rather, the person’s credit score is the measure of their reputation that is used to determine credit-worthiness.
Moreover, a person or organization can have multiple reputations, each within its own context. Reputation analysis begins by determining the authorized scope for using an identifier and then assesses the associated identity within the current context. For example, in the case of the loan, the context analysis might start with whether the person is applying for a personal loan or a loan for their company. In the case of email reputation assessment, the starting point for the context analysis might be whether the individual is sending a personal message or one purporting to represent their company.
Email
The global architecture for Internet mail is shown in Figure 2. There is a simple split between the user world, in the form of Mail User Agents (MUA), and the transmission world, in the form of the Mail Handling Service (MHS) composed of Mail Transfer Agents (MTA). An MTA that sends an organization’s mail directly into the public Internet, or receives mail from it, is called a Boundary, or Border, MTA. An MTA that sends messages handles Outbound mail and one that receives messages handles Inbound mail.
The MHS is responsible for accepting a message from one user, the author, and then delivering it to one or more other users, the recipients. This creates a user experience of apparently-direct MUA-to-MUA exchange, without a user having to be cognizant of the intervening steps performed by the MHS infrastructure. The first component of the MHS is called the Mail Submission Agent (MSA) and the last is called the Mail Delivery Agent (MDA).
Internet mail is often referred to by one of its standards, SMTP (Simple Mail Transfer Protocol), but is really a collection of specifications, with the core being RFC 2821 (SMTP) for transfer, RFC 2822 for the message header and RFC 2045 (MIME) for the message body and attachments. A recipient retrieves a delivered message using POP (Post Office Protocol) or IMAP (Internet Message Access Protocol) or proprietary protocols such as Microsoft’s MAPI.
Figure 2: Internet Mail Service Architecture
Internet mail standards specify the meaning of identities, such as author and sender, that are used in sending a message, but they do not mandate or enforce particular choices for them. Although some software and some originating organizations choose to constrain the use of identifiers, the reality is that the underlying Internet standards permit an author to claim to be anyone. While this is the same as for telephone and postal communications, we have developed adequate techniques such as … and … for dealing with abuse issues in these older communications mechanisms. Because these techniques will not work for the Internet, we need new ones that are designed for our new communications mechanisms.
Many Roles in Handling a Message
The concept of an (online) identity has a rich and confusing history. One source of confusion is the difference between a thing itself and the name of the thing. In society, a person or an organization can be referred to by one or more names, such as a legal name or a nickname. Both refer to the same person or organization, but with different labels. In technical parlance, the term identity refers to the person or organization itself. The term identifier refers to a label that is used for that identity.
In the case of a driver’s license, there is one identifier – the name on the license – and one identity – the person to whom the identifier refers. A bank might use a driver’s license to confirm identity, but it needs additional information, such as a credit score, to determine a person’s financial reputation and credit worthiness. The better the reputation, the more financial and other privileges the driver enjoys, such as lower interest rates.