The Problems With Email
21 October 2008
Text-based headers
According to RFC 5322, header data must be formatted using the ASCII character set, which allows only English-language characters, basic punctuation, numbers, and data formatting characters. This is certainly limiting, but these headers are intended only to make a document ready for transport and nothing more.
This is a dire problem for internationalization because only English-language characters are allowed. Even many Western European characters are excluded, to say nothing of characters from languages that use entirely different scripts.
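The limitation can be seen with a few lines of Python. This is only a sketch: the name "José Müller" is an invented example, and the second half shows the RFC 2047 "encoded-word" workaround from the MIME family, which keeps the header ASCII on the wire by wrapping the real text in an escape sequence rather than lifting the restriction itself.

```python
from email.header import Header, decode_header, make_header

# Hypothetical display name containing non-English characters.
name = "José Müller"

# Plain ASCII cannot represent the accented characters at all.
try:
    name.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Workaround layered on top of the format: an RFC 2047 encoded-word.
# The result contains only ASCII characters and is unreadable until decoded.
encoded = Header(name, charset="utf-8").encode()

# A receiving client has to know the convention to recover the original text.
decoded = str(make_header(decode_header(encoded)))

print(ascii_ok)
print(encoded)
print(decoded)
```

A client that does not implement the decoding step will show the raw encoded form to the reader, which is the kind of compensation the header format forces on applications.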
Text-based formatting also significantly limits meta-data description and usability. For a modern system to be considered well designed or usable, its data, in all its various forms, must be available for a variety of uses. To be so available, data must be well described and strictly formatted in a way that is open to transformation and extensible processing. Modern email applications do process the data in a well-designed manner, but only after they have extracted it and applied their own unique meta-data. This is compensation for a technology limitation, not a solution to it, because the resulting processing cannot be shared between applications.
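As a rough illustration of that extraction step, the sketch below parses a small invented message with Python's standard email package. The addresses and the message text are assumptions made up for the example. Every header value comes back as an untyped string, and it is the application that has to turn the date into a real date.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Invented sample message; headers are plain "Name: value" text lines.
raw = (
    "From: alice@example.com\r\n"
    "To: bob@example.com\r\n"
    "Date: Tue, 21 Oct 2008 09:30:00 +0000\r\n"
    "Subject: Status\r\n"
    "\r\n"
    "Body text.\r\n"
)

msg = message_from_string(raw)

# The format itself carries no type information: this is just a string.
date_text = msg["Date"]

# The application must apply its own interpretation to get a usable value.
sent_at = parsedate_to_datetime(date_text)

print(type(date_text).__name__, repr(date_text))
print(sent_at.isoformat())
```

The typed value in sent_at exists only inside this one program; nothing in the message tells another application how the data was interpreted.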
Security
For most software created in the early '80s, or earlier, security was a distant afterthought. Email is certainly no exception, although there has been some improvement. RFC 5322 (message format) and RFC 5321 (SMTP) were not created with any security in mind, which opened the door to numerous problems. Assistive technologies such as MIME, PGP, and S/MIME have emerged to combat this inherent weakness. The problem is that these are assistive technologies and not requirements, and PGP and S/MIME each depend on a trust infrastructure (certificate authorities for S/MIME, a web of trust for PGP) that is not universally available on the open internet. The bottom line is that email is inherently insecure. Messages sent out over the open internet are rarely encrypted, and email addresses can be easily spoofed.
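To make the spoofing point concrete, here is a minimal sketch using Python's standard email package. The addresses are invented, and nothing is actually sent. It only shows that the From header is ordinary text chosen by whoever builds the message, with nothing in the message format to verify it.

```python
from email.message import EmailMessage

msg = EmailMessage()

# The sender writes whatever address it likes here; the format does not
# require, or provide any way to carry, proof of ownership.
msg["From"] = "ceo@example.com"      # hypothetical address, not verified
msg["To"] = "staff@example.com"      # hypothetical recipient
msg["Subject"] = "Urgent"
msg.set_content("Please reply today.")

# This is the text that would be handed to a mail server.
wire = msg.as_string()
print(wire)
```

Any check on that address has to come from something added on top of the message format, which is the role the assistive technologies above try to fill.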
No standard for describing or formatting the email message body
Because there is no standard for describing the content of an email message, there is no reason to expect the distant end to process that content correctly. Email client authors typically perform magic so that many different formatting scenarios are accepted. This is absolutely necessary if email is to be understood as its author intended; however, it results in other problems. This was most evident when Microsoft Outlook 2003 attempted to support standards-compliant HTML and CSS, which is a problem in itself, as explained below. Because allowing that formatting caused other problems, and because HTML and CSS are both non-standard in email, Microsoft significantly reduced HTML and CSS processing in its next version, Outlook 2007. As a result, many emails that appeared to work in one version of the software failed in the next. Unfortunately, the problem was due not so much to a faulty implementation of HTML or CSS as to the unforeseen consequences of invoking formatting where it was never intended.
Document authors must accept that, because there is no standard method or format for processing email content, there should be no expectation that a message will arrive at the distant end as it was intended. If an email message completely falls apart and is no longer understandable after it has been replied to or forwarded for the fifth time, it is nobody's fault, and certainly not the fault of the numerous email clients and servers that have reformatted and overwritten its header data and formatting each time.
HTML in email
The message body of an email has no standard definition for formatting, structure, or presentation. People have attempted to fill this vacuum with HTML, which is very bad because HTML is most often applied to email for the wrong reasons. HTML is intended solely to describe and structure human-consumable content in web pages, not to specify the presentation or beautification of that content. Presentation, however, is the primary reason HTML is forced into email. Because there is no standard for describing or presenting content in email, and because HTML is not a presentation language, it often fails. The result is that people often use flawed, hacked, or intentionally sloppy HTML code to make it work, and that code is still likely to fail if the same content is resent.
Although presentation is the most common reason for the abuse of HTML in email, it is not the most disruptive problem. Email has a radically different, and more complex, transmission scheme than web pages do. Web pages are transferred over HTTP, which, special considerations and forms aside, allows a one-way transfer of data from a server to a host. Email, however, transfers data bidirectionally between end points that are non-dominant hosts. This profoundly different transmission method means the header data supplied by HTML is irrelevant, if not harmful, to the email content it is attempting to describe.
