Network Working Group                                    D. Eastlake 3rd
Request for Comments: 3930                         Motorola Laboratories
Category: Informational                                     October 2004


   The Protocol versus Document Points of View in Computer Protocols

Status of this Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2004).
Abstract

This document contrasts two points of view: the "document" point of view, where digital objects of interest are like pieces of paper written and viewed by people, and the "protocol" point of view where objects of interest are composite dynamic network messages. Although each point of view has a place, adherence to a document point of view can be damaging to protocol design. By understanding both points of view, conflicts between them may be clarified and reduced.

Table of Contents

1. Introduction
2. Points of View
   2.1. The Basic Points of View
   2.2. Questions of Meaning
        2.2.1. Core Meaning
        2.2.2. Adjunct Meaning
   2.3. Processing Models
        2.3.1. Amount of Processing
        2.3.2. Granularity of Processing
        2.3.3. Extensibility of Processing
   2.4. Security and Canonicalization
        2.4.1. Canonicalization
        2.4.2. Digital Authentication
        2.4.3. Canonicalization and Digital Authentication
        2.4.4. Encryption
   2.5. Unique Internal Labels
3. Examples
4. Resolution of the Points of View
5. Conclusion
6. Security Considerations
Informative References
Author's Address
Full Copyright Statement

section 2 below. Section 3 gives some examples. Section 4 tries to synthesize the views and give general design advice in areas that can reasonably be viewed either way.
CSS], MUST be considered part of the document. Sometimes it is forgotten that the "document" originates in a computer, may travel over, be processed in, and be stored in computer systems, and is viewed on a computer, and that such operations may involve transcoding, enveloping, or data reconstruction.

PROTO: What is important are bits on the wire generated and consumed by well-defined computer protocol processes. No person ever sees the full messages as such; they are viewed as a whole only by geeks when debugging, and even then only in some translated visible form. If one ever actually has to demonstrate something about such a message in a court or to a third party, there is no way to avoid having computer experts interpret it. Sometimes it is forgotten that pieces of such messages may end up being included in or influencing data displayed to a person.
PROTO: The "meaning" of a protocol message should be clear and unambiguous from the protocol specification. It is frequently defined in terms of the state machines of the sender and recipient processes and may have only the most remote connection with human volition. Such processes have additional context, and the message is usually only meaningful with that additional context. Adding any human-readable text that is not functionally required is silly. Consulting attorneys during design is a bad idea that complicates the protocol and could tie a design effort in knots.
RFC3852] and XML [RFC3275, XMLENC]. But there are almost always application specific questions, particularly the question of exactly what information needs to be authenticated or encrypted.

Questions of exactly what needs to be secured and how to do so robustly are deeply entwined with canonicalization. They are also somewhat different for authentication and encryption, as discussed below.

ASCII], ASN.1 [ASN.1], or XML [XML], a
standard canonicalization (or canonicalizations) is specified or developed through practice. This leads to the design of applications that assume one of such standard canonicalizations, thus reducing the need for per-application canonicalization. (See also [RFC3076, RFC3741].)

DOCUM: From the document point of view, canonicalization is suspect if not outright evil. After all, if you have a piece of paper with writing on it, any modification to "standardize" its format can be an unauthorized change in the original message as created by the "author", who is always visualized as a person. Digital signatures are like authenticating signatures or seals or time stamps on the bottom of the "piece of paper". They do not justify and should not depend on changes in the message appearing above them. Similarly, encryption is just putting the "piece of paper" in a vault that only certain people can open and does not justify any standardization or canonicalization of the message.

PROTO: From the protocol point of view, canonicalization is simply a necessity. It is just a question of exactly what canonicalization or canonicalizations to apply to a pattern of bits that are calculated, processed, stored, communicated, and finally parsed and acted on. Most of these bits have never been seen and never will be seen by a person. In fact, many of the parts of the message will be artifacts of encoding, protocol structure, and computer representation rather than anything intended for a person to see. Perhaps in theory, the "original", idiosyncratic form of any digitally signed part could be conveyed unchanged through the computer process, storage, and communications channels that implement the protocol and could be usefully signed in that form. But in practical systems of any complexity, this is unreasonably difficult, at least for most parts of messages.
And if it were possible, it would be virtually useless, because to authenticate messages you would still have to determine their equivalence with the preserved original form. Thus, signed data must be canonicalized as part of signing and verification to compensate for insignificant changes made in processing, storage, and communication. Even if, miraculously, an initial system design avoids all cases of signed message reconstruction based on processed data or re-encoding based on character set or line ending or capitalization or numeric representation or time zones or whatever, later protocol revisions and extensions are certain to require such reconstruction and/or re-encoding eventually. If such "insignificant" changes are not ameliorated by canonicalization, signatures won't work, as discussed in more detail in 2.4.3 below.
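The PROTO argument above can be illustrated with a minimal sketch. It assumes a hypothetical canonical form (LF line endings, no trailing blanks, UTF-8) that is not taken from any specification cited in this memo, and it uses HMAC-SHA256 from the Python standard library merely as a stand-in for whatever authentication mechanism a protocol actually specifies. All function names are illustrative.

```python
import hashlib
import hmac


def canonicalize(text: str) -> bytes:
    """Hypothetical canonical form: LF line endings, no trailing
    spaces or tabs on any line, encoded as UTF-8."""
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    return "\n".join(line.rstrip(" \t") for line in lines).encode("utf-8")


def raw_tag(key: bytes, text: str) -> str:
    """Authenticate the bytes exactly as they happen to be represented."""
    return hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(key: bytes, text: str) -> str:
    """Authenticate the canonical form rather than the raw form."""
    return hmac.new(key, canonicalize(text), hashlib.sha256).hexdigest()


def verify(key: bytes, text: str, tag: str) -> bool:
    """Recompute over the canonical form and compare in constant time."""
    return hmac.compare_digest(sign(key, text), tag)


if __name__ == "__main__":
    key = b"demo-key"  # illustrative only
    sent = "amount: 10\r\nto: bob  \r\n"
    # An intermediate system rewrote line endings and trimmed blanks.
    received = "amount: 10\nto: bob\n"
    print("raw tags match:     ", raw_tag(key, sent) == raw_tag(key, received))
    print("canonical verifies: ", verify(key, received, sign(key, sent)))
    print("altered amount:     ",
          verify(key, "amount: 99\nto: bob\n", sign(key, sent)))
```

The canonical form absorbs the insignificant line-ending and whitespace changes, while a change to the amount still causes verification to fail. Choosing which changes count as insignificant remains a per-protocol design decision.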
XForms]). Since the worry is always about human third parties and viewing the document in isolation, those who are document oriented always want "digital signature" (asymmetric key) authentication, with its characteristics of "non-repudiability", etc. As a result, they reject secret key based message authentication codes, which provide the verifier with the capability of forging an authentication code, as useless. (See any standard reference on the subject for the usual meaning of these terms.) From their point of view, you have a piece of paper or form which a person signs. Sometimes a signature covers only part of a form, but that's usually because a signature can only cover data that is already there. And normally at least one signature covers the "whole" document/form. Thus the document oriented want to be able to insert digital signatures into documents without changing the document type and even "inside" the data being signed, which requires a mechanism to skip the signature so that it does not try to sign itself.

PROTO: From a protocol point of view, the right kind of authentication to use, whether "digital signature" or symmetric keyed authentication code (or biometric or whatever), is just another engineering decision affected by questions of efficiency, desired security model, etc. Furthermore, the concept of signing a "whole" message seems very peculiar (unless it is a copy being saved for archival purposes, in which case you might be signing a whole archive at once anyway). Typical messages are made up of various pieces with various destinations, sources, and security requirements. Furthermore, there are common fields that it is rarely useful to sign because they change as the message is communicated and processed. Examples include hop counts, routing history, and local forwarding tags.
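As a rough sketch of the PROTO position, the fragment below authenticates only the fields expected to remain stable in transit. The field names (hop_count, routing_history, forwarding_tag), the message layout, and the use of sorted-key JSON as the serialization are all assumptions made for illustration; HMAC-SHA256 again stands in for whichever authentication mechanism the protocol designer selects.

```python
import hashlib
import hmac
import json

# Hypothetical names for the kinds of fields the text says change in
# transit: hop counts, routing history, and local forwarding tags.
MUTABLE_FIELDS = frozenset({"hop_count", "routing_history", "forwarding_tag"})


def signed_view(message: dict) -> bytes:
    """Serialize only the stable fields, in a fixed key order."""
    stable = {k: v for k, v in message.items() if k not in MUTABLE_FIELDS}
    return json.dumps(stable, sort_keys=True,
                      separators=(",", ":")).encode("ascii")


def authenticate(key: bytes, message: dict) -> str:
    """Compute an authentication code over the stable fields only."""
    return hmac.new(key, signed_view(message), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    key = b"demo-key"  # illustrative only
    original = {"payload": "pay 10", "destination": "bob",
                "hop_count": 1, "routing_history": ["a"]}
    # A relay updates the transit fields but leaves the payload alone.
    forwarded = dict(original, hop_count=2, routing_history=["a", "b"],
                     forwarding_tag="x7")
    print(authenticate(key, original) == authenticate(key, forwarded))
```

Because the transit fields are excluded from the authenticated view, relays can update them without invalidating the authentication code, while any change to the payload or destination would still be detected.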
(2) the sometimes difficult and tricky work of selecting or designing an appropriate canonicalization or canonicalizations to be used as part of authentication generation and verification, producing robust and useful authentication, or

(3) "too much canonicalization" and having insecure authentication, useless because it still verifies even when significant changes are made in the signed data.

The only useful option above is number 2.

RFC2045], and decoding at the destination, are always incorporated to protect or "armor" the encrypted data.

Although the application of canonicalization is more obvious with digital signatures, it may also apply to encryption, particularly encryption of parts of a message. Sometimes elements of the environment where the plain text data is found may affect its interpretation. For example, interpretation can be affected by the character encoding or bindings of dummy symbols. When the data is decrypted, it may be into an environment with a different character encoding or dummy symbol bindings. With a plain text message part, it is usually clear which of these environmental elements need to be incorporated in or conveyed with the message. But an encrypted message part is opaque. Thus some canonical representation that incorporates such environmental factors may be needed.

DOCUM: Encryption of the entire document is usually what is considered. Because signatures are always thought of as human assent, people with a document point of view tend to vehemently assert that encrypted data should never be signed unless the plain text of it is known.

PROTO: Messages are complex composite multi-level structures, some pieces of which are forwarded multiple hops. Thus the design question is what fields should be encrypted by what techniques to what destination or destinations and with what canonicalization.
It sometimes makes perfect sense to sign encrypted data you don't understand; for example, the signature could just be for integrity protection or for use as a time stamp, as specified in the protocol.

XML] appears to have been thought of this way, although it can be used in other ways.

PROTO: From a protocol point of view, unique internal labels look very different than they do from a document point of view. Since this point of view assumes that pieces of different protocol messages will later be combined in a variety of ways, previously unique labels can conflict. There are really only three possibilities if such tags are needed, as follows:

(1) Have a system for dynamically rewriting such tags to maintain uniqueness. This is usually a disaster, as it (a) invalidates any stored copies of the tags that are not rewritten, and it is usually impossible to be sure there aren't more copies lurking somewhere you failed to update, and (b) invalidates digital signatures that cover a changed tag.

(2) Use some form of hierarchical qualified tags. Thus the total tag can remain unique even if a part is moved, because its qualification changes. This avoids the digital signature problems described above. But it destroys the concept of a globally-unique anchor embedded in and moving with the data. And stored tags may still be invalidated by data moves. Nevertheless, within the scope of a particular carefully designed protocol, such as IOTP [RFC2801], this can work.

(3) Construct a lengthy globally-unique tag string. This can be done successfully by using a good enough random number generator and big enough random tags (perhaps about 24 characters) sequentially, as in the way email message IDs are created [RFC2822].
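Possibility (3) might be sketched as follows, assuming the Python standard library's secrets module counts as a "good enough" random number generator for the purpose. The function name new_tag and the "id-" prefix are hypothetical; the prefix is there only so that the tag begins with a letter, which some label syntaxes require.

```python
import secrets


def new_tag(prefix: str = "id-") -> str:
    """Return a random label. 18 random bytes encode to exactly 24
    URL-safe base64 characters, in line with the "about 24 characters"
    suggested in the text."""
    return prefix + secrets.token_urlsafe(18)


if __name__ == "__main__":
    # Labels minted independently for pieces of two different messages
    # can later be combined without any rewriting step.
    first_message = [new_tag() for _ in range(3)]
    second_message = [new_tag() for _ in range(3)]
    combined = first_message + second_message
    print(len(set(combined)) == len(combined))
```

With 144 random bits per tag, an accidental collision is improbable enough to ignore in practice, so neither the rewriting of possibility (1) nor the qualification of possibility (2) is needed when parts are moved between messages.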
Thus, from a protocol point of view, such tags are difficult, but if they are needed, choice 3 works best.

RFC793], IPSEC [RFC2411], SMTP [RFC2821], and IOTP [RFC2801, RFC2802].

The eXtensible Markup Language [XML] is an example of something that can easily be viewed both ways and where the best results frequently require attention to both the document and the protocol points of view.

Computerized court documents, human-to-human email, and the X.509v3 Certificate [X509v3], particularly the X.509v3 policy portion, are examples primarily designed from the document point of view.
On the other hand, the document point of view is hard to stretch to encompass the protocol case. From a strict piece of paper perspective, canonicalization is wrong; inclusion of human language policy text within every significant object and a semantic tag with every adjunct should be mandatory; and so on. Objects designed in this way are rarely suitable for protocol use, as they tend to be improperly structured to accommodate hierarchy and complexity, inefficient (due to unnecessary text and self-documenting inclusions), and insecure (due to brittle signatures).

Thus, to produce usable protocols, it is best to start with the protocol point of view and add document point of view items as necessary to achieve consensus.

Sections 2.1 and 2.2, and warns of the security defects in the Document view. Most of these security considerations appear in Section 2.4 but they are also touched on elsewhere in Section 2 which should be read in its entirety.

Informative References

[ASCII] "USA Standard Code for Information Interchange", X3.4, American National Standards Institute: New York, 1968.

[ASN.1] ITU-T Recommendation X.680 (1997) | ISO/IEC 8824-1:1998, "Information Technology - Abstract Syntax Notation One (ASN.1): Specification of Basic Notation". ITU-T Recommendation X.690 (1997) | ISO/IEC 8825-1:1998, "Information Technology - ASN.1 Encoding Rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER)". <http://www.itu.int/ITU-T/studygroups/com17/languages/index.html>.
[CSS] "Cascading Style Sheets, level 2 revision 1 CSS 2.1 Specification", B. Bos, T. Çelik, I. Hickson, H. Lie, W3C Candidate Recommendation, 25 February 2004. <http://www.w3.org/TR/CSS21>

[RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, September 1981.

[RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

[RFC2411] Thayer, R., Doraswamy, N., and R. Glenn, "IP Security Document Roadmap", RFC 2411, November 1998.

[RFC2801] Burdett, D., "Internet Open Trading Protocol - IOTP Version 1.0", RFC 2801, April 2000.

[RFC2802] Davidson, K. and Y. Kawatsura, "Digital Signatures for the v1.0 Internet Open Trading Protocol (IOTP)", RFC 2802, April 2000.

[RFC2821] Klensin, J., "Simple Mail Transfer Protocol", RFC 2821, April 2001.

[RFC2822] Resnick, P., "Internet Message Format", RFC 2822, April 2001.

[RFC3076] Boyer, J., "Canonical XML Version 1.0", RFC 3076, March 2001.

[RFC3275] Eastlake 3rd, D., Reagle, J., and D. Solo, "(Extensible Markup Language) XML-Signature Syntax and Processing", RFC 3275, March 2002.

[RFC3741] Boyer, J., Eastlake 3rd, D., and J. Reagle, "Exclusive XML Canonicalization, Version 1.0", RFC 3741, March 2004.

[RFC3852] Housley, R., "Cryptographic Message Syntax (CMS)", RFC 3852, July 2004.

[X509v3] "ITU-T Recommendation X.509 version 3 (1997), Information Technology - Open Systems Interconnection - The Directory Authentication Framework", ISO/IEC 9594-8:1997.
[XForms] "XForms 1.0", M. Dubinko, L. Klotz, R. Merrick, T. Raman, W3C Recommendation 14 October 2003. <http://www.w3.org/TR/xforms/>

[XML] "Extensible Markup Language (XML) 1.0 Recommendation (2nd Edition)", T. Bray, J. Paoli, C. M. Sperberg-McQueen, E. Maler, October 2000. <http://www.w3.org/TR/2000/REC-xml-20001006>

[XMLENC] "XML Encryption Syntax and Processing", J. Reagle, D. Eastlake, December 2002. <http://www.w3.org/TR/2002/REC-xmlenc-core-20021210/>
Full Copyright Statement

Copyright (C) The Internet Society (2004).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and at www.rfc-editor.org, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the ISOC's procedures with respect to rights in ISOC Documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf- firstname.lastname@example.org.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.