Network Working Group                                         J. Klensin
Request for Comments: 4290                                 December 2005
Category: Informational

        Suggested Practices for Registration of Internationalized
                           Domain Names (IDN)

Status of This Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2005).

IESG Note

This RFC is not a candidate for any level of Internet Standard. The IETF disclaims any knowledge of the fitness of this RFC for any purpose and notes that the decision to publish is not based on IETF review apart from IESG review for conflict with IETF work. The RFC Editor has chosen to publish this document at its discretion. See RFC 3932 for more information.
Abstract

This document explores the issues in the registration of internationalized domain names (IDNs). The basic IDN definition allows a very large number of possible characters in domain names, and this richness may lead to serious user confusion about similar-looking names. To avoid this confusion, the IDN registration process must impose rules that disallow some otherwise-valid name combinations. This document suggests a set of mechanisms that registries might use to define and implement such rules for a broad range of languages, including adaptation of methods developed for Chinese, Japanese, and Korean domain names.
Table of Contents

1. Introduction
   1.1. Background
   1.2. The Nature and Status of these Recommendations
   1.3. Terminology
      1.3.1. Languages and Scripts
      1.3.2. Characters, Variants, Registrations, and Other Issues
      1.3.3. Confusion, Fraud, and Cybersquatting
   1.4. A Review of the JET Guidelines
      1.4.1. JET Model
      1.4.2. Reserved Names and Label Packages
   1.5. Languages, Scripts, and Variants
      1.5.1. Languages versus Scripts
      1.5.2. Variant Selection
   1.6. Variants are not a Universal Remedy
   1.7. Reservations and Exclusions
      1.7.1. Sequence Exclusions for Valid Characters
      1.7.2. Character Pairing Issues
   1.8. The Registration Bundle
      1.8.1. Definitions and Structure
      1.8.2. Application of the Registration Bundle
2. Some Implications of This Approach
3. Possible Modifications of the JET Model
4. Conclusions and Recommendations About the General Approach
5. A Model Table Format
6. A Model Label Registration Procedure: "CreateBundle"
   6.1. Description of the CreateBundle Mechanism
   6.2. The "no-variants" Case
   6.3. CreateBundle and Nameprep Mapping
7. IANA Considerations
8. Internationalization Considerations
9. Security Considerations
10. Acknowledgements
11. Informative References
[RFC3490] defines the basic model for encoding non-ASCII strings in the DNS. Additional specifications [RFC3491] [RFC3492] define the mechanisms and tables needed to support IDNA. As work on these specifications neared completion, it became apparent that it would be desirable for registries to impose additional restrictions on the names that could actually be registered (e.g., see [IESG-IDN] and [ICANN-IDN]) to reduce potential confusion among characters that were similar in some way. This document explores these IDN (internationalized domain name) registration issues and suggests a set of mechanisms that IDN registries might use.

Registration restrictions are part of a long tradition. For example, while the original DNS specifications [RFC1035] permitted any string of octets in a DNS label, they also recommended the use of a much more restricted subset. This subset was derived from the much older "hostname" rules [RFC952] and defined by the "LDH" convention (for the three permitted types of characters: letters, digits, and the hyphen). Enforcement of this restricted subset in registrations was the responsibility of the registry or domain administrator. The definition of the subset was not embedded in the DNS protocol itself, although some application protocols, notably those concerned with electronic mail, did impose and enforce similar rules.

If there are no constraints on registration in a zone, people can register characters that increase the risk of misunderstandings, cybersquatting, and other forms of confusion. A similar situation existed even before the introduction of IDNA, as exemplified by domain names such as example.com and examp1e.com (note that the latter domain contains the digit "1" instead of the letter "l"). For non-ASCII names (so-called "internationalized domain names" or "IDNs"), the problem is more complicated.
In the earlier situation that led to the LDH (hostname) rules, all protocols, hosts, and DNS zones used ASCII exclusively in practice, so the LDH restriction could reasonably be applied uniformly across the Internet. Support for IDNs introduces a very large character repertoire, different geographical and political locations, and languages that require different collections of characters. The optimal registration restrictions are no longer a global matter; they may be different in different areas and, hence, in different DNS zones.
For some human writing systems, there are characters and/or strings that have equivalent or near-equivalent usages. If a name can be registered with such a character or string, the registry might want to automatically associate all of the names that have the same meaning with the registered name. The registry might also decide whether the names that are associated with, or generated by, one registration should, as a group or individually, go into the zone or should be blocked from registration by different parties.

To date, the best-developed system for handling registration restrictions for IDNs is the JET Guidelines for Chinese, Japanese, and Korean [RFC3743], the so-called "CJK" languages. The JET Guidelines are limited to the CJK languages and, in particular, to their common script base. Those languages are also the best-known and most widely-used examples of writing systems constructed on "ideographic" or "pictographic" principles. This document explores the principles behind the JET guidelines. It then examines some of the issues that might arise in adapting them to alphabetic languages, i.e., to languages whose characters primarily represent sounds rather than meanings.

This document describes five things:

1. The general background and considerations for non-ASCII scripts in names.

2. Suggested practices for describing character variants.

3. A method for using a zone's character variants to determine which names should be associated with a registration.

4. A format for publishing a zone's table of character variants. Such tables are referred to below as "language tables" or simply "tables".

5. A model algorithm for name registration given the presence of language tables.
and confusion may be reduced -- both between registries and for users and registrars who have relationships with more than one domain. Just as the JET Guidelines contain some suggestions that may not be applicable to alphabetic scripts, some of the suggestions here, especially the more specific ones, may be applicable to some scripts and not others. Section 1.4.1) and an authority that has chosen to use that code and establish a character-listing for it. Authorities are normally TLD (top-level
domain) registries; see Section 7 and [IANA-language-registry]. However, it is expected that TLD registries will find appropriate experts and that advice from language and script experts selected by international neutral bodies will also become part of the registration system. In addition, as discussed below in Section 7, registries may conclude that the best interests of registrants, stakeholders, and the Internet community would be served by constructing "language tables" that mix scripts and characters in ways that conform to no known language. Conventions should be developed for such registrations that do not misleadingly reflect specific language codes. Section 5, the base characters occupy the first column. Normally (and always, if the recommendation of Section 6.3 is adopted), the base characters will be the characters that appear in registration requests from registrants; any other character will invalidate the registration attempt. * Native Script Native script is the form in which the relevant string would normally be represented. For example, it might use Lower Slobbovian characters and the glyphs normally used to write them. It would not be punycode as a presentation form. * Variant Characters/Strings The "variant(s)" are character(s) and/or string(s) that are treated as equivalent to the base character. Note that these might not be exactly equivalent characters; a particular
original character may be a base character with a mapping to a particular variant character, but that variant character may not have a mapping to the original base character. Indeed, the variant character may not appear in the base character list, and hence may not be valid for use in a registration. Usually, characters or strings to be designated as variants are considered either equivalent or sufficiently similar (by some registry-specific definition) that confusion between them and the base character might occur. * Base Registration The "base registration" is the single name that the registrant requested from the registry. The JET Guidelines use the term "label string" for this concept. * Registered, Activated A label (or "name") is described as "registered" if it is actually entered into a domain (i.e., into a zone file) by the registry, so that it can be accessed and resolved using standard DNS tools. The JET Guidelines describe a "registered" label as "activated". However, some domains use a slightly different registration logic in which a name can be registered with the registrar (if one is involved) and with the registry, but not actually entered into the zone file until an additional activation or delegation step occurs. This document does not make that distinction, but is compatible with it. As specified in the IDNA Standard, the name actually placed in the zone file is always the internal ("punycode") form. There is no provision for actually entering any other form of an IDN into the DNS. It remains controversial, with different registrars and registries having adopted different policies, as to whether the registration, as submitted by the registrant, is in the form of: o The native-script name, either in UTF-8 or in some coding specified by the registrar, or o the internal-form ("punycode") name, or o both forms of the name together, so that the registrar and registry can verify the intended translation.
If any of the approaches defined in this document is used, it is almost certain to be necessary that the native-script form of the requested string be available to the registry.

* Registration Bundle

A "registration bundle" is the set of all labels that come from expanding the base characters for a single name into their variants. The presence of a label in a registration bundle does not imply that it is registered. In the JET Guidelines, a registration bundle is called an "IDN Package".

* Reserved Label

A "reserved label" is a label in a registration bundle that is not actually registered.

* Registry

A "registry" is the administrative authority for a DNS zone. The registry is the body that enforces, and typically makes, policies that are used in a particular zone in the DNS.

* Coded Character Set

A "Coded Character Set" (CCS) is a list of characters and the code positions assigned to them. ASCII and Unicode are CCSs.

* Language

A "language" is something spoken by humans, independent of how it is written or coded. ISO Standard 639 and IETF BCP 47 (RFC 3066) [RFC3066] list and define codes for identifying languages.

* Script

A "script" is a collection of characters (glyphs, independent of coding) that are used together, typically to represent one or more languages. Note that the script for one language may heavily overlap the script for another. This does not imply that they have identical scripts.

* Charset

"Charset" is an IETF-invented term to describe, more or less, the combination of a script, a CCS that encodes that script,
and rules for serializing encoded bytes that are stored on a computer or transmitted over the network.

The last four of these definitions are redundant with, but deliberately somewhat less precise than, the definitions in [RFC3536], which also provides sources. The two sets of definitions are intended to be consistent.

[RFC3066] or, if the registry considers it more appropriate, a coding based on scripts such as those in [LTRU-Registry]. In this way, Chinese as used on the mainland of the People's Republic of China ("zh-cn") can, at registry option, consist of a somewhat different list of characters (code points) and be represented by a separate table compared to Chinese as used in Taiwan ("zh-tw").

The design of the JET Guidelines took one important constraint as a basis: IDNA was treated as a firm standard. A procedure that modified some portion of the IDNA functions, or was a variant on them, was considered a violation of those standards and should not be encouraged (or, probably, even permitted).

Each registry is expected to construct (or obtain) a table for each language it considers relevant and appropriate. These tables list, for the particular zone, the characters permitted for that language. If a character does not appear as a base character (called a "valid code point" in the JET document) in that table, then a name containing it cannot be registered. If multiple languages are listed for the registration, then the character must appear in the tables for each of those languages.
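The base-character rule above can be sketched as a simple membership test. This is a minimal illustration, not part of the JET Guidelines: the function name `label_is_valid`, the table layout (a mapping from a language tag to a set of base characters), and the tags `lang-a` and `lang-b` are all assumptions made for the example.

```python
def label_is_valid(label, tables, languages):
    """Return True only if every character of `label` is a base
    character in the table of every language listed for the request.

    `tables` maps a language tag to a set of base characters
    (a hypothetical in-memory form of a registry's language tables).
    """
    if not languages:
        return False  # assumption: a request must name at least one language
    for lang in languages:
        base_chars = tables.get(lang)
        if base_chars is None:
            return False  # the zone has no table for this language
        if any(ch not in base_chars for ch in label):
            return False  # some character is not a base character here
    return True


# Purely illustrative tables; real tables would be far larger.
TABLES = {
    "lang-a": set("abcdefgh"),
    "lang-b": set("defghijk"),
}

print(label_is_valid("fed", TABLES, ["lang-a", "lang-b"]))
print(label_is_valid("abc", TABLES, ["lang-a", "lang-b"]))
```

With these sample tables, "fed" is acceptable for both languages, while "abc" is acceptable only when the request lists "lang-a" alone.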
The tables may also contain columns that specify alternate or variant forms of the valid character. If these variants appear, they are used to synthesize labels that are alternatives to the original one. These labels are all reserved and can be registered or "activated" (placed into the DNS) only by the action or request of the original registrant; some (the "preferred variant labels") are typically registered automatically. The zone is expected to establish appropriate policies for situations in which the variant forms of one label conflict with already-reserved or already-registered labels.

Most of these concepts were introduced because of concerns about specific issues with CJK characters, beginning from the requirement that the use of Simplified Chinese by some registrants and Traditional Chinese by others not be permitted to create confusion or opportunities for fraud. While they may be applicable to registry tables constructed for alphabetic scripts, the translation should be done with care, since many analogies are not exact. Some of the important issues are discussed in the sections that follow, especially Section 3.

The JET model may be considered as a variation on, and inspiration for, the model and method presented by the rest of this document, although the JET model has been completely developed only for CJK characters. Other languages or scripts, especially alphabetic ones, may require other variations.

[Unicode] [Unicode32] or IDNA cause two strings to appear similar enough to cause confusion, then both should be registered by the same party or one of them should become unregisterable. The definition of "appear similar enough" will differ for different cultures and circumstances, and hence DNS zones, but the principle is fairly general.

In the JET model, all of the variant strings are identified, some are registered into the DNS automatically, and others are simply reserved and can be registered, if at all, only by the original registrant.
Other zones might find other policies appropriate. For example, a zone might conclude that having similar strings registered in the DNS was undesirable. If so, the list of variant strings would be used only to build a list of names that would be reserved and prohibited from being registered.
Unicode], [Unicode32]), for example, does not define script boundaries at all, even though it is structured in terms of usually-related blocks of characters. The issue is complicated by the common origin of most alphabetic scripts in use in the world today (see, for example, [Drucker] or the more scholarly [Daniels]). Because of that history, certain characters (or, more precisely, symbols representing characters) appear in the scripts associated with multiple languages, sometimes with very different sounds or meanings. This differs from the CJK situation in which, if a character appears in more than one of the relevant languages, it will usually have the same interpretation in each one. For the subset of characters that actually are ideographs or pictographs, pronunciation is expected to vary widely while meaning is preserved. At least in part because of that similarity of meaning, it made sense in the JET case to permit a registration to specify multiple languages, to verify that the characters in the label string (the requested "Base registration") were valid for each, and then to generate variant labels using each language in turn. For many alphabetic languages, it may be more sensible to prohibit the label string submitted for registration from being associated with more than one language. Indeed, "one label, one language" has been suggested as an important barrier against common sources of "look-alike" confusion. For example, the imposition of that rule in a zone would prevent the insertion of a few Greek or Cyrillic characters with shapes identical to the Latin ones into what was otherwise a Latin-based string. For a particular table, the list of base characters may be thought of as the script associated with the relevant language, with the understanding that the table design does not prevent the same character from appearing in the tables for multiple languages. 
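The kind of mixing that a "one label, one language" rule is meant to prevent can be illustrated with a rough script check. This sketch is an assumption-laden heuristic, not the table-based mechanism the document describes: Python's standard library does not expose the Unicode script property, so the first word of each character's Unicode name is used as a stand-in, and the helper names are invented for the example.

```python
import unicodedata


def script_hint(ch):
    """Rough script guess: the first word of the Unicode character name
    (for example "LATIN", "GREEK", "CYRILLIC"). A heuristic only."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def scripts_in_label(label):
    """Collect the script hints used in a label, ignoring digits and
    the hyphen, which are shared across scripts."""
    return {script_hint(ch) for ch in label if not (ch.isdigit() or ch == "-")}


def is_single_script(label):
    """True if the label draws its letters from at most one script hint."""
    return len(scripts_in_label(label)) <= 1


print(scripts_in_label("example"))
print(is_single_script("ex\u0430mple"))  # U+0430 is CYRILLIC SMALL LETTER A
```

A registry following the approach in this document would more likely enforce the rule through its per-language tables, since a label that is valid in exactly one table cannot borrow look-alike characters from another script.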
Indeed, this notion of a script that is local and specifically identified can be turned around: so-called "language tables" are associated with languages only insofar as thinking about the character structure and word forms associated with a given language helps to inform the construction of the table. A country like Finland, for example, might select among: o One table each for Finnish, Swedish, and English characters and conventions, permitting a string to be registered in one, two, or
all three languages. However, a three-language registration would necessarily prohibit any characters that did not appear in all three languages, since the label would make little sense otherwise. o One table each, but with a "one label, one language" rule for the zone. o A combined table based on the observation that all three writing systems were based on Roman characters and that the possibilities for confusion of interest to the registry would not be reduced by "language" differentiation. This option raises an interesting issue about language labeling as described in Section 1.4.1; see the discussion in Section 7 below. Regardless of what decisions were made about those languages and scripts, they might have a separate table for registration of labels containing Cyrillic characters. That table might contain some Roman-derived characters (either as base characters or as variants), just as some CJK tables do. See also Section 2, below. Tables that present multiple languages, as described above, have introduced confusion and discomfort among those who have failed to understand these definitions. The consequence of these definitions is that use of a language or script code in a registration is a mnemonic, rather than a normative statement about the language or script itself. When that confusion is likely to occur, it is appropriate to simply use the registry identifier and a sequence number to identify the registration. As the JET Guidelines stress, no tables or systems of this type -- even if identified with a language as a means of defining or describing the table -- can assure linguistic or even syntactic correctness of labels with regard to that language. That assurance may not be possible without human intervention or at least dictionary lookups of complete proposed labels. It may even not be desirable to attempt that level of correctness (see Section 2). 
Of course, if any language-based tests or constraints, including "one label, one language", are to be applied to limit the associated sources of confusion, each zone must have a table for each language in which it expects to accept registrations. The notion of a single combined table for the zone is, in the general case, simply unworkable. One could use a single table for the zone if the intent were to impose only minimal restrictions, e.g., to force alphabetic and numeric characters only, excluding symbols and punctuation. That type of restriction might be useful in eliminating some problems, such as those of unreadable labels, but it would be unlikely to be
very helpful with, e.g., confusion caused by similar-looking characters.
Certainly, if both are permitted, and permitted to be registered by separate parties, there are many opportunities for confusion. Of course, zone managers should inform all current registrants when the registration policy for the zone changes. This includes the times when IDN characters are first allowed in the zone, when additional characters are permitted, and when any change occurs in the character variant tables. Many languages contain two variants for a character, one of which is strongly preferred. A registry might restrict the base registration to the preferred form, or it might allow any form for the base registration. If the variant tables are created carefully, the resulting bundles will be the same, but some registries will give special status to the base registration such as its appearance in "Whois" databases.
to implement these types of restrictions, but there has been no experience so far with that approach. In particular, in some scripts derived from Roman characters, sequences that have historically been typographically represented by single "ligature" or "digraph" characters may also be represented by the separate characters (e.g., "ae" for U+00E6 or "ij" for U+0133). If it is desired either to prohibit these or to treat them as variants, some extensions to the single-character JET model may be needed. Some careful thinking about IDNA (especially nameprep) may also be needed, since some of these combinations are excluded there.
registration). For many circumstances, it may be the most attractive option. In all cases, at least the registered label should appear in the zone. It would be almost impossible to describe to name owners why the name that they asked for is not in the zone, but some other name that they now control is. By implication, if the requested label is already registered, the entire registration request must be rejected.
rule (or restriction) would still not avoid the need to consider character variants. Consequently, registries applying the principles outlined in this document should be careful not to apply more severe restrictions than are reasonable and appropriate while, at the same time, being aware of how difficult it usually is to add restrictions at a later time.

[IESG-IDN] and ICANN [ICANN-IDN] [ICANN-IDN2], have concluded that some restrictions are needed to prevent many forms of user confusion about the actual structure of a name or the word, phrase, or term that it appears to spell out. The best way to approach such restrictions appears to be to draw on the language and culture of the community of registrants and users in the relevant zone: if particular characters are likely to be surprising or unintelligible to both of those groups, it is probably wise to not permit them to be used in registrations.

Registration restrictions can be carried much further than restricting permitted characters to a selected Unicode subset. The idea of a reserved "bundle" of related labels permits probably-confusing combinations or sets of characters to be bound together, under the control of a single registrant. While that registrant might still use the package in a way that confused his or her own users (the approach outlined here
will not prevent either ill-thought-out ideas or stupidity), the possibility of turning potential confusion into a hostile attack would be considerably reduced.

At the same time, excessive restrictions may make DNS identifiers less useful for their original purpose: identifying particular hosts and similar resources on the network in an orderly way. Registries creating rules and policies about what can be registered in particular zones -- whether those are based on the JET Guidelines or the suggestions in this document -- should balance the need for restrictions against the need for flexibility in constructing identifiers.

The discussion above provides many options that could be selected, defined, and applied in different ways in different registries (zones). Registrars and registrants would almost certainly prefer systems in which they can predict, at least to a first order approximation, the implications of a particular potential registration. Predictability of that sort probably requires more standards, and less flexibility, than the model itself might suggest.
The following is an example of how a table might look. The entries in this table are purposely silly and should not be used by any registry as the basis for choosing variants. For the example, assume that the registry:

o allows the FOR ALL character (U+2200) with no variants

o allows the COMPLEMENT character (U+2201) which has a single variant of LATIN CAPITAL LETTER C (U+0043)

o allows the PROPORTION character (U+2237) which has one variant which is the string COLON (U+003A) COLON (U+003A)

o allows the PARTIAL DIFFERENTIAL character (U+2202) which has two variants: LATIN SMALL LETTER D (U+0064) and GREEK SMALL LETTER DELTA (U+03B4)

The table contents (after any required header information, see [IANA-language-registry] and the discussion in Section 7 below) would look like:

   # An example of a table
   U+2200
   U+2201|U+0043
   U+2237|U+003A-U+003A  # Note that the variant is a string
   U+2202|U+0064:U+03B4  # Two variants for the same character

Implementers of table processors should remember that there are tens of thousands of characters whose codepoints are greater than 0xFFFF. Thus, any program that assumes that each character in the table is represented in exactly six octets ("U", "+", and four octets representing the character value) will fail with tables that use characters whose value is greater than 0xFFFF.
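A table in this format can be read with a few lines of code. The sketch below is one possible reader, inferred only from the example above: it assumes "#" starts a comment, "|" separates the base character from its variants, ":" separates variants, and "-" joins the code points of a variant string. The function names are invented for illustration, and any required header information is not handled.

```python
def parse_code_point(token):
    """Convert a token such as "U+2200" or "U+1D400" to a character.
    The hex part may be longer than four digits."""
    token = token.strip()
    if not token.upper().startswith("U+"):
        raise ValueError("not a code point: %r" % token)
    return chr(int(token[2:], 16))


def parse_table(text):
    """Return a dict mapping each base character to a list of variant
    strings (empty if the character has no variants)."""
    table = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        base_token, _, variants_token = line.partition("|")
        variants = []
        if variants_token.strip():
            for variant in variants_token.split(":"):
                # A variant may be a string of several code points.
                variants.append(
                    "".join(parse_code_point(cp) for cp in variant.split("-"))
                )
        table[parse_code_point(base_token)] = variants
    return table


EXAMPLE = """\
# An example of a table
U+2200
U+2201|U+0043
U+2237|U+003A-U+003A  # Note that the variant is a string
U+2202|U+0064:U+03B4  # Two variants for the same character
"""

for base, variants in parse_table(EXAMPLE).items():
    print("U+%04X" % ord(base), variants)
```

Because each token is split on delimiters rather than read as a fixed six-octet field, code points above 0xFFFF are handled the same way as shorter ones.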
one or more labels (always including the base registration). As described earlier, the registration bundle should be stored with its date of creation so that issues with overlapping elements between bundles can later be resolved on a first-come, first-served basis. There are two steps to processing the registration: 1. Check whether the proposed base registration exists in any bundle. If it does, stop immediately with a failure. 2. Process the base registration with the mechanism described as "CreateBundle" in Section 6.1, below. Note that the process must be executed only once. The process must not be performed on any output of the process, only on the proposed base registration.
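The two processing steps can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed implementation: bundles are kept in a plain list of dictionaries, `create_bundle` is a caller-supplied function standing in for the CreateBundle mechanism of Section 6.1, and the function and field names are invented for the example.

```python
from datetime import datetime, timezone


def register(base, bundles, create_bundle):
    """Process one proposed base registration.

    `bundles` is a list of existing bundle records; `create_bundle`
    is a callable returning the labels for a base registration, or an
    empty/None result if the request is invalid (hypothetical interface).
    Returns the new bundle record, or None on failure.
    """
    # Step 1: fail immediately if the label already exists in any bundle.
    if any(base in record["labels"] for record in bundles):
        return None

    # Step 2: run CreateBundle exactly once, on the base registration
    # only, never on labels that CreateBundle itself produced.
    labels = create_bundle(base)
    if not labels:
        return None

    record = {
        "base": base,
        "labels": set(labels) | {base},  # bundle always includes the base
        "created": datetime.now(timezone.utc),  # kept for first-come ordering
    }
    bundles.append(record)
    return record


# Usage with a stand-in for CreateBundle that adds an upper-case variant.
bundles = []
first = register("ab", bundles, lambda s: {s, s.upper()})
second = register("AB", bundles, lambda s: {s, s.upper()})
print(first["labels"], second)
```

In the usage example the second request fails because "AB" is already present in the bundle created for "ab".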
o Create the set of characters that consists of the candidate character and any variants.

o For each character in the set from the previous step, duplicate the temporary bundle that resulted from the previous candidate character, and add the new character to the end of each partial label.

4. The temporary bundle now contains zero or more labels that consist of Unicode characters. For every label in the temporary bundle, do the following:

o Process the label with ToASCII to see if ToASCII succeeds. If it does, add the label to the registration bundle. Otherwise, do not process this label from the temporary bundle any further; it will not go into the registration bundle.

The result of the processing outlined above is the registration bundle with the base registration and possibly other labels.

Section 5 becomes a trivial listing of base characters and only the first two steps of CreateBundle (verifying that all candidate characters are in the base ("valid") character list and verifying that the resulting characters will succeed in the ToASCII operation) are applicable. Even the second of those steps becomes pro forma if the advice in the next subsection is followed.
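The expansion described above amounts to taking, for each position in the base registration, the candidate character plus its variants, forming every combination, and keeping the labels that ToASCII accepts. The sketch below is one way to express that, with assumptions stated plainly: the table is a dict from base character to a list of variant strings, the function name and signature are invented, and ToASCII is taken from Python's standard `encodings.idna` module, which implements the IDNA of RFC 3490.

```python
from encodings import idna
from itertools import product


def create_bundle(base, table, to_ascii=idna.ToASCII):
    """Expand a base registration into its registration bundle.

    Returns a list of labels (base registration first), or None if any
    candidate character is not a base character in `table`.
    """
    # Every candidate character must be a base ("valid") character.
    if any(ch not in table for ch in base):
        return None

    # For each position: the candidate character followed by its variants.
    choices = [[ch] + list(table[ch]) for ch in base]

    bundle = []
    for parts in product(*choices):
        label = "".join(parts)
        try:
            to_ascii(label)  # keep the label only if ToASCII succeeds
        except UnicodeError:
            continue
        if label not in bundle:
            bundle.append(label)
    return bundle


# Illustrative table using ASCII letters so the result is easy to read.
TABLE = {"a": ["b"], "c": [], "d": ["e", "f"]}
print(create_bundle("acd", TABLE))
```

The size of the bundle grows as the product of the number of alternatives at each position, so a registry with many variants per character may want to bound label length or variant counts; that is a design consideration for the implementer, not something this sketch addresses.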
While having these mapping functions available during lookup may be quite helpful to users who type equivalent forms, registrations are probably best performed in terms of the IDNA base characters only, i.e., those characters that nameprep will not change. This will have two advantages.

o Registrants will never find themselves in the rather confusing position of having submitted one string for registration and finding a different string in the registry database (which could otherwise occur even if the relevant language table does not contain variants).

o Those who are interested in what characters are permitted by a given registry will only need to examine the relevant tables, rather than simulating the IDNA algorithm to determine the result of processing particular characters.

[IANA-language-registry]. Since the registry exists and is being managed under ICANN direction, the material that follows is a review of the theory of this registry, rather than new instructions for IANA.

As described above and suggested in the JET Guidelines, the registration rules generally require only that:

o The application be submitted or endorsed by a TLD registry, to ensure that someone cares about the particular table.

o The table be identified by the following:

   * the name -- usually the top-level domain name -- of the submitting or endorsing registry;

   * one of: a language designation (consistent with [RFC3066] or with some other system approved by the IANA), a script designation, a combination of the two, or a sequence number acceptable to IANA for this purpose;

   * a version number; and

   * a date.

o Characters listed in the table be identified by Unicode code points, as discussed above.
o The table format may correspond to that identified in [RFC3743], or in Section 5 above, or may be some variation on those themes appropriate to the local processing model (with or without variants). This raises some issues that will need to be worked out as experiences accumulate. For example, more standardization of table formats would be desirable to allow processing by the same computer tools for different registries and languages. But standardization seems premature at this time due to differences in languages, processing, and requirements and lack of experience with them. Similarly, if a registry concludes that it should use a table that contains characters from several scripts, it is not clear how such a table should be designated. Identifying it with a language code (either according to [RFC3066] or an independent code registered with IANA) is likely to just introduce more confusion, especially given other Internet uses of the language codes. It appears that some other convention will be needed for those cases, and it should be developed (if it has not already been established by the time this document is published).
While the increased number and types of characters made available by Unicode considerably increase the scale of the potential problems, the problems addressed by this document are not new. No plausible set of restrictions will eliminate all problems and sources of confusion: for example, it has often been pointed out that, even in ASCII, the characters digit-one ("1") and lower case L ("l") can easily be confused in some display fonts. But, to the degree that security may be aided by sensible risk reduction, these techniques may be helpful.

[Hoffman-reg] shifted to inclusion of language-based approaches. The current version of this document incorporates considerable text, and even more ideas, from those drafts, with Paul Hoffman's generous permission.

Feedback was provided by several registry operators (of both country code and generic TLDs), including Edmon Chung and Ram Mohan of Afilias, and by ICANN and IANA staff, notably Tina Dam and Theresa Swinehart. This feedback about issues encountered in registering tables and designing IDN implementations resulted in the addition of significant clarifying text to the current version of the document.

The opinions expressed here are the sole responsibility of the author. Some of those whose ideas and comments are reflected in this document may disagree with the conclusions the author has drawn from them.

The first draft version of this document was posted in June 2003.
[Daniels] Daniels, P.T. and W. Bright, "The World's Writing Systems", Oxford: Oxford University Press, 1996.

[Drucker] Drucker, J., "The Alphabetic Labyrinth: The Letters in History and Imagination", 1995.

[Hoffman-reg] Hoffman, P., "A Method for Registering Internationalized Domain Names", Work in Progress, October 2003.

[IESG-IDN] Internet Engineering Steering Group, IETF, "IESG Statement on IDN", IESG Statement available from http://www.ietf.org/IESG/STATEMENTS/IDNstatement.txt, February 2003.

[ICANN-IDN] Internet Corporation for Assigned Names and Numbers (ICANN), "Guidelines for the Implementation of Internationalized Domain Names, Version 1.0", June 2003.

[ICANN-IDN2] Internet Corporation for Assigned Names and Numbers (ICANN), "Guidelines for the Implementation of Internationalized Domain Names, Version 2.0", September 2005.

[IANA-language-registry] Internet Assigned Numbers Authority (IANA), "IDN Language Table Registry", April 2004.

[LTRU-Registry] Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying Languages", Work in Progress, October 2005.

[RFC952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet host table specification", RFC 952, October 1985.

[RFC1035] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987.

[RFC3066] Alvestrand, H., "Tags for the Identification of Languages", BCP 47, RFC 3066, January 2001.

[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, "Internationalizing Domain Names in Applications (IDNA)", RFC 3490, March 2003.

[RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep Profile for Internationalized Domain Names (IDN)", RFC 3491, March 2003.

[RFC3492] Costello, A., "Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA)", RFC 3492, March 2003.

[RFC3536] Hoffman, P., "Terminology Used in Internationalization in the IETF", RFC 3536, May 2003.

[RFC3743] Konishi, K., Huang, K., Qian, H., and Y. Ko, "Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean", RFC 3743, April 2004.

[Unicode] The Unicode Consortium, "The Unicode Standard -- Version 3.0", January 2000.

[Unicode32] The Unicode Consortium, "Unicode Standard Annex #28: Unicode 3.2", March 2002.
Full Copyright Statement

Copyright (C) The Internet Society (2005).

This document is subject to the rights, licenses and restrictions contained in BCP 78 and at www.rfc-editor.org/copyright.html, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf- email@example.com.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.