Network Working Group                                         P. Hoffman
Request for Comments: 3491                                    IMC & VPNC
Category: Standards Track                                    M. Blanchet
                                                                Viagenie
                                                              March 2003


                   Nameprep: A Stringprep Profile for
                  Internationalized Domain Names (IDN)

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

Abstract

   This document describes how to prepare internationalized domain name
   (IDN) labels in order to increase the likelihood that name input and
   name comparison work in ways that make sense for typical users
   throughout the world.  This profile of the stringprep protocol is
   used as part of a suite of on-the-wire protocols for
   internationalizing the Domain Name System (DNS).

1. Introduction

   This document specifies processing rules that will allow users to
   enter internationalized domain names (IDNs) into applications and
   have the highest chance of getting the content of the strings
   correct.  It is a profile of stringprep [STRINGPREP].  These
   processing rules are only intended for internationalized domain
   names, not for arbitrary text.

   This profile defines the following, as required by [STRINGPREP].

   -  The intended applicability of the profile: internationalized
      domain names processed by IDNA.

   -  The character repertoire that is the input and output to
      stringprep:  Unicode 3.2, specified in section 2.

Hoffman & Blanchet          Standards Track                     [Page 1]

RFC 3491                      IDN Nameprep                    March 2003

   -  The mappings used: specified in section 3.

   -  The Unicode normalization used: specified in section 4.

   -  The characters that are prohibited as output: specified in section
      5.

   -  Bidirectional character handling: specified in section 6.

1.1 Interaction of protocol parts

   Nameprep is used by the IDNA [IDNA] protocol for preparing domain
   names; it is not designed for any other purpose.  It is explicitly
   not designed for processing arbitrary free text and SHOULD NOT be
   used for that purpose.  Nameprep is a profile of Stringprep
   [STRINGPREP].  Implementations of Nameprep MUST fully implement
   Stringprep.

   Nameprep is used to process domain name labels, not domain names.
   IDNA calls nameprep for each label in a domain name, not for the
   whole domain name.
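
   As an illustration only, the per-label processing described above
   might be sketched in Python as follows.  The helper name
   prepare_labels is hypothetical, and the sketch assumes that the
   Python standard library function encodings.idna.nameprep is an
   acceptable stand-in for this profile.  It splits on U+002E only,
   whereas IDNA itself recognizes further label separators.

```python
from encodings.idna import nameprep


def prepare_labels(domain):
    """Apply nameprep to each label of a domain name, never to the whole name."""
    # Simplification (assumption): split on the ASCII full stop only.
    return [nameprep(label) for label in domain.split(".")]


print(prepare_labels("Example.COM"))
```

   Each label is prepared independently, so a failure in one label can
   be reported without touching the others.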

1.2 Terminology

   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
   in this document are to be interpreted as described in BCP 14, RFC
   2119 [RFC2119].

2. Character Repertoire

   This profile uses Unicode 3.2, as defined in [STRINGPREP] Appendix A.

3. Mapping

   This profile specifies mapping using the following tables from
   [STRINGPREP]:

   Table B.1
   Table B.2
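
   A minimal sketch of this mapping step, assuming the table lookups
   provided by the Python standard library module stringprep.  The
   function name map_label is hypothetical.  Table B.1 lists characters
   that are mapped to nothing, and table B.2 gives the case folding
   intended for use with normalization form KC.

```python
import stringprep


def map_label(label):
    """Apply tables B.1 (map to nothing) and B.2 (case folding) to a label."""
    mapped = []
    for ch in label:
        if stringprep.in_table_b1(ch):
            continue  # B.1: the character is removed from the output
        mapped.append(stringprep.map_table_b2(ch))  # B.2: may yield several characters
    return "".join(mapped)
```

   For instance, under these assumptions U+00AD SOFT HYPHEN is removed
   and U+00DF LATIN SMALL LETTER SHARP S is folded to "ss".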

4. Normalization

   This profile specifies using Unicode normalization form KC, as
   described in [STRINGPREP].
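
   The normalization step can be sketched with the Unicode 3.2 database
   object that the Python standard library exposes as
   unicodedata.ucd_3_2_0.  The function name normalize_label is
   hypothetical, and this is an illustration rather than a reference
   implementation.

```python
from unicodedata import ucd_3_2_0


def normalize_label(label):
    """Normalize a label to form KC using the Unicode 3.2 character data."""
    return ucd_3_2_0.normalize("NFKC", label)
```

   Using the 3.2 database rather than the interpreter's current Unicode
   data keeps the result tied to the repertoire named in section 2.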

5. Prohibited Output

   This profile specifies prohibiting using the following tables from
   [STRINGPREP]:

   Table C.1.2
   Table C.2.2
   Table C.3
   Table C.4
   Table C.5
   Table C.6
   Table C.7
   Table C.8
   Table C.9

   IMPORTANT NOTE: This profile MUST be used with the IDNA protocol.
   The IDNA protocol has additional prohibitions that are checked
   outside of this profile.
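
   A sketch of the prohibited-output check, assuming the predicates of
   the Python standard library module stringprep.  The names
   PROHIBITED_TABLES and first_prohibited are hypothetical.  The check
   is meant to run on a label that has already been mapped and
   normalized, and it does not cover the additional prohibitions that
   IDNA applies.

```python
import stringprep

# Predicates for the tables listed in section 5, in the same order.
PROHIBITED_TABLES = (
    stringprep.in_table_c12,
    stringprep.in_table_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def first_prohibited(label):
    """Return the first prohibited character in label, or None if there is none."""
    for ch in label:
        if any(in_table(ch) for in_table in PROHIBITED_TABLES):
            return ch
    return None
```

   Note that tables C.1.1 and C.2.1 (ASCII space and ASCII control
   characters) are not in the list above, so this sketch lets them
   through.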

6. Bidirectional characters

   This profile specifies checking bidirectional strings as described in
   [STRINGPREP] section 6.
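
   In summary, [STRINGPREP] section 6 prohibits the characters in table
   C.8, forbids mixing RandALCat characters (table D.1) with LCat
   characters (table D.2), and requires a string that contains any
   RandALCat character to begin and end with one.  The last two
   requirements might be sketched as follows, assuming the Python
   standard library module stringprep.  The function name bidi_ok is
   hypothetical, and table C.8 is left to the prohibited-output step.

```python
import stringprep


def bidi_ok(label):
    """Check the RandALCat/LCat requirements of [STRINGPREP] section 6."""
    if not any(stringprep.in_table_d1(ch) for ch in label):
        return True  # no RandALCat character, nothing further to check
    if any(stringprep.in_table_d2(ch) for ch in label):
        return False  # RandALCat and LCat characters must not be mixed
    # A string with RandALCat characters must start and end with one.
    return stringprep.in_table_d1(label[0]) and stringprep.in_table_d1(label[-1])
```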

7. Unassigned Code Points in Internationalized Domain Names

   If the processing in [IDNA] specifies that a list of unassigned code
   points be used, the system uses table A.1 from [STRINGPREP] as its
   list of unassigned code points.
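
   A sketch of how a caller might consult table A.1, assuming the Python
   standard library module stringprep.  The function name
   check_unassigned and its allow_unassigned parameter are hypothetical;
   the parameter stands in for whatever [IDNA] specifies about
   unassigned code points for the operation at hand.

```python
import stringprep


def check_unassigned(label, allow_unassigned):
    """Raise ValueError if label holds a table A.1 code point that is not allowed."""
    if allow_unassigned:
        return  # the caller permits unassigned code points
    for ch in label:
        if stringprep.in_table_a1(ch):
            raise ValueError("unassigned code point U+%04X" % ord(ch))
```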

8. References

8.1 Normative References

   [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
                Requirement Levels", BCP 14, RFC 2119, March 1997.

   [STRINGPREP] Hoffman, P. and M. Blanchet, "Preparation of
                Internationalized Strings ("stringprep")", RFC 3454,
                December 2002.

   [IDNA]       Faltstrom, P., Hoffman, P. and A. Costello,
                "Internationalizing Domain Names in Applications
                (IDNA)", RFC 3490, March 2003.

8.2 Informative references

   [STD13]      Mockapetris, P., "Domain names - concepts and
                facilities", STD 13, RFC 1034, and "Domain names -
                implementation and specification", STD 13, RFC 1035,
                November 1987.

9. Security Considerations

   The Unicode and ISO/IEC 10646 repertoires have many characters that
   look similar.  In many cases, users of security protocols might do
   visual matching, such as when comparing the names of trusted third
   parties.  Because it is impossible to map similar-looking characters
   without a great deal of context such as knowing the fonts used,
   stringprep does nothing to map similar-looking characters together
   nor to prohibit some characters because they look like others.
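
   The point about similar-looking characters can be illustrated with a
   short sketch, assuming that the Python standard library function
   encodings.idna.nameprep is an acceptable stand-in for this profile.
   The variable names are hypothetical.

```python
from encodings.idna import nameprep

latin_a = nameprep("a")          # U+0061 LATIN SMALL LETTER A
cyrillic_a = nameprep("\u0430")  # U+0430 CYRILLIC SMALL LETTER A

# The two prepared labels stay distinct although they may render alike.
print(latin_a == cyrillic_a)
```

   Under these assumptions the comparison is false, so any defense
   against visually confusable labels has to come from elsewhere.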

   Security on the Internet partly relies on the DNS.  Thus, any change
   to the characteristics of the DNS can change the security of much of
   the Internet.

   Domain names are used by users to connect to Internet servers.  The
   security of the Internet would be compromised if a user entering a
   single internationalized name could be connected to different servers
   based on different interpretations of the internationalized domain
   name.

   Current applications might assume that the characters allowed in
   domain names will always be the same as they are in [STD13].  This
   document vastly increases the number of characters available in
   domain names.  Every program that uses "special" characters in
   conjunction with domain names may be vulnerable to attack based on
   the new characters allowed by this specification.

10. IANA Considerations

   This is a profile of stringprep.  It has been registered by the IANA
   in the stringprep profile registry
   (www.iana.org/assignments/stringprep-profiles).

      Name of this profile:
         Nameprep

      RFC in which the profile is defined:
         This document.

      Indicator whether or not this is the newest version of the
      profile:
         This is the first version of Nameprep.

11. Acknowledgements

   Many people from the IETF IDN Working Group and the Unicode Technical
   Committee contributed ideas that went into this document.

   The IDN Nameprep design team made many useful changes to the
   document.  That team and its advisors include:

      Asmus Freytag
      Cathy Wissink
      Francois Yergeau
      James Seng
      Marc Blanchet
      Mark Davis
      Martin Duerst
      Patrik Faltstrom
      Paul Hoffman

   Additional significant improvements were proposed by:

      Jonathan Rosenne
      Kent Karlsson
      Scott Hollenbeck
      Dave Crocker
      Erik Nordmark
      Matitiahu Allouche

12. Authors' Addresses

   Paul Hoffman
   Internet Mail Consortium and VPN Consortium
   127 Segre Place
   Santa Cruz, CA  95060 USA

   EMail: paul.hoffman@imc.org and paul.hoffman@vpnc.org


   Marc Blanchet
   Viagenie inc.
   2875 boul. Laurier, bur. 300
   Ste-Foy, Quebec, Canada, G1V 2M2

   EMail: Marc.Blanchet@viagenie.qc.ca

13.  Full Copyright Statement

   Copyright (C) The Internet Society (2003).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.