Network Working Group                                         P. Hoffman
Request for Comments: 3491                                    IMC & VPNC
Category: Standards Track                                    M. Blanchet
                                                                Viagenie
                                                              March 2003


                   Nameprep: A Stringprep Profile for
                  Internationalized Domain Names (IDN)

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements. Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

   This document describes how to prepare internationalized domain name
   (IDN) labels in order to increase the likelihood that name input and
   name comparison work in ways that make sense for typical users
   throughout the world. This profile of the stringprep protocol is
   used as part of a suite of on-the-wire protocols for
   internationalizing the Domain Name System (DNS).

1. Introduction

   This document specifies processing rules that will allow users to
   enter internationalized domain names (IDNs) into applications and
   have the highest chance of getting the content of the strings
   correct. It is a profile of stringprep [STRINGPREP]. These
   processing rules are only intended for internationalized domain
   names, not for arbitrary text.

   This profile defines the following, as required by [STRINGPREP].

   - The intended applicability of the profile: internationalized
     domain names processed by IDNA.

   - The character repertoire that is the input and output to
     stringprep: Unicode 3.2, specified in section 2.


Hoffman & Blanchet          Standards Track                     [Page 1]

RFC 3491                      IDN Nameprep                    March 2003


   - The mappings used: specified in section 3.

   - The Unicode normalization used: specified in section 4.

   - The characters that are prohibited as output: specified in section
     5.

   - Bidirectional character handling: specified in section 6.

1.1 Interaction of protocol parts

   Nameprep is used by the IDNA [IDNA] protocol for preparing domain
   names; it is not designed for any other purpose. It is explicitly
   not designed for processing arbitrary free text and SHOULD NOT be
   used for that purpose. Nameprep is a profile of Stringprep
   [STRINGPREP]. Implementations of Nameprep MUST fully implement
   Stringprep.

   Nameprep is used to process domain name labels, not domain names.
   IDNA calls nameprep for each label in a domain name, not for the
   whole domain name.

1.2 Terminology

   The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY"
   in this document are to be interpreted as described in BCP 14, RFC
   2119 [RFC2119].

2. Character Repertoire

   This profile uses Unicode 3.2, as defined in [STRINGPREP] Appendix A.

3. Mapping

   This profile specifies mapping using the following tables from
   [STRINGPREP]:

      Table B.1
      Table B.2

4. Normalization

   This profile specifies using Unicode normalization form KC, as
   described in [STRINGPREP].

5. Prohibited Output

   This profile specifies prohibiting using the following tables from
   [STRINGPREP]:

      Table C.1.2
      Table C.2.2
      Table C.3
      Table C.4
      Table C.5
      Table C.6
      Table C.7
      Table C.8
      Table C.9

   IMPORTANT NOTE: This profile MUST be used with the IDNA protocol.
   The IDNA protocol has additional prohibitions that are checked
   outside of this profile.

6. Bidirectional characters

   This profile specifies checking bidirectional strings as described in
   [STRINGPREP] section 6.

7. Unassigned Code Points in Internationalized Domain Names

   If the processing in [IDNA] specifies that a list of unassigned code
   points be used, the system uses table A.1 from [STRINGPREP] as its
   list of unassigned code points.

8. References

8.1 Normative References

   [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
                Requirement Levels", BCP 14, RFC 2119, March 1997.

   [STRINGPREP] Hoffman, P. and M. Blanchet, "Preparation of
                Internationalized Strings ("stringprep")", RFC 3454,
                December 2002.

   [IDNA]       Faltstrom, P., Hoffman, P. and A. Costello,
                "Internationalizing Domain Names in Applications
                (IDNA)", RFC 3490, March 2003.

8.2 Informative references

   [STD13]      Mockapetris, P., "Domain names - concepts and
                facilities", STD 13, RFC 1034, and "Domain names -
                implementation and specification", STD 13, RFC 1035,
                November 1987.

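   As a non-normative illustration, the processing defined in sections 3
   through 7 can be sketched in Python. The sketch relies on the
   standard library's stringprep module, which exposes the [STRINGPREP]
   tables, and on the Unicode 3.2 database available as
   unicodedata.ucd_3_2_0. The names nameprep and prepare_labels, the
   allow_unassigned parameter, the use of ValueError, and the point at
   which the unassigned check runs are all assumptions made for this
   sketch; none of them is defined by this document. Splitting a name on
   U+002E alone is also a simplification, since label separation is the
   business of [IDNA].

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 data, the repertoire of section 2

# Section 5: [STRINGPREP] tables whose characters are prohibited as output.
PROHIBITED_TABLES = (
    stringprep.in_table_c12,
    stringprep.in_table_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def nameprep(label, allow_unassigned=True):
    """Prepare a single label (hypothetical helper, not a defined API)."""
    # Section 3: Table B.1 maps characters to nothing; Table B.2 case-folds.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in label
        if not stringprep.in_table_b1(ch)
    )

    # Section 4: Unicode normalization form KC.
    normalized = ucd_3_2_0.normalize("NFKC", mapped)

    for ch in normalized:
        # Section 5: reject any character found in a prohibited table.
        if any(in_table(ch) for in_table in PROHIBITED_TABLES):
            raise ValueError("prohibited code point U+%04X" % ord(ch))
        # Section 7: Table A.1 is consulted only when the caller asks for
        # it. Running the check at this point is an assumption.
        if not allow_unassigned and stringprep.in_table_a1(ch):
            raise ValueError("unassigned code point U+%04X" % ord(ch))

    # Section 6: bidirectional check from [STRINGPREP] section 6.
    # Table D.1 holds RandALCat characters, Table D.2 holds LCat characters.
    if any(stringprep.in_table_d1(ch) for ch in normalized):
        if any(stringprep.in_table_d2(ch) for ch in normalized):
            raise ValueError("RandALCat and LCat characters are mixed")
        if not (stringprep.in_table_d1(normalized[0])
                and stringprep.in_table_d1(normalized[-1])):
            raise ValueError("RandALCat string must start and end with RandALCat")

    return normalized


def prepare_labels(domain):
    """Apply nameprep to each label, not to the whole name (section 1.1).

    Splitting on "." only is a simplification for this sketch.
    """
    return [nameprep(label) for label in domain.split(".")]
```

   Under these assumptions, nameprep("Stra\u00DFe") returns "strasse",
   and prepare_labels("WWW.Example.COM") returns the three labels
   "www", "example" and "com".
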

9. Security Considerations

   The Unicode and ISO/IEC 10646 repertoires have many characters that
   look similar. In many cases, users of security protocols might do
   visual matching, such as when comparing the names of trusted third
   parties. Because it is impossible to map similar-looking characters
   without a great deal of context such as knowing the fonts used,
   stringprep does nothing to map similar-looking characters together
   nor to prohibit some characters because they look like others.

   Security on the Internet partly relies on the DNS. Thus, any change
   to the characteristics of the DNS can change the security of much of
   the Internet.

   Domain names are used by users to connect to Internet servers. The
   security of the Internet would be compromised if a user entering a
   single internationalized name could be connected to different servers
   based on different interpretations of the internationalized domain
   name.

   Current applications might assume that the characters allowed in
   domain names will always be the same as they are in [STD13]. This
   document vastly increases the number of characters available in
   domain names. Every program that uses "special" characters in
   conjunction with domain names may be vulnerable to attack based on
   the new characters allowed by this specification.

10. IANA Considerations

   This is a profile of stringprep. It has been registered by the IANA
   in the stringprep profile registry
   (www.iana.org/assignments/stringprep-profiles).

      Name of this profile:
         Nameprep

      RFC in which the profile is defined:
         This document.

      Indicator whether or not this is the newest version of the
      profile:
         This is the first version of Nameprep.

11. Acknowledgements

   Many people from the IETF IDN Working Group and the Unicode Technical
   Committee contributed ideas that went into this document.

   The IDN Nameprep design team made many useful changes to the
   document. That team and its advisors include:

      Asmus Freytag
      Cathy Wissink
      Francois Yergeau
      James Seng
      Marc Blanchet
      Mark Davis
      Martin Duerst
      Patrik Faltstrom
      Paul Hoffman

   Additional significant improvements were proposed by:

      Jonathan Rosenne
      Kent Karlsson
      Scott Hollenbeck
      Dave Crocker
      Erik Nordmark
      Matitiahu Allouche

12. Authors' Addresses

   Paul Hoffman
   Internet Mail Consortium and VPN Consortium
   127 Segre Place
   Santa Cruz, CA 95060 USA

   EMail: paul.hoffman@imc.org and paul.hoffman@vpnc.org


   Marc Blanchet
   Viagenie inc.
   2875 boul. Laurier, bur. 300
   Ste-Foy, Quebec, Canada, G1V 2M2

   EMail: Marc.Blanchet@viagenie.qc.ca

13. Full Copyright Statement

   Copyright (C) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.