Network Working Group                                        K. Zeilenga
Request for Comments: 4518                           OpenLDAP Foundation
Category: Standards Track                                      June 2006


             Lightweight Directory Access Protocol (LDAP):
                  Internationalized String Preparation

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   The previous Lightweight Directory Access Protocol (LDAP) technical
   specifications did not precisely define how character string matching
   is to be performed.  This led to a number of usability and
   interoperability problems.  This document defines string preparation
   algorithms for character-based matching rules defined for use in
   LDAP.

1.  Introduction

1.1.  Background

   A Lightweight Directory Access Protocol (LDAP) [RFC4510] matching
   rule [RFC4517] defines an algorithm for determining whether a
   presented value matches an attribute value in accordance with the
   criteria defined for the rule.  The proposition may be evaluated to
   True, False, or Undefined.

      True      - the attribute contains a matching value,

      False     - the attribute contains no matching value,

      Undefined - it cannot be determined whether the attribute contains
                  a matching value.






Zeilenga                    Standards Track                     [Page 1]

RFC 4518       LDAP: Internationalized String Preparation      June 2006


   For instance, the caseIgnoreMatch matching rule may be used to
   compare whether the commonName attribute contains a particular value
   without regard for case and insignificant spaces.

1.2.  X.500 String Matching Rules

   "X.520: Selected attribute types" [X.520] provides (among other
   things) value syntaxes and matching rules for comparing values
   commonly used in the directory [X.500].  These specifications are
   inadequate for strings composed of Unicode [Unicode] characters.

   The caseIgnoreMatch matching rule [X.520], for example, is simply
   defined as being a case-insensitive comparison where insignificant
   spaces are ignored.  For printableString, there is only one space
   character and case mapping is bijective, hence this definition is
   sufficient.  However, for Unicode string types such as
   universalString, this is not sufficient.  For example, a case-
   insensitive matching implementation that folded lowercase characters
   to uppercase would yield different results than an implementation
   that used uppercase to lowercase folding.  Or one implementation may
   view space as referring to only SPACE (U+0020), a second
   implementation may view any character with the space separator (Zs)
   property as a space, and another implementation may view any
   character with the whitespace (WS) category as a space.

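   The two ambiguities described above can be observed with a short
   Python sketch.  It is illustrative only: the choice of LATIN SMALL
   LETTER DOTLESS I (U+0131) as the example letter and the use of
   str.isspace() as a stand-in for a "whitespace" test are assumptions
   of this sketch, not part of the specification.

```python
import unicodedata

# LATIN SMALL LETTER DOTLESS I (U+0131) versus LATIN SMALL LETTER I.
a, b = "\u0131", "i"

# Folding to uppercase makes the two letters compare equal ...
upper_equal = a.upper() == b.upper()
# ... while folding to lowercase keeps them distinct.
lower_equal = a.lower() == b.lower()

# NO-BREAK SPACE (U+00A0) carries the Zs property but is not U+0020.
nbsp_category = unicodedata.category("\u00a0")

# CHARACTER TABULATION (U+0009) counts as whitespace for str.isspace(),
# yet its general category is Cc (control), not Zs.
tab_category = unicodedata.category("\t")
tab_is_whitespace = "\t".isspace()
```
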
   The lack of precise specification for character string matching has
   led to significant interoperability problems.  When used in
   certificate chain validation, security vulnerabilities can arise.  To
   address these problems, this document defines precise algorithms for
   preparing character strings for matching.

1.3.  Relationship to "stringprep"

   The character string preparation algorithms described in this
   document are based upon the "stringprep" approach [RFC3454].  In
   "stringprep", presented and stored values are first prepared for
   comparison so that a character-by-character comparison yields the
   "correct" result.

   The approach used here is a refinement of the "stringprep" [RFC3454]
   approach.  Each algorithm involves two additional preparation steps.

   a) Prior to applying the Unicode string preparation steps outlined in
      "stringprep", the string is transcoded to Unicode.

   b) After applying the Unicode string preparation steps outlined in
      "stringprep", the string is modified to appropriately handle
      characters insignificant to the matching rule.

   Hence, preparation of character strings for X.500 [X.500] matching
   [X.501] involves the following steps:

      1) Transcode
      2) Map
      3) Normalize
      4) Prohibit
      5) Check Bidi (Bidirectional)
      6) Insignificant Character Handling

   These steps are described in Section 2.

   It is noted that while various tables of Unicode characters included
   or referenced by this specification are derived from Unicode
   [Unicode] data, these tables are to be considered definitive for the
   purpose of implementing this specification.

1.4.  Relationship to the LDAP Technical Specification

   This document is an integral part of the LDAP technical specification
   [RFC4510], which obsoletes the previously defined LDAP technical
   specification [RFC3377] in its entirety.

   This document details new LDAP internationalized character string
   preparation algorithms used by [RFC4517] and possibly other technical
   specifications defining LDAP syntaxes and/or matching rules.

1.5.  Relationship to X.500

   LDAP is defined [RFC4510] in X.500 terms as an X.500 access
   mechanism.  As such, there is a strong desire for alignment between
   LDAP and X.500 syntax and semantics.  The character string
   preparation algorithms described in this document are based upon the
   "Internationalized String Matching Rules for X.500" [XMATCH] proposal
   to ITU/ISO Joint Study Group 2.

1.6.  Conventions and Terms

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14 [RFC2119].

   Character names in this document use the notation for code points and
   names from the Unicode Standard [Unicode].  For example, the letter
   "a" may be represented as either <U+0061> or <LATIN SMALL LETTER A>.
   In the lists of mappings and the prohibited characters, the "U+" is
   left off to make the lists easier to read.  The comments for
   character ranges are shown in square brackets (such as "[CONTROL
   CHARACTERS]") and do not come from the standard.

   Note: a glossary of terms used in Unicode can be found in [Glossary].
   Information on the Unicode character encoding model can be found in
   [CharModel].

   The term "combining mark", as used in this specification, refers to
   any Unicode [Unicode] code point that has a mark property (Mn, Mc,
   Me).  Appendix A provides a definitive list of combining marks.

2.  String Preparation

   The following six-step process SHALL be applied to each presented and
   attribute value in preparation for character string matching rule
   evaluation.

      1) Transcode
      2) Map
      3) Normalize
      4) Prohibit
      5) Check bidi
      6) Insignificant Character Handling

   Failure in any step causes the assertion to evaluate to Undefined.

   The character repertoire of this process is Unicode 3.2 [Unicode].

   Note that this six-step process specification is intended to describe
   expected matching behavior.  Implementations are free to use
   alternative processes so long as the matching rule evaluation
   behavior provided is consistent with the behavior described by this
   specification.

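   The overall control flow, including the rule that a failing step
   makes the assertion Undefined, might be sketched in Python as
   follows.  The names PreparationFailure, prepare, and evaluate are
   hypothetical, None is used here to stand for Undefined, and the
   sketch assumes the value has already been decoded to a Python str.
   The individual steps are supplied by the caller.

```python
from typing import Callable, Iterable, Optional


class PreparationFailure(Exception):
    """Raised by a step that cannot prepare its input (hypothetical name)."""


Step = Callable[[str], str]


def prepare(value: str, steps: Iterable[Step]) -> Optional[str]:
    """Run the steps in order; None stands for a failed preparation."""
    try:
        for step in steps:
            value = step(value)
    except PreparationFailure:
        return None
    return value


def evaluate(presented: str, stored: str,
             steps: Iterable[Step]) -> Optional[bool]:
    """Three-valued equality match: True, False, or None for Undefined."""
    steps = list(steps)  # allow the same steps to be applied twice
    left = prepare(presented, steps)
    right = prepare(stored, steps)
    if left is None or right is None:
        return None
    return left == right
```
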
2.1.  Transcode

   Each non-Unicode string value is transcoded to Unicode.

   PrintableString [X.680] values are transcoded directly to Unicode.

   UniversalString, UTF8String, and bmpString [X.680] values need not be
   transcoded as they are Unicode-based strings (in the case of
   bmpString, a subset of Unicode).

   TeletexString [X.680] values are transcoded to Unicode.  As there is
   no standard for mapping TeletexString values to Unicode, the mapping
   is left as a local matter.

   For these and other reasons, use of TeletexString is NOT RECOMMENDED.

   The output is the transcoded string.

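   In an implementation that receives values as raw octets, the
   transcode step could look like the following Python sketch.  The
   function name transcode, the type-name keys, and the codec chosen for
   each ASN.1 type are assumptions of this sketch.  Decoding BMPString
   as UTF-16BE is a simplification, since that codec also accepts
   surrogate pairs.  TeletexString is deliberately rejected because its
   mapping is a local matter.

```python
# Codec assumed for the octets of each ASN.1 string type (see above).
_CODECS = {
    "PrintableString": "ascii",
    "UTF8String": "utf-8",
    "BMPString": "utf-16-be",
    "UniversalString": "utf-32-be",
}


def transcode(raw: bytes, asn1_type: str) -> str:
    """Decode the octets of a string value into a Python (Unicode) str."""
    codec = _CODECS.get(asn1_type)
    if codec is None:
        # Covers TeletexString and anything else without a known mapping.
        raise ValueError("no transcoding defined for " + asn1_type)
    # A malformed value raises UnicodeDecodeError, a ValueError subclass.
    return raw.decode(codec)
```
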
2.2.  Map

   SOFT HYPHEN (U+00AD) and MONGOLIAN TODO SOFT HYPHEN (U+1806) code
   points are mapped to nothing.  COMBINING GRAPHEME JOINER (U+034F) and
   VARIATION SELECTORs (U+180B-180D, FE00-FE0F) code points are also
   mapped to nothing.  The OBJECT REPLACEMENT CHARACTER (U+FFFC) is
   mapped to nothing.

   CHARACTER TABULATION (U+0009), LINE FEED (LF) (U+000A), LINE
   TABULATION (U+000B), FORM FEED (FF) (U+000C), CARRIAGE RETURN (CR)
   (U+000D), and NEXT LINE (NEL) (U+0085) are mapped to SPACE (U+0020).

   All other control code points (e.g., Cc) or code points with a
   control function (e.g., Cf) are mapped to nothing.  The following is
   a complete list of these code points: U+0000-0008, 000E-001F, 007F-
   0084, 0086-009F, 06DD, 070F, 180E, 200C-200F, 202A-202E, 2060-2063,
   206A-206F, FEFF, FFF9-FFFB, 1D173-1D17A, E0001, E0020-E007F.

   ZERO WIDTH SPACE (U+200B) is mapped to nothing.  All other code
   points with Separator (space, line, or paragraph) property (e.g., Zs,
   Zl, or Zp) are mapped to SPACE (U+0020).  The following is a complete
   list of these code points: U+0020, 00A0, 1680, 2000-200A, 2028-2029,
   202F, 205F, 3000.

   For case ignore, numeric, and stored prefix string matching rules,
   characters are case folded per B.2 of [RFC3454].

   The output is the mapped string.

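   The mapping rules of this section can be sketched in Python as shown
   below.  The table and function names are hypothetical.  The sketch
   takes the second variation selector range to be U+FE00-FE0F, which is
   the Unicode Variation Selectors block, and it relies on
   map_table_b2() from the Python standard library "stringprep" module
   for the B.2 case folding.

```python
import stringprep


def _expand(*ranges):
    """Expand inclusive (first, last) code point pairs into a set."""
    return {cp for first, last in ranges for cp in range(first, last + 1)}


# Code points mapped to nothing, as enumerated in Section 2.2.
MAP_TO_NOTHING = _expand(
    (0x00AD, 0x00AD), (0x1806, 0x1806),      # soft hyphens
    (0x034F, 0x034F),                        # COMBINING GRAPHEME JOINER
    (0x180B, 0x180D), (0xFE00, 0xFE0F),      # variation selectors
    (0xFFFC, 0xFFFC),                        # OBJECT REPLACEMENT CHARACTER
    (0x0000, 0x0008), (0x000E, 0x001F), (0x007F, 0x0084),
    (0x0086, 0x009F), (0x06DD, 0x06DD), (0x070F, 0x070F),
    (0x180E, 0x180E), (0x200C, 0x200F), (0x202A, 0x202E),
    (0x2060, 0x2063), (0x206A, 0x206F), (0xFEFF, 0xFEFF),
    (0xFFF9, 0xFFFB), (0x1D173, 0x1D17A), (0xE0001, 0xE0001),
    (0xE0020, 0xE007F),
    (0x200B, 0x200B),                        # ZERO WIDTH SPACE
)

# Code points mapped to SPACE (U+0020).
MAP_TO_SPACE = _expand(
    (0x0009, 0x000D), (0x0085, 0x0085),      # TAB, LF, VT, FF, CR, NEL
    (0x0020, 0x0020), (0x00A0, 0x00A0), (0x1680, 0x1680),
    (0x2000, 0x200A), (0x2028, 0x2029), (0x202F, 0x202F),
    (0x205F, 0x205F), (0x3000, 0x3000),
)


def map_string(value: str, case_fold: bool = False) -> str:
    """Apply the Section 2.2 mappings; fold case only when requested."""
    out = []
    for ch in value:
        cp = ord(ch)
        if cp in MAP_TO_NOTHING:
            continue
        if cp in MAP_TO_SPACE:
            out.append(" ")
        elif case_fold:
            out.append(stringprep.map_table_b2(ch))
        else:
            out.append(ch)
    return "".join(out)
```

   The case_fold flag would be set by the caller for the matching rules
   that require folding and left unset for exact matching.
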
2.3.  Normalize

   The input string is to be normalized to Unicode Form KC
   (compatibility composed) as described in [UAX15].  The output is the
   normalized string.

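   As a minimal Python sketch, the normalization can be delegated to
   the Unicode 3.2 database object that CPython exposes as
   unicodedata.ucd_3_2_0, which matches the character repertoire named
   in Section 2.  The wrapper name normalize is hypothetical.

```python
import unicodedata

# Unicode 3.2.0 character database bundled with CPython.
_UCD_3_2 = unicodedata.ucd_3_2_0


def normalize(value: str) -> str:
    """Return the Form KC (NFKC) normalization of the mapped string."""
    return _UCD_3_2.normalize("NFKC", value)
```
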
2.4.  Prohibit

   All Unassigned code points are prohibited.  Unassigned code points
   are listed in Table A.1 of [RFC3454].

   Characters that, per Section 5.8 of [RFC3454], change display
   properties or are deprecated are prohibited.  These characters are
   listed in Table C.8 of [RFC3454].

   Private Use code points are prohibited.  These characters are listed
   in Table C.3 of [RFC3454].

   All non-character code points are prohibited.  These code points are
   listed in Table C.4 of [RFC3454].

   Surrogate codes are prohibited.  These characters are listed in Table
   C.5 of [RFC3454].

   The REPLACEMENT CHARACTER (U+FFFD) code point is prohibited.

   The step fails if the input string contains any prohibited code
   point.  Otherwise, the output is the input string.

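   The table lookups named in this section are available in the Python
   standard library "stringprep" module, so the step can be sketched as
   follows.  The exception and function names are hypothetical, and
   signalling failure with an exception is a design choice of this
   sketch.

```python
import stringprep


class ProhibitedCodePoint(ValueError):
    """Signals that the Prohibit step failed (hypothetical name)."""


# Membership tests for the RFC 3454 tables cited in Section 2.4.
_PROHIBITED_TABLES = (
    stringprep.in_table_a1,  # unassigned code points
    stringprep.in_table_c8,  # change display properties or deprecated
    stringprep.in_table_c3,  # private use
    stringprep.in_table_c4,  # non-character code points
    stringprep.in_table_c5,  # surrogate codes
)


def check_prohibited(value: str) -> str:
    """Return the input unchanged, or fail on a prohibited code point."""
    for ch in value:
        if ch == "\ufffd" or any(test(ch) for test in _PROHIBITED_TABLES):
            raise ProhibitedCodePoint("U+%04X is prohibited" % ord(ch))
    return value
```
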
2.5.  Check bidi

   Bidirectional characters are ignored.

2.6.  Insignificant Character Handling

   In this step, the string is modified to ensure proper handling of
   characters insignificant to the matching rule.  This modification
   differs from matching rule to matching rule.

   Section 2.6.1 applies to case ignore and exact string matching.
   Section 2.6.2 applies to numericString matching.
   Section 2.6.3 applies to telephoneNumber matching.

2.6.1.  Insignificant Space Handling

   For the purposes of this section, a space is defined to be the SPACE
   (U+0020) code point followed by no combining marks.

       NOTE - The previous steps ensure that the string cannot contain
              any code points in the separator class, other than SPACE
              (U+0020).

   For input strings that are attribute values or non-substring
   assertion values:  If the input string contains no non-space
   character, then the output is exactly two SPACEs.  Otherwise (the
   input string contains at least one non-space character), the string
   is modified such that the string starts with exactly one SPACE
   character, ends with exactly one SPACE character, and any inner
   (non-empty) sequence of space characters is replaced with exactly two
   SPACE characters.  For instance, the input string
   "foo<SPACE>bar<SPACE><SPACE>" results in the output
   "<SPACE>foo<SPACE><SPACE>bar<SPACE>".

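   The rule for attribute values and non-substring assertion values can
   be sketched in Python as below.  The function names are hypothetical,
   and the combining mark test uses the general categories Mn, Mc, and
   Me from CPython's Unicode 3.2 database rather than a hand-built list.

```python
import unicodedata

_UCD_3_2 = unicodedata.ucd_3_2_0


def _is_combining_mark(ch: str) -> bool:
    return _UCD_3_2.category(ch) in ("Mn", "Mc", "Me")


def handle_insignificant_spaces(value: str) -> str:
    """Section 2.6.1 handling for non-substring values."""
    words = []      # maximal runs of non-space characters
    current = []
    for i, ch in enumerate(value):
        followed_by_mark = (i + 1 < len(value)
                            and _is_combining_mark(value[i + 1]))
        # A SPACE that carries a combining mark is not a "space" here.
        if ch == " " and not followed_by_mark:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    if not words:
        return "  "  # no non-space character: exactly two SPACEs
    # One leading SPACE, one trailing SPACE, two between inner runs.
    return " " + "  ".join(words) + " "
```
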
   For input strings that are substring assertion values: If the string
   being prepared contains no non-space characters, then the output
   string is exactly one SPACE.  Otherwise, the following steps are
   taken:

   -  If the input string is an initial substring, it is modified to
      start with exactly one SPACE character;

   -  If the input string is an initial or an any substring that ends in
      one or more space characters, it is modified to end with exactly
      one SPACE character;

   -  If the input string is an any or a final substring that starts in
      one or more space characters, it is modified to start with exactly
      one SPACE character; and

   -  If the input string is a final substring, it is modified to end
      with exactly one SPACE character.

   For instance, for the input string "foo<SPACE>bar<SPACE><SPACE>" as
   an initial substring, the output would be
   "<SPACE>foo<SPACE><SPACE>bar<SPACE>".  As an any or final substring,
   the same input would result in "foo<SPACE>bar<SPACE>".

   Appendix B discusses the rationale for the behavior.

2.6.2.  numericString Insignificant Character Handling

   For the purposes of this section, a space is defined to be the SPACE
   (U+0020) code point followed by no combining marks.

   All spaces are regarded as insignificant and are to be removed.

   For example, removal of spaces from the Form KC string:
       "<SPACE><SPACE>123<SPACE><SPACE>456<SPACE><SPACE>"
   would result in the output string:
       "123456"
   and the Form KC string:
       "<SPACE><SPACE><SPACE>"
   would result in the output string:
       "" (an empty string).

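   A minimal Python sketch of this removal follows.  The function names
   are hypothetical, and the combining mark test relies on the general
   categories Mn, Mc, and Me in CPython's Unicode 3.2 database.

```python
import unicodedata

_UCD_3_2 = unicodedata.ucd_3_2_0


def _is_combining_mark(ch: str) -> bool:
    return _UCD_3_2.category(ch) in ("Mn", "Mc", "Me")


def strip_numeric_spaces(value: str) -> str:
    """Remove every SPACE that is not followed by a combining mark."""
    out = []
    for i, ch in enumerate(value):
        followed_by_mark = (i + 1 < len(value)
                            and _is_combining_mark(value[i + 1]))
        if ch == " " and not followed_by_mark:
            continue  # insignificant space
        out.append(ch)
    return "".join(out)
```
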
2.6.3.  telephoneNumber Insignificant Character Handling

   For the purposes of this section, a hyphen is defined to be a
   HYPHEN-MINUS (U+002D), ARMENIAN HYPHEN (U+058A), HYPHEN (U+2010),
   NON-BREAKING HYPHEN (U+2011), MINUS SIGN (U+2212), SMALL HYPHEN-MINUS
   (U+FE63), or FULLWIDTH HYPHEN-MINUS (U+FF0D) code point followed by
   no combining marks and a space is defined to be the SPACE (U+0020)
   code point followed by no combining marks.

   All hyphens and spaces are considered insignificant and are to be
   removed.

   For example, removal of hyphens and spaces from the Form KC string:
       "<SPACE><HYPHEN>123<SPACE><SPACE>456<SPACE><HYPHEN>"
   would result in the output string:
       "123456"
   and the Form KC string:
       "<HYPHEN><HYPHEN><HYPHEN>"
   would result in the (empty) output string:
       "".

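   The same approach extends to telephone numbers, as in the Python
   sketch below.  The names are hypothetical.  The hyphen set is copied
   from the definition in this section, and the combining mark test
   uses the general categories Mn, Mc, and Me in CPython's Unicode 3.2
   database.

```python
import unicodedata

_UCD_3_2 = unicodedata.ucd_3_2_0

# SPACE plus the hyphen code points listed in Section 2.6.3.
_INSIGNIFICANT = frozenset(
    " \u002d\u058a\u2010\u2011\u2212\ufe63\uff0d"
)


def _is_combining_mark(ch: str) -> bool:
    return _UCD_3_2.category(ch) in ("Mn", "Mc", "Me")


def strip_telephone_separators(value: str) -> str:
    """Remove hyphens and spaces that carry no combining mark."""
    out = []
    for i, ch in enumerate(value):
        followed_by_mark = (i + 1 < len(value)
                            and _is_combining_mark(value[i + 1]))
        if ch in _INSIGNIFICANT and not followed_by_mark:
            continue  # insignificant hyphen or space
        out.append(ch)
    return "".join(out)
```
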
3.  Security Considerations

   "Preparation of Internationalized Strings ("stringprep")" [RFC3454]
   security considerations generally apply to the algorithms described
   here.

4.  Acknowledgements

   The approach used in this document is based upon design principles
   and algorithms described in "Preparation of Internationalized Strings
   ('stringprep')" [RFC3454] by Paul Hoffman and Marc Blanchet.  Some
   additional guidance was drawn from Unicode Technical Standards,
   Technical Reports, and Notes.

   This document is a product of the IETF LDAP Revision (LDAPBIS)
   Working Group.

5.  References

5.1.  Normative References

   [RFC2119]     Bradner, S., "Key words for use in RFCs to Indicate
                 Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3454]     Hoffman, P. and M. Blanchet, "Preparation of
                 Internationalized Strings ("stringprep")", RFC 3454,
                 December 2002.

   [RFC4510]     Zeilenga, K., "Lightweight Directory Access Protocol
                 (LDAP): Technical Specification Road Map", RFC 4510,
                 June 2006.

   [RFC4517]     Legg, S., Ed., "Lightweight Directory Access Protocol
                 (LDAP): Syntaxes and Matching Rules", RFC 4517, June
                 2006.

   [Unicode]     The Unicode Consortium, "The Unicode Standard, Version
                 3.2.0" is defined by "The Unicode Standard, Version
                 3.0" (Reading, MA, Addison-Wesley, 2000.  ISBN 0-201-
                 61633-5), as amended by the "Unicode Standard Annex
                 #27: Unicode 3.1"
                 (http://www.unicode.org/reports/tr27/) and by the
                 "Unicode Standard Annex #28: Unicode 3.2"
                 (http://www.unicode.org/reports/tr28/).

   [UAX15]       Davis, M. and M. Duerst, "Unicode Standard Annex #15:
                 Unicode Normalization Forms, Version 3.2.0".
                 <http://www.unicode.org/unicode/reports/tr15/tr15-
                 22.html>, March 2002.

   [X.680]       International Telecommunication Union -
                 Telecommunication Standardization Sector, "Abstract
                 Syntax Notation One (ASN.1) - Specification of Basic
                 Notation", X.680(2002) (also ISO/IEC 8824-1:2002).

5.2.  Informative References

   [X.500]       International Telecommunication Union -
                 Telecommunication Standardization Sector, "The
                 Directory -- Overview of concepts, models and
                 services," X.500(1993) (also ISO/IEC 9594-1:1994).

   [X.501]       International Telecommunication Union -
                 Telecommunication Standardization Sector, "The
                 Directory -- Models," X.501(1993) (also ISO/IEC 9594-
                 2:1994).

   [X.520]       International Telecommunication Union -
                 Telecommunication Standardization Sector, "The
                 Directory: Selected Attribute Types", X.520(1993) (also
                 ISO/IEC 9594-6:1994).

   [Glossary]    The Unicode Consortium, "Unicode Glossary",
                 <http://www.unicode.org/glossary/>.

   [CharModel]   Whistler, K. and M. Davis, "Unicode Technical Report
                 #17, Character Encoding Model", UTR17,
                 <http://www.unicode.org/unicode/reports/tr17/>, August
                 2000.

503*ae771770SStanislav Sedov
504*ae771770SStanislav Sedov
505*ae771770SStanislav Sedov
506*ae771770SStanislav SedovZeilenga                    Standards Track                     [Page 9]
507*ae771770SStanislav Sedov
508*ae771770SStanislav SedovRFC 4518       LDAP: Internationalized String Preparation      June 2006
509*ae771770SStanislav Sedov
510*ae771770SStanislav Sedov
511*ae771770SStanislav Sedov   [RFC3377]     Hodges, J. and R. Morgan, "Lightweight Directory Access
512*ae771770SStanislav Sedov                 Protocol (v3): Technical Specification", RFC 3377,
513*ae771770SStanislav Sedov                 September 2002.
514*ae771770SStanislav Sedov
515*ae771770SStanislav Sedov   [RFC4515]     Smith, M., Ed. and T. Howes, "Lightweight Directory
516*ae771770SStanislav Sedov                 Access Protocol (LDAP): String Representation of Search
517*ae771770SStanislav Sedov                 Filters", RFC 4515, June 2006.
518*ae771770SStanislav Sedov
519*ae771770SStanislav Sedov   [XMATCH]      Zeilenga, K., "Internationalized String Matching Rules
520*ae771770SStanislav Sedov                 for X.500", Work in Progress.

Appendix A.  Combining Marks

   This appendix is normative.

   This table was derived from Unicode [Unicode] data files; it lists
   all code points with the Mn, Mc, or Me properties.  This table is to
   be considered definitive for the purposes of implementation of this
   specification.

         0300-034F 0360-036F 0483-0486 0488-0489 0591-05A1
         05A3-05B9 05BB-05BC 05BF 05C1-05C2 05C4 064B-0655 0670
         06D6-06DC 06DE-06E4 06E7-06E8 06EA-06ED 0711 0730-074A
         07A6-07B0 0901-0903 093C 093E-094F 0951-0954 0962-0963
         0981-0983 09BC 09BE-09C4 09C7-09C8 09CB-09CD 09D7
         09E2-09E3 0A02 0A3C 0A3E-0A42 0A47-0A48 0A4B-0A4D
         0A70-0A71 0A81-0A83 0ABC 0ABE-0AC5 0AC7-0AC9 0ACB-0ACD
         0B01-0B03 0B3C 0B3E-0B43 0B47-0B48 0B4B-0B4D 0B56-0B57
         0B82 0BBE-0BC2 0BC6-0BC8 0BCA-0BCD 0BD7 0C01-0C03
         0C3E-0C44 0C46-0C48 0C4A-0C4D 0C55-0C56 0C82-0C83
         0CBE-0CC4 0CC6-0CC8 0CCA-0CCD 0CD5-0CD6 0D02-0D03
         0D3E-0D43 0D46-0D48 0D4A-0D4D 0D57 0D82-0D83 0DCA
         0DCF-0DD4 0DD6 0DD8-0DDF 0DF2-0DF3 0E31 0E34-0E3A
         0E47-0E4E 0EB1 0EB4-0EB9 0EBB-0EBC 0EC8-0ECD 0F18-0F19
         0F35 0F37 0F39 0F3E-0F3F 0F71-0F84 0F86-0F87 0F90-0F97
         0F99-0FBC 0FC6 102C-1032 1036-1039 1056-1059 1712-1714
         1732-1734 1752-1753 1772-1773 17B4-17D3 180B-180D 18A9
         20D0-20EA 302A-302F 3099-309A FB1E FE00-FE0F FE20-FE23
         1D165-1D169 1D16D-1D172 1D17B-1D182 1D185-1D18B
         1D1AA-1D1AD
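
   As a non-authoritative illustration, the table can be loaded into a
   sorted list of ranges and queried with a binary search.  The names
   TABLE, parse_ranges, and is_combining_mark are hypothetical and not
   part of this specification.  Because the table is definitive, the
   sketch deliberately does not consult a Unicode library, whose data
   may reflect a different Unicode version.

```python
from bisect import bisect_right

# Appendix A table, copied verbatim (hexadecimal code points and ranges).
TABLE = """
0300-034F 0360-036F 0483-0486 0488-0489 0591-05A1
05A3-05B9 05BB-05BC 05BF 05C1-05C2 05C4 064B-0655 0670
06D6-06DC 06DE-06E4 06E7-06E8 06EA-06ED 0711 0730-074A
07A6-07B0 0901-0903 093C 093E-094F 0951-0954 0962-0963
0981-0983 09BC 09BE-09C4 09C7-09C8 09CB-09CD 09D7
09E2-09E3 0A02 0A3C 0A3E-0A42 0A47-0A48 0A4B-0A4D
0A70-0A71 0A81-0A83 0ABC 0ABE-0AC5 0AC7-0AC9 0ACB-0ACD
0B01-0B03 0B3C 0B3E-0B43 0B47-0B48 0B4B-0B4D 0B56-0B57
0B82 0BBE-0BC2 0BC6-0BC8 0BCA-0BCD 0BD7 0C01-0C03
0C3E-0C44 0C46-0C48 0C4A-0C4D 0C55-0C56 0C82-0C83
0CBE-0CC4 0CC6-0CC8 0CCA-0CCD 0CD5-0CD6 0D02-0D03
0D3E-0D43 0D46-0D48 0D4A-0D4D 0D57 0D82-0D83 0DCA
0DCF-0DD4 0DD6 0DD8-0DDF 0DF2-0DF3 0E31 0E34-0E3A
0E47-0E4E 0EB1 0EB4-0EB9 0EBB-0EBC 0EC8-0ECD 0F18-0F19
0F35 0F37 0F39 0F3E-0F3F 0F71-0F84 0F86-0F87 0F90-0F97
0F99-0FBC 0FC6 102C-1032 1036-1039 1056-1059 1712-1714
1732-1734 1752-1753 1772-1773 17B4-17D3 180B-180D 18A9
20D0-20EA 302A-302F 3099-309A FB1E FE00-FE0F FE20-FE23
1D165-1D169 1D16D-1D172 1D17B-1D182 1D185-1D18B
1D1AA-1D1AD
"""


def parse_ranges(text):
    """Turn 'XXXX' and 'XXXX-YYYY' tokens into sorted (low, high) pairs."""
    ranges = []
    for token in text.split():
        low, _, high = token.partition("-")
        ranges.append((int(low, 16), int(high or low, 16)))
    return sorted(ranges)


RANGES = parse_ranges(TABLE)
STARTS = [low for low, _ in RANGES]


def is_combining_mark(code_point):
    """Return True if code_point falls inside one of the table's ranges."""
    # Find the last range whose start is <= code_point, then check its end.
    i = bisect_right(STARTS, code_point) - 1
    return i >= 0 and code_point <= RANGES[i][1]


print(is_combining_mark(0x0301))  # COMBINING ACUTE ACCENT, in 0300-034F
print(is_combining_mark(0x0041))  # LATIN CAPITAL LETTER A, not listed
```

   The binary search relies on the ranges being disjoint, as they are in
   the table; a linear scan over RANGES would serve equally well for a
   table of this size.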

Appendix B.  Substrings Matching

   This appendix is non-normative.

   In the absence of substrings matching, the insignificant space
   handling for case ignore/exact matching could be simplified.
   Specifically, the handling could require that every sequence of one
   or more spaces be replaced with one space and that, if the string
   contains non-space characters, all leading and trailing spaces be
   removed.
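
   A minimal sketch of this simplified handling follows.  It assumes
   that a space is a plain U+0020 code point; the function name
   simplified_prepare is hypothetical.  This is the handling the
   appendix argues against, shown only so its behavior is concrete.

```python
import re


def simplified_prepare(s):
    """Simplified space handling: collapse runs, then trim the ends."""
    # Replace every sequence of one or more spaces with one space.
    s = re.sub(r" +", " ", s)
    # Trim leading/trailing spaces only if a non-space character remains.
    if s.strip(" "):
        s = s.strip(" ")
    return s


print(repr(simplified_prepare("  foo   bar ")))  # 'foo bar'
print(repr(simplified_prepare("   ")))           # ' '
```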

   In the presence of substrings matching, this simplified space
   handling would lead to unexpected and undesirable matching behavior.
   For instance:

   1) (CN=foo\20*\20bar) would match the CN value "foobar";

   2) (CN=*\20foobar\20*) would match "foobar", but
      (CN=*\20*foobar*\20*) would not.

   Note to readers not familiar with LDAP substrings matching: the LDAP
   filter [RFC4515] assertion (CN=A*B*C) says to "match any value (of
   the attribute CN) that begins with A, contains B after A, and ends
   with C, where C is also after B."

   The first case illustrates that this simplified space handling would
   cause leading and trailing spaces in substrings of the string to be
   regarded as insignificant.  However, only leading and trailing spaces
   (as well as multiple consecutive spaces) of the string as a whole are
   insignificant.

   The second case illustrates that this simplified space handling would
   cause sub-partitioning failures.  That is, if a prepared "any"
   substring matches a partition of the attribute value, then an
   assertion constructed by subdividing that substring into multiple
   substrings should also match; under the simplified handling, it does
   not.
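
   The two cases can be reproduced with a small sketch.  It assumes that
   the simplified handling is applied independently to the attribute
   value and to each substring of the assertion, and that the prepared
   strings are then compared character by character.  All function
   names are hypothetical, and the assertion components are passed
   already unescaped (the filter escape \20 is written as a space).

```python
import re


def simplified_prepare(s):
    # Collapse runs of spaces; trim the ends only if something remains.
    s = re.sub(r" +", " ", s)
    return s.strip(" ") or s


def substrings_match(value, initial=None, anys=(), final=None):
    """Literal substrings match: portions must appear in order, no overlap."""
    pos, end = 0, len(value)
    if initial is not None:
        if not value.startswith(initial):
            return False
        pos = len(initial)
    if final is not None:
        end = len(value) - len(final)
        if end < pos or not value.endswith(final):
            return False
    for part in anys:
        idx = value.find(part, pos, end)
        if idx < 0:
            return False
        pos = idx + len(part)
    return True


def naive_match(value, initial=None, anys=(), final=None):
    """Prepare every string with the simplified handling, then match."""
    prep = simplified_prepare
    return substrings_match(
        prep(value),
        None if initial is None else prep(initial),
        [prep(a) for a in anys],
        None if final is None else prep(final),
    )


# Case 1: (CN=foo\20*\20bar) against "foobar"
print(naive_match("foobar", initial="foo ", final=" bar"))   # True

# Case 2: (CN=*\20foobar\20*) and (CN=*\20*foobar*\20*) against "foobar"
print(naive_match("foobar", anys=[" foobar "]))              # True
print(naive_match("foobar", anys=[" ", "foobar", " "]))      # False
```

   In case 1 the spaces inside the assertion are trimmed away, so the
   assertion degenerates to (CN=foo*bar).  In case 2 the single
   substring loses its spaces while the subdivided form keeps them,
   which is the sub-partitioning failure.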

   In designing an appropriate approach for space handling for
   substrings matching, one must study key aspects of X.500 case
   exact/ignore matching.  X.520 [X.520] says:

      The [substrings] rule returns TRUE if there is a partitioning of
      the attribute value (into portions) such that:

         -  the specified substrings (initial, any, final) match
            different portions of the value in the order of the strings
            sequence;

         -  initial, if present, matches the first portion of the value;

         -  final, if present, matches the last portion of the value;

         -  any, if present, matches some arbitrary portion of the
            value.

   That is, the substrings assertion (CN=foo\20*\20bar) matches the
   attribute value "foo<SPACE><SPACE>bar" as the value can be
   partitioned into the portions "foo<SPACE>" and "<SPACE>bar" meeting
   the above requirements.
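
   The partitioning rule can be sketched as a left-to-right scan over
   the value.  This is an illustration rather than the X.520 algorithm
   itself: the function name substrings_match is hypothetical, portions
   are compared literally, and no space handling is applied.

```python
def substrings_match(value, initial=None, anys=(), final=None):
    """Return True if value can be partitioned so that initial, each
    'any' substring, and final match distinct portions, in order."""
    pos, end = 0, len(value)
    if initial is not None:
        # initial must match the first portion of the value.
        if not value.startswith(initial):
            return False
        pos = len(initial)
    if final is not None:
        # final must match the last portion, without overlapping initial.
        end = len(value) - len(final)
        if end < pos or not value.endswith(final):
            return False
    for part in anys:
        # Each 'any' must occur after the previous portion, before final.
        idx = value.find(part, pos, end)
        if idx < 0:
            return False
        pos = idx + len(part)
    return True


# (CN=foo\20*\20bar): initial "foo<SPACE>", final "<SPACE>bar"
print(substrings_match("foo  bar", initial="foo ", final=" bar"))  # True
print(substrings_match("foo bar", initial="foo ", final=" bar"))   # False
```

   Taking the leftmost occurrence of each "any" substring leaves the
   most room for the substrings that follow, so the scan finds a
   partitioning whenever one exists.  The second call fails under
   literal comparison because both portions would need the same single
   space.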

   X.520 also says:

      [T]he following spaces are regarded as not significant:

         -  leading spaces (i.e., those preceding the first character
            that is not a space);

         -  trailing spaces (i.e., those following the last character
            that is not a space);

         -  multiple consecutive spaces (these are taken as equivalent
            to a single space character).

   This statement applies to the assertion values and attribute values
   as whole strings, and not individually to substrings of an assertion
   value.  In particular, the statements should be taken to mean that if
   an assertion value and attribute value match without any
   consideration of insignificant characters, then that assertion value
   should also match any attribute value that differs only by inclusion
   or removal of insignificant characters.

   Hence the assertion (CN=foo\20*\20bar) matches
   "foo<SPACE><SPACE><SPACE>bar" and "foo<SPACE>bar" as these values
   only differ from "foo<SPACE><SPACE>bar" by the inclusion or removal
   of insignificant spaces.

   Astute readers of this text will also note that there are special
   cases where the specified space handling does not ignore spaces that
   could be considered insignificant.  For instance, the assertion
   (CN=\20*\20*\20) does not match "<SPACE><SPACE><SPACE>"
   (insignificant spaces present in value) or "<SPACE>" (insignificant
   spaces not present in value).  However, as these cases have no
   practical application that cannot be met by simple assertions, e.g.,
   (cn=\20), and this minor anomaly can only be fully addressed by a
   preparation algorithm to be used in conjunction with
   character-by-character partitioning and matching, the anomaly is
   considered acceptable.
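
   The behavior described in this appendix can be checked with the
   sketch below.  Its preparation rules are an informal paraphrase of
   the insignificant space handling specified in the body of this
   document, which remains authoritative; they are an assumption here
   and should be verified against that text.  The sketch treats a space
   as a bare U+0020 code point, ignores combining marks, and uses
   hypothetical function names throughout.

```python
import re


def prepare_value(s):
    """Attribute value or non-substring assertion value (assumed rules)."""
    core = s.strip(" ")
    if not core:
        return "  "  # no non-space characters: exactly two spaces
    # One leading space, one trailing space, inner runs become two spaces.
    return " " + re.sub(r" +", "  ", core) + " "


def prepare_substring(s, kind):
    """Substring assertion value; kind is 'initial', 'any', or 'final'."""
    core = s.strip(" ")
    if not core:
        return " "  # no non-space characters: exactly one space
    out = re.sub(r" +", "  ", core)
    if kind == "initial" or s.startswith(" "):
        out = " " + out
    if kind == "final" or s.endswith(" "):
        out = out + " "
    return out


def substrings_match(value, initial=None, anys=(), final=None):
    pos, end = 0, len(value)
    if initial is not None:
        if not value.startswith(initial):
            return False
        pos = len(initial)
    if final is not None:
        end = len(value) - len(final)
        if end < pos or not value.endswith(final):
            return False
    for part in anys:
        idx = value.find(part, pos, end)
        if idx < 0:
            return False
        pos = idx + len(part)
    return True


def match(value, initial=None, anys=(), final=None):
    """Prepare the value and each substring, then match literally."""
    return substrings_match(
        prepare_value(value),
        None if initial is None else prepare_substring(initial, "initial"),
        [prepare_substring(a, "any") for a in anys],
        None if final is None else prepare_substring(final, "final"),
    )


# (CN=foo\20*\20bar): one, two, or three spaces match; "foobar" does not.
for value in ("foo bar", "foo  bar", "foo   bar", "foobar"):
    print(repr(value), match(value, initial="foo ", final=" bar"))

# Sub-partitioning holds: both forms match "foobar".
print(match("foobar", anys=[" foobar "]))
print(match("foobar", anys=[" ", "foobar", " "]))

# The anomaly: (CN=\20*\20*\20) does not match an all-space value,
# while the simple assertion (cn=\20) does.
print(match("   ", initial=" ", anys=[" "], final=" "))
print(prepare_value(" ") == prepare_value("   "))
```

   Under these assumed rules the first three values print True and
   "foobar" prints False, both sub-partitioning checks print True, the
   anomalous assertion prints False, and the final equality comparison
   prints True.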

Author's Address

   Kurt D. Zeilenga
   OpenLDAP Foundation

   EMail: Kurt@OpenLDAP.org

Full Copyright Statement

   Copyright (C) The Internet Society (2006).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).