The utf8data.h file in this directory is generated from the Unicode
Character Database for version 12.1.0 of the Unicode standard.

The full set of files can be found here:

  http://www.unicode.org/Public/12.1.0/ucd/

Note!

The URLs listed below are not stable.  That's because Unicode 12.1.0
has not been officially released yet; it is scheduled to be released
on May 8, 2019.  We are taking Unicode 12.1.0 a few weeks early because it
contains a new Japanese character which is required in order to
specify Japanese dates after May 1, 2019, when Crown Prince Naruhito
ascends to the Chrysanthemum Throne.  (Isn't internationalization fun?
The abdication of Emperor Akihito of Japan is requiring dozens of
software packages to be updated with only a month's notice.  :-)

We will update the URLs (and make any needed changes to the checksums)
after the final Unicode 12.1.0 is released.

Individual source links:

  https://www.unicode.org/Public/12.1.0/ucd/CaseFolding-12.1.0d2.txt
  https://www.unicode.org/Public/12.1.0/ucd/DerivedAge-12.1.0d3.txt
  https://www.unicode.org/Public/12.1.0/ucd/extracted/DerivedCombiningClass-12.1.0d2.txt
  https://www.unicode.org/Public/12.1.0/ucd/DerivedCoreProperties-12.1.0d2.txt
  https://www.unicode.org/Public/12.1.0/ucd/NormalizationCorrections-12.1.0d1.txt
  https://www.unicode.org/Public/12.1.0/ucd/NormalizationTest-12.1.0d3.txt
  https://www.unicode.org/Public/12.1.0/ucd/UnicodeData-12.1.0d2.txt
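
The checksum lists in this file name the files without a version
suffix (CaseFolding.txt, not CaseFolding-12.1.0d2.txt), so the draft
files have to be saved under the unversioned names before they can be
verified.  A minimal sketch of one way to do that, assuming curl is
installed; the helper names ucd_local_name and fetch_ucd are
hypothetical and are not part of the kernel tree:

```shell
# Hypothetical helper: map a draft file name such as
# "CaseFolding-12.1.0d2.txt" to the unversioned "CaseFolding.txt".
# Names that carry no version suffix are printed unchanged.
ucd_local_name() {
    printf '%s\n' "$1" |
        sed -E 's/-[0-9]+(\.[0-9]+)*(d[0-9]+)?\.txt$/.txt/'
}

# Hypothetical helper: download one URL into the current directory
# under its unversioned name.  Assumes curl is available.
fetch_ucd() {
    url=$1
    # ${url##*/} strips everything up to the last "/", leaving the file name.
    curl -fsSL -o "$(ucd_local_name "${url##*/}")" "$url"
}
```

For example, running "fetch_ucd" on the CaseFolding-12.1.0d2.txt URL
above would save the file as CaseFolding.txt in the current directory.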

md5sums (verify by running "md5sum -c README.utf8data"):

  900e76da1d822a160fd6b8c0b1d70094  CaseFolding.txt
  131256380bff4fea8ad4a851616f2f10  DerivedAge.txt
  e731a4089b30002144e107e3d6f8d1fa  DerivedCombiningClass.txt
  a47c9fbd7ff92a9b261ba9831e68778a  DerivedCoreProperties.txt
  fcab6dad15e440879d92f315978f93d3  NormalizationCorrections.txt
  f9ff1c55a60decf436100f791b44aa98  NormalizationTest.txt
  755f6af699f8c8d2d958da411f78f6c6  UnicodeData.txt

sha1sums (verify by running "sha1sum -c README.utf8data"):

  dc9245f6803c4ac99555c361f5052e0b13eb779b  CaseFolding.txt
  3281104f237184cdb5d869e86eb8573678ada7da  DerivedAge.txt
  2f5f995ccb96e0fa84b15151b35d5e2681535175  DerivedCombiningClass.txt
  5b8698a3fcd5018e1987f296b02e2c17e696415e  DerivedCoreProperties.txt
  cd83935fbc012345d8792d2c704f69497e753835  NormalizationCorrections.txt
  ea419aae505b337b0d99a83fa83fe58ddff7c19f  NormalizationTest.txt
  dc973c0fc93d6f09d9ab9f70d1c9f89c447f0526  UnicodeData.txt

To update to a newer version of the Unicode standard, the latest
released version of the UCD can be found here:

  http://www.unicode.org/Public/UCD/latest/

To build the utf8data.h file, from a kernel tree that has been built,
cd to this directory (fs/unicode) and run this command:

	make C=../.. objdir=../.. utf8data.h.new

After sanity checking the newly generated utf8data.h.new file (the
version generated from the 12.1.0 UCD should be 4,109 lines long, and
have a total size of 324k) and/or comparing it with the older version
of utf8data.h, rename it to utf8data.h.
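
The line-count part of that sanity check could be scripted along the
following lines.  This is only a sketch: check_line_count is a
hypothetical helper, and it assumes that the 4,109 figure corresponds
to what "wc -l" reports for the file.

```shell
# Hypothetical helper: succeed only if FILE has exactly EXPECTED lines.
check_line_count() {
    file=$1
    expected=$2
    # Some wc implementations pad the count with spaces; strip them.
    actual=$(wc -l < "$file" | tr -d ' ')
    if [ "$actual" -eq "$expected" ]; then
        echo "ok: $file has $actual lines"
    else
        echo "mismatch: $file has $actual lines, expected $expected" >&2
        return 1
    fi
}
```

Running "check_line_count utf8data.h.new 4109", then "ls -lh
utf8data.h.new" for the size and "diff -u utf8data.h utf8data.h.new"
for the comparison, would cover the checks described above.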

If you are a kernel developer updating to a newer version of the
Unicode Character Database, please update this README.utf8data file
with the version of the UCD that was used and the md5sums and sha1sums
of the *.txt files before checking in the new versions of the
utf8data.h and README.utf8data files.
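
When refreshing the checksum lists, lines in the same two-space
indented layout used in this file can be produced with something like
the following.  A sketch only: ucd_checksums is a hypothetical helper,
and it assumes the md5sum and sha1sum tools from GNU coreutils.

```shell
# Hypothetical helper: run a checksum tool (md5sum or sha1sum) over the
# given files and indent each output line by two spaces, matching the
# layout of the lists in this README.
ucd_checksums() {
    tool=$1
    shift
    "$tool" "$@" | sed 's/^/  /'
}
```

For example, "ucd_checksums md5sum *.txt" and "ucd_checksums sha1sum
*.txt", run in the directory holding the downloaded files, print lines
that can be pasted into the two lists.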