/freebsd/usr.bin/sort/tests/ |
H A D | sort_monthsort_test.sh | 3d44dce90a6946e2ef2ab30ffbf8e2930acf888b Fri Dec 01 01:30:10 CET 2023 Christos Margiolis <christos@FreeBSD.org> sort: test against all month formats in month-sort
The CLDR specification [1] defines three possible month formats:
- Abbreviation (e.g. Jan, Ιαν)
- Full (e.g. January, Ιανουαρίου)
- Standalone (e.g. January, Ιανουάριος)
Many languages use different case endings depending on whether the month is referenced as a standalone word (nominative case), or in date context (genitive, partitive, etc.). sort(1)'s -M option currently sorts months by testing input against only the abbreviation format, which is essentially a substring of the full format. This works fine for languages like English, which have no cases, but it is not sufficient for languages where the standalone format has a different case ending from the abbreviation and full formats.
For example, in Greek, "May" can take the following forms:
Abbreviation: Μαΐ (genitive case)
Full: Μαΐου (genitive case)
Standalone: Μάιος (nominative case)
If we use the standalone format in Greek, sort(1) will not be able to match "Μαΐ" to "Μάιος" and the sort will fail.
This change makes sort(1) test against all three formats. It also works when the input contains mixed formats.
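The difference between the two behaviours can be sketched in C as follows. This is only an illustration, not the code in bwstring.c: the `month_forms` table and the functions `has_prefix`, `month_abbr_only` and `month_any_format` are hypothetical names, the table holds just the two Greek months quoted above, and matching is a plain byte-wise prefix comparison on UTF-8 strings with no blank skipping or case folding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: the three CLDR forms of one month. */
struct month_forms {
	int num;		/* month number, 1..12 */
	const char *abbr;	/* abbreviation */
	const char *full;	/* full form, used in date context */
	const char *alt;	/* standalone form */
};

/* Illustrative data only: the Greek forms quoted in the text above. */
static const struct month_forms months[] = {
	{ 1, "Ιαν", "Ιανουαρίου", "Ιανουάριος" },
	{ 5, "Μαΐ", "Μαΐου", "Μάιος" },
};

#define NMONTHS (sizeof(months) / sizeof(months[0]))

/* Return non-zero if s starts with prefix (byte-wise comparison). */
static int
has_prefix(const char *s, const char *prefix)
{
	return (strncmp(s, prefix, strlen(prefix)) == 0);
}

/* Match against the abbreviation only; return -1 if nothing matches. */
static int
month_abbr_only(const char *s)
{
	assert(s != NULL);
	for (size_t i = 0; i < NMONTHS; i++)
		if (has_prefix(s, months[i].abbr))
			return (months[i].num);
	return (-1);
}

/* Match against all three formats; return -1 if nothing matches. */
static int
month_any_format(const char *s)
{
	assert(s != NULL);
	for (size_t i = 0; i < NMONTHS; i++) {
		if (has_prefix(s, months[i].abbr) ||
		    has_prefix(s, months[i].full) ||
		    has_prefix(s, months[i].alt))
			return (months[i].num);
	}
	return (-1);
}
```

With this table, `month_abbr_only("Μάιος")` finds no match because "Μαΐ" is not a prefix of "Μάιος", while `month_any_format("Μάιος")` matches the standalone entry and returns 5. Inputs that mix formats, such as "Μαΐου" and "Μάιος", resolve to the same month number.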
[1] https://cldr.unicode.org/translation/date-time/date-time-patterns
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42847
|
H A D | Makefile | diff 3d44dce90a6946e2ef2ab30ffbf8e2930acf888b Fri Dec 01 01:30:10 CET 2023 Christos Margiolis <christos@FreeBSD.org> sort: test against all month formats in month-sort
|
/freebsd/usr.bin/sort/ |
H A D | sort.1.in | diff 3d44dce90a6946e2ef2ab30ffbf8e2930acf888b Fri Dec 01 01:30:10 CET 2023 Christos Margiolis <christos@FreeBSD.org> sort: test against all month formats in month-sort
|
H A D | bwstring.c | diff 3d44dce90a6946e2ef2ab30ffbf8e2930acf888b Fri Dec 01 01:30:10 CET 2023 Christos Margiolis <christos@FreeBSD.org> sort: test against all month formats in month-sort
|