The description of the UpperCaseU function is:
/// fast conversion of the supplied text into 8-bit uppercase
// - this will not only convert 'a'..'z' into 'A'..'Z', but also accentuated
// latin characters ('e' acute into 'E' e.g.), using NormToUpper[] array
// - it will therefore decode the supplied UTF-8 content to handle more than
// 7-bit of ascii characters (so this function is dedicated to WinAnsi code page
// 1252 characters set)
Here is an example, using the following test:
var
  ul, uh: RawUtf8;
begin
  // Windows-1252 character set
  // Ordinal numbers (decimal), https://de.wikipedia.org/wiki/Windows-1252
  // - é (lower case): 233
  // - É (upper case): 201
  ul := UTF8Encode('étudiant');
  uh := UTF8Encode('Étudiant');
  // Result: ETUDIANT, Expected: ÉTUDIANT
  ShowMessage(Utf8ToString(mormot.core.unicode.UpperCaseU(ul)));
  // Result: etudiant, Expected: étudiant
  ShowMessage(Utf8ToString(mormot.core.unicode.LowerCaseU(uh)));
end;
Am I misunderstanding the description, misinterpreting the name of the function, or is my expectation simply wrong?
With best regards
Thomas
However, I would recommend renaming the functions to something like NormalizeToUpperU/NormalizeToLowerU, because the names UpperCaseU/LowerCaseU closely resemble the original Delphi functions, which behave differently.
@sakura
Those functions have existed in mORMot for more than 10 years, so renaming them may not be an option.
They clearly refer to the NormToUpper[] array, whose behavior has been defined since 2008.
I have refined the documentation to avoid any confusion:
https://github.com/synopse/mORMot2/commit/ac6b729f7
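For readers following along, here is a minimal sketch of the kind of table lookup the NormToUpper[] description implies. It is not the mORMot implementation: SketchToUpper, InitSketchTable and SketchUpper are hypothetical names, the table covers only a handful of accented letters, and the input is taken as raw Windows-1252 bytes, so the UTF-8 decoding step is skipped. It is written for Free Pascal.

```pascal
program NormUpperSketch;

{$mode objfpc}{$H+}

// Hypothetical illustration only, NOT mORMot's code.
// Idea: a 256-entry table indexed by a Windows-1252 byte, where accented
// letters map to their unaccented uppercase base letter.

var
  // hypothetical stand-in for a NormToUpper[]-like table
  SketchToUpper: array[byte] of byte;

procedure InitSketchTable;
var
  i: integer;
begin
  // every byte maps to itself by default
  for i := 0 to 255 do
    SketchToUpper[i] := byte(i);
  // plain ASCII 'a'..'z' -> 'A'..'Z'
  for i := ord('a') to ord('z') do
    SketchToUpper[i] := byte(i - 32);
  // a few Windows-1252 accented letters, folded to the base letter
  SketchToUpper[233] := ord('E'); // e acute (lower case)
  SketchToUpper[201] := ord('E'); // E acute (upper case)
  SketchToUpper[232] := ord('E'); // e grave (lower case)
  SketchToUpper[224] := ord('A'); // a grave (lower case)
end;

function SketchUpper(const s: AnsiString): AnsiString;
var
  i: integer;
begin
  SetLength(Result, length(s));
  for i := 1 to length(s) do
    Result[i] := AnsiChar(SketchToUpper[ord(s[i])]);
end;

var
  input: AnsiString;
begin
  InitSketchTable;
  input := 'xtudiant';
  // set the first byte to 233 (e acute in Windows-1252) explicitly,
  // so the result does not depend on the source file encoding
  input[1] := AnsiChar(233);
  writeln(SketchUpper(input)); // prints ETUDIANT
end.
```

With a table like this, the accent is dropped as part of the uppercase mapping, which is why the test above returns ETUDIANT rather than ÉTUDIANT.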