author Mike Kaganski <mike.kaganski@collabora.com> 2017-08-23 09:09:57 +0300
committer Mike Kaganski <mike.kaganski@collabora.com> 2017-08-23 13:20:32 +0200
commit 5b518ab051cc04e672ceb01da42b06625a1a4ce9
tree 69402bec04d6d9620e406cf677adbcd3a2be458a /include/rtl
parent d239bf6d79e93f650a4241fcd2da0cb77c9cb95b
tdf#111964: only trim XML whitespace

OUString::trim() uses rtl_uString_newTrim, which relies upon
rtl_ImplIsWhitespace. The latter treats as whitespaces not only
characters with values less than or equal to 32, but also Unicode
General Punctuation area Space and some Control characters. Thus,
using OUString::trim() is incorrect when the goal is to trim XML
whitespace, which is defined as one of 0x09, 0x0A, 0x0D, 0x20.

The comments for OUString::trim() and rtl_uString_newTrim are
corrected to describe which characters are considered whitespace.

A unit test included.

Change-Id: I45a132be923a52dcd5a4c35aeecb53d423b49fec
Reviewed-on: https://gerrit.libreoffice.org/41444
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
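
For illustration only, the distinction described above can be sketched in standalone C++. This is not the code added by this commit: the names isXmlWhitespace and trimXmlWhitespace are hypothetical, and std::u16string_view stands in for OUString so the example compiles without the LibreOffice headers. It strips only the four XML whitespace characters (0x09, 0x0A, 0x0D, 0x20) and leaves every other character alone, including spaces from the Unicode General Punctuation block such as U+2003 EM SPACE.

```cpp
#include <string_view>

// XML whitespace as listed in the commit message: 0x09, 0x0A, 0x0D, 0x20.
// Hypothetical helper name, chosen for this sketch only.
constexpr bool isXmlWhitespace(char16_t c)
{
    return c == 0x09 || c == 0x0A || c == 0x0D || c == 0x20;
}

// Returns a view of s with XML whitespace removed from both ends.
// Hypothetical helper; std::u16string_view is an assumption standing in
// for OUString so that the sketch is self-contained.
constexpr std::u16string_view trimXmlWhitespace(std::u16string_view s)
{
    while (!s.empty() && isXmlWhitespace(s.front()))
        s.remove_prefix(1);
    while (!s.empty() && isXmlWhitespace(s.back()))
        s.remove_suffix(1);
    return s;
}
```

Because the helper returns a view, no copy is made; a caller that needs an owning string would construct one from the result.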
Diffstat (limited to 'include/rtl')
-rw-r--r-- include/rtl/ustring.h   | 4
-rw-r--r-- include/rtl/ustring.hxx | 4
2 files changed, 6 insertions, 2 deletions
diff --git a/include/rtl/ustring.h b/include/rtl/ustring.h
index 831ecd66d9be..50dbd75a5ecc 100644
--- a/include/rtl/ustring.h
+++ b/include/rtl/ustring.h
@@ -2023,7 +2023,9 @@ SAL_DLLPUBLIC void SAL_CALL rtl_uString_newToAsciiUpperCase(
string.
The new string results from removing all characters with values less than
- or equal to 32 (the space character) form both ends of str.
+ or equal to 32 (the space character), and also Unicode General Punctuation
+ area Space and some Control characters, form both ends of str (see
+ rtl_ImplIsWhitespace).
This function cannot be used for language-specific conversion. The new
string does not necessarily have a reference count of 1 (in cases where
diff --git a/include/rtl/ustring.hxx b/include/rtl/ustring.hxx
index 602335e16768..c6ce9a73eb99 100644
--- a/include/rtl/ustring.hxx
+++ b/include/rtl/ustring.hxx
@@ -2947,7 +2947,9 @@ public:
of the string.
All characters that have codes less than or equal to
- 32 (the space character) are considered to be white space.
+ 32 (the space character), and Unicode General Punctuation area Space
+ and some Control characters are considered to be white space (see
+ rtl_ImplIsWhitespace).
If the string doesn't contain white spaces at both ends,
then the new string is assigned with str.