path: root/include/rtl/textcvt.h
Age  Commit message  Author
2021-02-02  tdf#130978 Added comment to all published API  msrijita18
Change-Id: I744788bde9778f85ccd9d7667e19d16842900a29 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110248 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2020-09-12  Replace remaining uses of sal_Char  Julien Nabet
+ remove sal_Char check on compilerplugins Change-Id: I0f7da14e620f0c3d031d038aa8345ba4080fb3e9 Change-Id: Ia6dba4f27b47bc9e0c89159182ad80a5aee17166 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/102499 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2019-09-04  [API CHANGE] rtl_convertTextToUnicode behavior upon erroneous input  Stephan Bergmann
<http://udk.openoffice.org/cpp/man/spec/textconversion.html> specifies that FLAGS_UNDEFINED_ERROR, FLAGS_MBUNDEFINED_ERROR, and FLAGS_INVALID_ERROR: "Read past the [erroneous] code in the input buffer [...]" But actual behavior of rtl_convertTextToUnicode for the various rtl_TextEncoding values has been inconsistent. Some erroneous input (mostly single-byte UNDEFINED and INVALID ones) has not been consumed at all, some (multi-byte MBUNDEFINED and INVALID) has been consumed partly, and some has been consumed fully as required.

However, at least since 8dd4265b9ddbd7786b6237676909eae5b540da0e "CWS-TOOLING: integrate CWS hb18", Custom8BitToUnicode in sw/source/filter/ww8/ww8par.cxx appears to rely on the broken behavior of not consuming erroneous input. (It reads the chunk of valid input with e.g. some RTL_TEXTENCODING_MS_125x that happens to exhibit the broken behavior of not consuming erroneous input, then wants to try to re-read the erroneous input with RTL_TEXTENCODING_MS_1252. For example, opening sw/qa/core/data/ww8/pass/forcepoint50-grfanchor-1.doc triggers that code. For whatever reason, the am_faksas.dot attached to <https://bz.apache.org/ooo/show_bug.cgi?id=9240#c1> "Do not show lithuanian letter 'Š'" appears to not, or at least no longer, trigger that code.)

Therefore, it would be useful to have a mode in which rtl_convertTextToUnicode does not consume erroneous input. (And I plan on doing changes in sal/osl/unx/file* that would benefit from that behavior, too.) But changing rtl_convertTextToUnicode to generally not consume erroneous input would not be feasible: If calls do not set RTL_TEXTTOUNICODE_FLAGS_FLUSH, part of an erroneous input can already have been consumed by a previous call, so the current call cannot undo that.

But a change that looks like it can work is to change the behavior only if RTL_TEXTTOUNICODE_FLAGS_FLUSH is set. In that case we can at least not consume the part of an erroneous input that has not yet been consumed by a previous call (which would necessarily have been done with RTL_TEXTTOUNICODE_FLAGS_FLUSH unset). The expectation is that code that relies on the don't-consume behavior will do only single calls with RTL_TEXTTOUNICODE_FLAGS_FLUSH set (so reliably not consume the complete erroneous input), while other code (which might do calls in a loop) will not care whether erroneous input has been consumed, anyway.

This can be considered a mild form of behavioral API CHANGE (but note that the old implementation didn't exhibit the requested behavior anyway). So all implementations of rtl_convertTextToUnicode for the various rtl_TextEncoding values have been adapted to the new behavior. The only exceptions are ImplDummyToUnicode (sal/textenc/textcvt.cxx), which is a special case anyway used by RTL_TEXTENCODING_DONTKNOW, and two out of three places (marked with a "TODO" each) in ImplUTF7ToUnicode (sal/textenc/tcvtutf7.cxx), where it is hard to retrofit the expected behavior, and RTL_TEXTENCODING_UTF7 is probably not relevant for the use cases relying on the don't-consume behavior, anyway.

Whether a similar change should be done for rtl_convertUnicodeToText can be examined later.

Change-Id: I1ac2c4cfd99e2a0eca219f9a3855ef110b254855 Reviewed-on: https://gerrit.libreoffice.org/78584 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
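The don't-consume behavior described in this commit message, and the re-read pattern it attributes to Custom8BitToUnicode, can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyResult`, `toyConvert`, `TOY_FLAG_FLUSH`, `strictAscii`, `latin1`, and `decodeWithFallback` are hypothetical names invented here, the two "encodings" are stand-ins, and none of this is the sal/rtl API or its implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical flag for this toy model; NOT RTL_TEXTTOUNICODE_FLAGS_FLUSH.
constexpr std::uint32_t TOY_FLAG_FLUSH = 0x1;

struct ToyResult {
    std::u16string text;   // characters decoded before stopping
    std::size_t consumed;  // number of source bytes consumed
    bool error;            // true if conversion stopped at a bad byte
};

// A toy single-byte "encoding": maps a byte to UTF-16, or 0xFFFF if undefined.
using ToyTable = char16_t (*)(unsigned char);

// Stand-in primary encoding: only 7-bit bytes are defined.
char16_t strictAscii(unsigned char b) {
    return b < 0x80 ? static_cast<char16_t>(b) : static_cast<char16_t>(0xFFFF);
}

// Stand-in fallback encoding: every byte is defined (Latin-1 style).
char16_t latin1(unsigned char b) { return static_cast<char16_t>(b); }

// Converts src starting at 'start' and stops at the first undefined byte.
// With TOY_FLAG_FLUSH set the bad byte is left unconsumed; otherwise it is
// read past, mirroring the distinction drawn in the commit message.
ToyResult toyConvert(ToyTable table, const std::string& src,
                     std::size_t start, std::uint32_t flags) {
    ToyResult r{u"", 0, false};
    for (std::size_t i = start; i < src.size(); ++i) {
        char16_t c = table(static_cast<unsigned char>(src[i]));
        if (c == 0xFFFF) {
            r.error = true;
            if ((flags & TOY_FLAG_FLUSH) == 0) {
                ++r.consumed;  // no flush: consume the erroneous byte
            }
            return r;
        }
        r.text.push_back(c);
        ++r.consumed;
    }
    return r;
}

// Single flushed calls with the primary table; each bad byte is re-read
// with the fallback table because it was reliably left unconsumed.
std::u16string decodeWithFallback(const std::string& src) {
    std::u16string out;
    std::size_t pos = 0;
    while (pos < src.size()) {
        ToyResult r = toyConvert(strictAscii, src, pos, TOY_FLAG_FLUSH);
        out += r.text;
        pos += r.consumed;
        if (r.error) {
            ToyResult f = toyConvert(latin1, src.substr(pos, 1), 0, TOY_FLAG_FLUSH);
            out += f.text;
            pos += 1;  // step over the byte handled by the fallback
        }
    }
    return out;
}
```

In this model a caller that wants to retry erroneous input can trust `consumed` to point exactly at the bad byte when the flush flag is set, whereas an unflushed call has already moved past it.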
2017-10-23  loplugin:includeform: UNO API include files  Stephan Bergmann
Change these back to consistently use the "..." form to include other UNO API include files, for the benefit of external users of this API. Change-Id: I9c9188e895eb3495e20a71ad44abfa2f6061fa94
2017-07-17  RTL_UNICODETOTEXT_INFO_{DEST|SCR}BUFFERTOSMALL should use TOO, not TO  Chris Sherlock
I have kept the old misspelled constant for backwards compatibility. Change-Id: I128a2eec76d00cc5ef058cd6a0c35a7474d2411e Reviewed-on: https://gerrit.libreoffice.org/39995 Reviewed-by: Chris Sherlock <chris.sherlock79@gmail.com> Tested-by: Chris Sherlock <chris.sherlock79@gmail.com>
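Keeping a misspelled constant alongside its corrected name usually means defining the old name in terms of the new one, so the two can never drift apart. The sketch below shows that pattern only; the `TOY_` names, the value, and `destBufferTooSmall` are hypothetical, and the real constants in include/rtl/textcvt.h are preprocessor macros whose definitions are not reproduced here.

```cpp
#include <cstdint>

// Hypothetical names and value, for illustration only.
constexpr std::uint32_t TOY_INFO_DESTBUFFERTOOSMALL = 0x0004;  // corrected spelling

// Legacy alias with the old spelling, kept so existing callers still compile.
constexpr std::uint32_t TOY_INFO_DESTBUFFERTOSMALL = TOY_INFO_DESTBUFFERTOOSMALL;

// New code tests the corrected name; old code using the alias behaves the same.
bool destBufferTooSmall(std::uint32_t info) {
    return (info & TOY_INFO_DESTBUFFERTOOSMALL) != 0;
}
```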
2014-04-23  RTL_UNICODETOTEXT_FLAGS_NOCOMPOSITE has never had any effect  Tor Lillqvist
Change-Id: I9004ec2229cd31fb899b23c8ce59f5fd49ac03a2
2013-11-09  fdo#65108 inter-module includes <> include/rtl  Norbert Thiebaud
Change-Id: Ic90a365a237aa23846f97131146a5aa2c46b5fd2
2013-10-23  fixincludeguards.sh: include - the rest  Thomas Arnhold
Change-Id: If1ee11da444a7f96f2d8668b277540da0bb4dbe9
2013-04-24  move URE headers to include/  David Tardon
Change-Id: Ib48a12e902f2311c295b2007f08f44dee28f431d Reviewed-on: https://gerrit.libreoffice.org/3499 Reviewed-by: David Tardon <dtardon@redhat.com> Tested-by: David Tardon <dtardon@redhat.com>