path: root/i18npool/qa/cppunit
Age | Commit message | Author
2024-11-04 | new loplugin:staticconstexpr | Noel Grandin
Change-Id: Ida1996dfffa106bf95fd064e8191b8033b4002f3 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/175336 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2024-07-17 | tdf#150621 Changed Korean word counting to use words | Jonathan Clark
Previously, Writer counted characters for all CJK languages, rather than words. This is the correct behavior for Chinese and Japanese, which make extensive use of ideographs. However, it is not correct for Korean. This change adjusts the Writer word count algorithm to count Korean words, rather than Korean characters. Change-Id: I6e77136867baca1a7b51248886ee5fd7073ad364 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/170621 Tested-by: Jenkins Reviewed-by: Jonathan Clark <jonathan@libreoffice.org>
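The counting rule described above can be sketched roughly as follows. This is a minimal illustration, not the Writer implementation: `isIdeographOrKana`, `isSpace` and `countWords` are hypothetical helpers, and the character ranges are deliberately simplified.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: a simplified test for Chinese/Japanese characters
// that are counted one by one (real code would consult Unicode script data).
bool isIdeographOrKana(char32_t c)
{
    return (c >= 0x4E00 && c <= 0x9FFF)   // CJK Unified Ideographs
        || (c >= 0x3040 && c <= 0x30FF);  // Hiragana and Katakana
}

// Hypothetical helper: only a few ASCII separators, for brevity.
bool isSpace(char32_t c) { return c == U' ' || c == U'\t' || c == U'\n'; }

// Counts each ideograph or kana as one unit, and every other run of
// non-space characters (including Hangul syllables) as one word.
std::size_t countWords(const std::u32string& text)
{
    std::size_t count = 0;
    bool inWord = false;
    for (char32_t c : text)
    {
        if (isSpace(c))
            inWord = false;
        else if (isIdeographOrKana(c))
        {
            ++count;          // counted per character
            inWord = false;
        }
        else if (!inWord)
        {
            ++count;          // first character of a space-delimited word
            inWord = true;
        }
    }
    return count;
}
```

With this sketch, a Korean phrase of two space-separated words counts as 2, while three Chinese ideographs count as 3.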
2024-07-16 | tdf#114160 Regression tests for non-breaking ZWJ | Jonathan Clark
Per Unicode rules, ZWJ shouldn't be treated as a line break opportunity. This change adds regression tests to verify lines do not break at ZWJ. tdf#114160 was a regression introduced during a prior ICU upgrade. This was fixed by commit 44699b3de37f07090ac6fee1cd97aa76036e9700, but no tests were added for ZWJ at that time. Change-Id: Ieca919daea11dc161ae35fb6ffe5bd366cf4a6f0 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/170534 Tested-by: Jenkins Reviewed-by: Jonathan Clark <jonathan@libreoffice.org>
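A regression check of this kind could look roughly like the sketch below. It assumes break positions are given as offsets into a UTF-32 string; `breakTouchesZwj` is a hypothetical helper and does not come from the LibreOffice test suite.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr char32_t ZWJ = U'\u200D'; // ZERO WIDTH JOINER

// Returns true if a break at `pos` (between text[pos - 1] and text[pos])
// would fall directly before or after a ZWJ. A test could assert that no
// reported line break position satisfies this predicate.
bool breakTouchesZwj(const std::u32string& text, std::size_t pos)
{
    assert(pos <= text.size());
    const bool zwjBefore = pos > 0 && text[pos - 1] == ZWJ;
    const bool zwjAfter = pos < text.size() && text[pos] == ZWJ;
    return zwjBefore || zwjAfter;
}
```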
2024-07-16 | tdf#46950 Allow intra-word right double quotation mark | Jonathan Clark
Hebrew text may use the character RIGHT DOUBLE QUOTATION MARK as a substitute for HEBREW PUNCTUATION GERSHAYIM. This change customizes the ICU word BreakIterator rules to that end. Change-Id: I03a48729de103505a2f68f9a1635c0f0cd7d126a Reviewed-on: https://gerrit.libreoffice.org/c/core/+/170536 Reviewed-by: Jonathan Clark <jonathan@libreoffice.org> Tested-by: Jenkins
2024-07-02 | BootstrapFixture: get rid of mxComponentContext | Xisco Fauli
Change-Id: I0318485c3c0159277e47096e0c7e0df8ed109ea4 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169865 Tested-by: Jenkins Reviewed-by: Xisco Fauli <xiscofauli@libreoffice.org>
2024-06-27 | tdf#161737 i18npool: fix bad word selection with NNBSP | László Németh
Fix the word-breaking rules for editing as well. Previously, a word was selected together with the narrow no-break space (NNBSP) that followed it, e.g. in French text before exclamation and question marks (where NNBSP gives correct typography if the OpenType/Graphite font doesn't have this feature). Add this and the previous fixes for Hungarian as well, which is handled by extra word-breaking rule files. Follow-up to commit 6e002da1615b52cda4e9331e87878458b1fe9677 "tdf#161737 i18npool: fix fake spelling alarms with NNBSP". Change-Id: I7230bd356e5f0360172b652e615a61d96131d336 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169624 Tested-by: Jenkins Reviewed-by: László Németh <nemeth@numbertext.org>
2024-06-27 | tdf#161737 i18npool: fix fake spelling alarms with NNBSP | László Németh
Fix word breaking by excluding a narrow no-break space (NNBSP) at the end of a word for spell checking. This was a problem e.g. for French, where an automatically or manually inserted NNBSP is used to get correct typography before exclamation and question marks, and also after and before guillemets (if the OpenType/Graphite font doesn't have this feature). Regression from commit 44699b3de37f07090ac6fee1cd97aa76036e9700 "tdf#49885 BreakIterator rule upgrades". Note: this also fixes the problem that digits separated by an NNBSP thousand separator weren't handled by spell checking, which raised false spelling alarms when "Check words with numbers" was enabled in Tools->Options->Languages and Locales->Writing Aids. (TODO: in the case of thousand separators, remove NBSP in the linguistic module or in the spell checking dictionaries, to allow checking numbers with thousand separators and with the correct suffix.) Change-Id: I36e10add7e0ba840f207a375ccc8668dbfef9572 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169618 Tested-by: Jenkins Reviewed-by: László Németh <nemeth@numbertext.org>
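The effect of excluding a trailing NNBSP from a word can be illustrated with a small sketch. The actual fix lives in the ICU break-iterator rule files; the post-processing function `trimTrailingNnbsp` below is purely hypothetical and only shows the intended result on a word range.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr char32_t NNBSP = U'\u202F'; // NARROW NO-BREAK SPACE

// Given a word range [begin, end) in `text`, returns the end offset with
// any trailing NNBSP characters excluded, so that the spell checker only
// sees the word itself.
std::size_t trimTrailingNnbsp(const std::u32string& text,
                              std::size_t begin, std::size_t end)
{
    assert(begin <= end && end <= text.size());
    while (end > begin && text[end - 1] == NNBSP)
        --end;
    return end;
}
```

For the French text "Bonjour" + NNBSP + "!", a word range covering the first eight characters shrinks to the seven letters of "Bonjour".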
2024-06-27 | tdf#138258 i18npool: allow ASCII double quote to match typographic quote | László Németh
Similar to the straight (typewriter or ASCII) apostrophe, straight double quotation mark (") matches its typographic variants now, like other word processors do. Note: regex search doesn't use this matching, similar to the apostrophe search. Follow-up to commit d40f2d02df26e216f367b5da3f9546b73f250469 "tdf#117643 Writer: fix apostrophe search regression". Change-Id: If6a3ee00750828583cd0cfc4aa7f7b656ea9bd1e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169605 Reviewed-by: László Németh <nemeth@numbertext.org> Tested-by: Jenkins
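A one-directional match of this kind might be sketched as below. This is a simplified illustration, not the i18npool search code: the set of typographic double quotes (U+201C, U+201D, U+201E) is an assumption, and `matchesChar` and `findLiteral` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

constexpr std::size_t notFound = std::u32string::npos;

// Assumption: these three code points stand in for "typographic variants".
bool isTypographicDoubleQuote(char32_t c)
{
    return c == 0x201C || c == 0x201D || c == 0x201E;
}

// A straight quote or apostrophe in the search term also matches its
// typographic counterpart in the text, but not the other way round.
bool matchesChar(char32_t patternChar, char32_t textChar)
{
    if (patternChar == textChar)
        return true;
    if (patternChar == U'"')
        return isTypographicDoubleQuote(textChar);
    if (patternChar == U'\'')
        return textChar == 0x2019;
    return false;
}

// Naive literal (non-regex) search using the relaxed character comparison.
std::size_t findLiteral(const std::u32string& text, const std::u32string& pattern)
{
    if (pattern.empty() || pattern.size() > text.size())
        return notFound;
    for (std::size_t start = 0; start + pattern.size() <= text.size(); ++start)
    {
        std::size_t i = 0;
        while (i < pattern.size() && matchesChar(pattern[i], text[start + i]))
            ++i;
        if (i == pattern.size())
            return start;
    }
    return notFound;
}
```

Keeping the relaxation inside the character comparison leaves the search loop unchanged, which is one plausible reason a regex search path would not pick it up automatically.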
2024-05-20 | loplugin:ostr in various | Noel Grandin
Change-Id: I9f399b3752da9df930e0647536ffcd4e82beb1ac Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167856 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2024-05-09 | tdf#49885 BreakIterator rule upgrades | Jonathan Clark
This change re-bases the BreakIterator rule customizations on top of a clean copy of the ICU 74.2 rules. Change-Id: Iadcf16cab138cc6c869fac61ad64e996e65b5ae4 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/166273 Tested-by: Jenkins Tested-by: Caolán McNamara <caolan.mcnamara@collabora.com> Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-05-09 | loplugin:ostr in i18npool | Noel Grandin
Change-Id: I0176d93b38788e28fa42baad293597f98eaa7a21 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167378 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2024-05-03 | tdf#49885 Updated CJK BreakIterator to use ICU | Jonathan Clark
Previously, the CJK BreakIterator used custom dictionaries for Chinese and Japanese. This change removes these custom dictionaries in favor of the upstream ICU implementation, which uses an externally-maintained frequency dictionary for these languages. This change also removes support code for dictionary-based break iterators, as it is no longer used. Change-Id: I55c4ce9c842d1751997309fd7446e0a6917915dc Reviewed-on: https://gerrit.libreoffice.org/c/core/+/166136 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com> Tested-by: Jenkins Tested-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-04-20 | tdf#49885 Removed custom Thai BreakIterator | Jonathan Clark
Previously, a custom BreakIterator was used for Thai grapheme clusters. This change deletes the custom BreakIterator, in favor of the ICU implementation. Change-Id: Icec94c73a5734c2059786dfbba085f487c488d7c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/166156 Tested-by: Jenkins Reviewed-by: Eike Rathke <erack@redhat.com>
2024-04-17 | tdf#49885 Reviewed BreakIterator customizations | Jonathan Clark
This change completes the review of BreakIterator rule customizations, and adds unit tests for relevant customizations. Change-Id: I06678fcccfc48d020aac64dd9f58ff36a763af30 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/166017 Tested-by: Jenkins Reviewed-by: Eike Rathke <erack@redhat.com>
2024-04-12 | tdf#147021 Replace SAL_N_ELEMENTS with std::size | Ashwani5009
As part of the efforts in #145538 to replace the SAL_N_ELEMENTS() macro with std::size() and std::ssize(), this commit performs the necessary changes for a few files in the i18npool module. Change-Id: Ic64be31b74cd74faf17497a47d6a15158b85184c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/166013 Tested-by: Jenkins Tested-by: Ilmari Lauhakangas <ilmari.lauhakangas@libreoffice.org> Reviewed-by: Ilmari Lauhakangas <ilmari.lauhakangas@libreoffice.org>
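The replacement itself is mechanical. A minimal sketch, with a hypothetical `kLocales` array standing in for the tables touched by the commit:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical data table, for illustration only.
constexpr const char* kLocales[] = { "en-US", "ko-KR", "hu-HU" };

// Before: SAL_N_ELEMENTS(kLocales), a macro from sal/macros.h.
// After: std::size from <iterator> (C++17), which only compiles for real
// arrays and containers and so rejects a pointer passed by mistake.
std::size_t countLocales() { return std::size(kLocales); }
```

Where a signed count is needed, `std::ssize` (C++20) plays the same role.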
2023-11-20 | Extended loplugin:ostr: i18npool | Stephan Bergmann
Change-Id: Ia0a844bc6e3f27758c869f5e229097085288e8bf Reviewed-on: https://gerrit.libreoffice.org/c/core/+/159698 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2023-10-20 | Extended loplugin:ostr: Automatic rewrite O[U]StringLiteral: i18npool | Stephan Bergmann
Change-Id: If3eb4d8fb3068e26ce42c8cc751c2de38b5d04cb Reviewed-on: https://gerrit.libreoffice.org/c/core/+/158202 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2023-10-15 | Repurpose loplugin:stringstatic for O[U]String vars that can be constexpr | Stephan Bergmann
...now that warning about O[U]String vars that could be O[U]StringLiteral is no longer useful Change-Id: I389e72038171f28482049b41f6224257dd11f452 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/157992 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2023-10-07 | loplugin:ostr: automatic rewrite | Stephan Bergmann
Change-Id: I2d09b2b83e1b50493ec88d0b2c323a83c0c86395 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/157647 Reviewed-by: Stephan Bergmann <sbergman@redhat.com> Tested-by: Jenkins
2023-07-24 | i18npool: Test case folding of surrogate pairs | Khaled Hosny
Change-Id: I3097651927b85aaa46fc4fc59badf22d24fcb928 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/154872 Tested-by: Jenkins Reviewed-by: خالد حسني <khaled@libreoffice.org>
2023-07-24 | tdf#97152: Fix upper case mapping of lunate sigma (U+03F2) | Khaled Hosny
It was mapped to uppercase sigma (U+03A3), while it should be mapped to uppercase lunate sigma (U+03F9). Fix by letting this slot fall back to ICU case folding. Change-Id: I14ffa0151c740779b67af14be8c7af8c51c3a1e0 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/154845 Tested-by: Jenkins Reviewed-by: خالد حسني <khaled@libreoffice.org>
2023-07-24 | tdf#96343, tdf#134766, tdf#97152: Fallback to ICU for case mapping | Khaled Hosny
If we are requested to case map a character not present in our case mapping data, fall back to ICU case mapping functions. We should switch completely to ICU at some point, but we need to evaluate our case mapping data and see if it differs from ICU and if there is a reason for it. Does not handle the case of U+03F2 turning into Sigma from tdf#97152. Change-Id: Icf13ac7aab6d07b2a90fc0ff5ef1c4f50c7a7f8c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/154803 Tested-by: Jenkins Reviewed-by: خالد حسني <khaled@libreoffice.org>
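The table-first, library-second lookup described here can be sketched as follows. The names are hypothetical and the table is a plain `std::map` rather than i18npool's generated data; in real code the fallback would presumably be an ICU function such as `u_toupper`, which is injected here as a callback so the sketch has no ICU dependency.

```cpp
#include <cassert>
#include <functional>
#include <map>

using CaseMapper = std::function<char32_t(char32_t)>;

// Looks the character up in the local mapping table first and defers to
// the supplied fallback only when the table has no entry for it.
char32_t toUpperWithFallback(const std::map<char32_t, char32_t>& table,
                             char32_t c, const CaseMapper& fallback)
{
    const auto it = table.find(c);
    return it != table.end() ? it->second : fallback(c);
}
```

Note that a wrong entry in the local table still wins over the fallback, which matches the remark that U+03F2 needed a separate fix.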
2023-07-24 | i18npool: Add a test for sigma case folding | Khaled Hosny
This is one of the special case folding characters. Change-Id: Icfe986b216eb62ed595402b31908c2fd22cd475e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/154821 Tested-by: Jenkins Reviewed-by: خالد حسني <khaled@libreoffice.org>
2023-07-24 | CppunitTest_i18npool_characterclassification: use CPPUNIT_TEST_FIXTURE() | Khaled Hosny
Change-Id: I6cc87255af385116b7e86ceaea67b26ca1f44709 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/154806 Tested-by: Jenkins Reviewed-by: خالد حسني <khaled@libreoffice.org>
2023-06-21 | Require icu-i18n >= 66 | Khaled Hosny
We were requiring ICU 4.6 which was released in 2011, and ifdef'ing our way through newer ICU versions. ICU is a core dependency and it makes no sense to build LibreOffice with such ancient versions of it. This change requires ICU 66 (released in 2020), and removes all the ifdefs for older versions. There are more cleanups to do, but these will be done separately. Change-Id: I2e4f7608a08f4d531b0a4c74bbfdf91a451f833f Reviewed-on: https://gerrit.libreoffice.org/c/core/+/153387 Tested-by: Jenkins Reviewed-by: خالد حسني <khaled@libreoffice.org>
2023-04-25 | Add some tests for (Japanese) i18n::IndexEntrySupplier behavior | Stephan Bergmann
...in preparation for some upcoming i18npool/util/i18npool.component clean-up Change-Id: I8e93aa33759f2bdd6b9422b3833a608cfbed1df0 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/150948 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2023-04-13 | Fix UBSan function-type-mismatch | Stephan Bergmann
...as seen with the additions to CppunitTest_i18npool_transliteration made here,

> i18npool/source/transliteration/textToPronounce_zh.cxx:175:17: runtime error: call to function get_zh_zhuyin through pointer to incorrect function type 'unsigned short **(*)()'
> workdir/CustomTarget/i18npool/indexentry/zh_zhuyin.cxx:1512: note: get_zh_zhuyin defined here
> #0 in i18npool::TextToPronounce_zh::TextToPronounce_zh(char const*) at i18npool/source/transliteration/textToPronounce_zh.cxx:175:17
> #1 in i18npool::TextToChuyin_zh_TW::TextToChuyin_zh_TW() at i18npool/source/transliteration/textToPronounce_zh.cxx:149:5
> #2 in TextToChuyin_zh_TW_CreateInstance(com::sun::star::uno::Reference<com::sun::star::lang::XMultiServiceFactory> const&) at i18npool/source/registerservices/registerservices.cxx:236:1
> #3 in cppu::(anonymous namespace)::OFactoryComponentHelper::createInstanceEveryTime(com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&) at cppuhelper/source/factory.cxx:173:24
> #4 in cppu::(anonymous namespace)::OFactoryComponentHelper::createInstanceWithContext(com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&) at cppuhelper/source/factory.cxx:230:12
> #5 in non-virtual thunk to cppu::(anonymous namespace)::OFactoryComponentHelper::createInstanceWithContext(com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&) at cppuhelper/source/factory.cxx
> #6 in cppuhelper::ServiceManager::Data::Implementation::doCreateInstance(com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&) at cppuhelper/source/servicemanager.cxx:709:26
> #7 in cppuhelper::ServiceManager::Data::Implementation::createInstance(com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&, bool) at cppuhelper/source/servicemanager.cxx:675:16
> #8 in cppuhelper::ServiceManager::createInstanceWithContext(rtl::OUString const&, com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&) at cppuhelper/source/servicemanager.cxx:1006:36
> #9 in non-virtual thunk to cppuhelper::ServiceManager::createInstanceWithContext(rtl::OUString const&, com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&) at cppuhelper/source/servicemanager.cxx
> #10 in i18npool::TransliterationImpl::loadBody(rtl::OUString const&, com::sun::star::uno::Reference<com::sun::star::i18n::XExtendedTransliteration>&) at i18npool/source/transliteration/transliterationImpl.cxx:619:45
> #11 in i18npool::TransliterationImpl::loadModuleByName(std::basic_string_view<char16_t, std::char_traits<char16_t>>, com::sun::star::uno::Reference<com::sun::star::i18n::XExtendedTransliteration>&, com::sun::star::lang::Locale const&) at i18npool/source/transliteration/transliterationImpl.cxx:630:5
> #12 in i18npool::TransliterationImpl::loadModuleByImplName(rtl::OUString const&, com::sun::star::lang::Locale const&) at i18npool/source/transliteration/transliterationImpl.cxx:278:9
> #13 in (anonymous namespace)::Transliteration::testTextToChuyin_zh_TW() at i18npool/qa/cppunit/transliteration.cxx:120:27

For one, there had always been a mismatch between the return type `const sal_uInt16**` generated in i18npool/source/indexentry/genindex_data.cxx since d319c4611e932b286c0bef14387382da0f2e92d2 "INTEGRATION: CWS i18n24 (1.1.2); FILE ADDED" vs. `sal_uInt16**` used in i18npool/source/transliteration/textToPronounce_zh.cxx since f4705bf0a3efeebfe74568abb355ad60621300dd "INTEGRATION: CWS i18n24 (1.8.36); FILE MERGED".

And for another (and more severe, as it caused random writes), there had also been a mismatch between the parameters `(sal_Int16 &max_index)` newly generated in i18npool/source/indexentry/genindex_data.cxx since 7696cd3902ca248951205f15930787488368ea26 "INTEGRATION: CWS i18n31 (1.4.60); FILE MERGED" (and correctly used in i18npool/source/indexentry/indexentrysupplier_asian.cxx since 58dcf0ffaf8668827fc2f47445c9d8faf3d29555 "INTEGRATION: CWS i18n31 (1.9.60); FILE MERGED") vs. the original `()` used in i18npool/source/transliteration/textToPronounce_zh.cxx ever since f4705bf0a3efeebfe74568abb355ad60621300dd "INTEGRATION: CWS i18n24 (1.8.36); FILE MERGED".

For DISABLE_DYNLOADING, the second (missing max_index parameter) issue appears to have been broken even further with 9db03b879b912d79060ab06f03a54d4a59e6ac65 "i18npool: fix wrong static function symbols", replacing the wrong

    sal_uInt16** get_zh_zhuyin();
    sal_uInt16** get_zh_pinyin();

declarations in i18npool/source/transliteration/textToPronounce_zh.cxx with the even worse declarations

    sal_uInt16** get_collator_data_zh_zhuyin();
    sal_uInt16** get_collator_data_zh_pinyin();

corresponding to function definitions generated by i18npool/source/collator/gencoll_rule.cxx (which also happen to have zero parameters, but non-matching return types, and apparently completely different collation vs. transliteration semantics).

Change-Id: Id91b17eeb7fcdd0c711d52a624375356dc47fc32 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/150302 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
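The general shape of a call that avoids this class of error is sketched below. The names `get_table`, `TableGetter` and `firstEntry` are hypothetical stand-ins for the generated accessors and their caller; the point is only that the pointer type spells out the same return type and parameter list as the definition.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a generated data accessor that reports the
// highest valid index through an out parameter.
static const std::uint16_t kRow0[] = { 1, 2 };
static const std::uint16_t* kTable[] = { kRow0 };

const std::uint16_t** get_table(std::int16_t& max_index)
{
    max_index = 0;
    return kTable;
}

// The pointer type repeats the definition's exact signature. Calling the
// function through a type such as `std::uint16_t** (*)()` instead would
// be undefined behaviour: the callee would write max_index through
// whatever happens to be in the argument register or stack slot.
using TableGetter = const std::uint16_t** (*)(std::int16_t&);

std::uint16_t firstEntry(TableGetter getter)
{
    std::int16_t maxIndex = -1;
    const std::uint16_t** data = getter(maxIndex);
    assert(maxIndex >= 0);
    return data[0][0];
}
```

When the symbol comes from a dynamic lookup and has to be cast, the compiler cannot check the match, so sharing a single type alias between the generator and the caller is one way to keep the two in sync.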
2023-04-04 | Fix typos in XML that broke three implementation elements | Stephan Bergmann
The typos all originated with d2140a6320cd1cf4dea29b174cdb3bcb5261056b "i18npool: create instances with uno constructors", causing three intended constructor attributes to rather be plain character data. Which apparently went unnoticed until recently (see the TODOs resolved here that had been introduced with 456a146b9eb643655ae2bd336740e8c5536913aa "tdf#151971: Fix used implementation names of transliteration services"), in part because the Parser class in cppuhelper/source/servicemanager.cxx silently ignores any unexpected character data via xmlreader::XmlReader::Text::NONE. Change-Id: Ia8fdbc09c67d10530b4d86dbbbde2b6b84038e66 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/150021 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2023-04-04 | tdf#151971: Fix used implementation names of transliteration services | Stephan Bergmann
...after 04af4e4f55f3ef319a78edd4d0109e2e7eba90b6 "[API CHANGE] Fix all bad UNOIDL identifiers across offapi" had changed the spelling (character case) of some of the css.i18n.TransliterationModules[New] enum values involved here, so that the TmItem1 macro generated broken TMList::implName values now. (Which in turn caused TransliterationImpl::loadBody to throw "unsatisfied query for interface of type com.sun.star.i18n.XExtendedTransliteration!" css::uno::RuntimeExceptions, which remained uncaught.) Also add a test verifying that loading all those transliteration services no longer fails throwing exceptions. Which led to two open TODOs: For one, the value of maxCascade in i18npool/inc/transliterationImpl.hxx might come from a time when there were fewer TransliterationModules[New] enum values and might no longer be appropriate. This would need some further investigation. But for another, there are two transliteration services that cannot currently be instantiated. That looks like a regression that should be fixed in a follow-up commit. Change-Id: Icfca3e841360d4b471013e2c96d6868a75a21a1c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/150018 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2023-01-31 | tdf#147021 Use std::size() instead of SAL_N_ELEMENTS() macro | ektagoel12
Also change some loops to range-based for. Change-Id: I2e17feaba7a6b219aa0c9126c5046cf3bdf855d8 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/145988 Tested-by: Jenkins Reviewed-by: Hossein <hossein@libreoffice.org>
2022-08-19 | tdf#91764: Combining marks from “complex” scripts can’t be searched for | Khaled Hosny
Don’t skip search results that are in the middle of a grapheme cluster (AKA cell in LO speak). It is not clear why it was done like this, as these checks are present all the way back to the first commit of this file:

    commit 36eb193f4809221af42c01c5ac226a97cf74ec21
    Author: Rüdiger Timm <rt@openoffice.org>
    Date: Tue Apr 8 15:01:00 2003 +0000

    INTEGRATION: CWS calc06 (1.1.2); FILE ADDED
    2003/03/26 15:54:42 er 1.1.2.1: #i3393# moved from i18n module, cleaned out tools module usage, and added support for regexp

But ignoring such results and only for so-called “complex” scripts seems arbitrary, and as the linked issue shows, people want to be able to search for combining marks. Furthermore, it prevents searching for a base character followed by a combining mark, unless ignoring diacritics is enabled.

Change-Id: I530788d928861ddfa18dd7b813d0a13f53c0b77b Reviewed-on: https://gerrit.libreoffice.org/c/core/+/138410 Tested-by: Jenkins Reviewed-by: خالد حسني <khaled@aliftype.com>
2022-04-15 | apply ICU test workaround to < 70 to "fix" test with ICU 71 | Rene Engelhard
See also 263961306ede0656ebb7904034a2172615ce81d0 Change-Id: Ib64ec43dba59ffddb34fe7f1a0f0d2e589c3455c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/133063 Tested-by: René Engelhard <rene@debian.org> Reviewed-by: Eike Rathke <erack@redhat.com>
2022-03-10 | Use icu::UnicodeString directly | Mike Kaganski
Change-Id: I41b4e64d6d3a9310d819904c8d32c689e6300bcd Reviewed-on: https://gerrit.libreoffice.org/c/core/+/131296 Tested-by: Jenkins Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2022-02-25 | tdf#145759 30.6001 -> monthDaysWithoutJanFeb | Hossein
30.6001 stands for the month days without Jan and Feb. According to the link below, it is calculated as (365-31-28)/10 = 30.6, but because of a floating point bug, 30.6001 was used as a workaround. "30.6001, 25 year old hack?" https://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv011.cgi?read=31650 The value 30.6 is now used as i18nutil::monthDaysWithoutJanFeb instead of 30.6001. The new value is ~30.60000038, which is > 30.6, so the calculations should be correct. To make sure, a unit test is added, and part of the values are checked against the values calculated by this website: Julian Day and Civil Date Calculator https://core2.gsfc.nasa.gov/time/julian.html Change-Id: I8cc7e046514dc3de652a1c37399e351cb2b614dc Reviewed-on: https://gerrit.libreoffice.org/c/core/+/125813 Tested-by: Jenkins Reviewed-by: Eike Rathke <erack@redhat.com>
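How the constant enters a Julian Day computation can be sketched with the common textbook formula. This is not the i18npool calendar code: `julianDay` is a hypothetical function, and declaring the constant as a `float` is an assumption based on the ~30.60000038 value quoted in the message.

```cpp
#include <cassert>
#include <cmath>

// Assumption: a float constant; 30.6 rounded to float is about 30.60000038,
// i.e. slightly above 30.6, which keeps the floor() below from undershooting
// when 30.6 * (month + 1) is a whole number (e.g. 30.6 * 5 = 153).
constexpr float monthDaysWithoutJanFeb = (365 - 31 - 28) / 10.0f;

// Julian Day at 0h UT for a Gregorian calendar date with a positive year.
double julianDay(int year, int month, int day)
{
    assert(month >= 1 && month <= 12);
    if (month <= 2)
    {
        // Treat January and February as months 13 and 14 of the prior year.
        year -= 1;
        month += 12;
    }
    const int century = year / 100;
    const int gregorianShift = 2 - century + century / 4;
    return std::floor(365.25 * (year + 4716))
           + std::floor(static_cast<double>(monthDaysWithoutJanFeb) * (month + 1))
           + day + gregorianShift - 1524.5;
}
```

For example, 2000-01-01 yields 2451544.5, which is half a day before the well-known J2000.0 epoch (2451545.0 at noon).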
2021-11-16 | Update to ICU 70.1 | Eike Rathke
Unicode 14, 5 new scripts, 12 new Unicode blocks. In i18npool/qa/cppunit/test_breakiterator.cxx TestBreakIterator::testLao() had to be disabled/adapted. Needs to be investigated, see comments there. As is, Lao script word break has regressions. Correct UBLOCK_TANGUT_SUPPLEMENT Unicode range endpoint to 0x18D7F, see https://www.unicode.org/versions/Unicode14.0.0/erratafixed.html for which ublock_getCode(0x18D8F) now returned UBLOCK_NO_BLOCK and thus luckily the assert in svx/source/dialog/charmap.cxx hit. Change-Id: I4bad16ecfab3f44be365b8f884c57f34af68218e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/125322 Reviewed-by: Eike Rathke <erack@redhat.com> Tested-by: Jenkins
2021-10-25 | Missing `static` | Stephan Bergmann
...from aa2064c5c5f23f6f4b7bc44e12345b37f66995bc "Improve loplugin:stringliteralvar", similar to 8b32a3edad52f8ac5e5f0f49b4f4e80954c2fd25 "Fix stack-use-after-scope" (though this case doesn't appear to have caused any actual issues). (After manual inspection, there appear to be no further missing `static` at least in aa2064c5c5f23f6f4b7bc44e12345b37f66995bc "Improve loplugin:stringliteralvar".) Change-Id: I2b3d0d8d2af1d65f0c5bef8a858107020a620974 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/124137 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2021-10-14 | Avoid COW overhead using css::uno::Sequence | Mike Kaganski
The scenarios are:

1. Calling sequence's begin() and end() in pairs to pass to algorithms (both calls use getArray(), which does the COW checks)
2. In addition to #1, calling end() again when checking result of find algorithms, and/or begin() to calculate result's distance
3. Using non-const sequences in range-based for loops, which internally do #1
4. Assigning sequence to another sequence variable, and then modifying one of them

In many cases, the sequences could be made const, or treated as const for the purposes of the algorithms (using std::as_const, std::cbegin, and std::cend). Where an algorithm modifies the sequence, it was changed to only call getArray() once. For that, css::uno::toNonConstRange was introduced, which returns a struct (subclass of std::pair) with two iterators [begin, end], that are calculated using one call to begin() and one call to getLength().

To handle #4, css::uno::Sequence::swap was introduced, that swaps the internal pointer to uno_Sequence. So when a local Sequence variable should be assigned to another variable, and the latter will be modified further, it's now possible to use swap instead, so the two sequences are kept independent.

The modified places were found by temporarily removing non-const end().

Change-Id: I8fe2787f200eecb70744e8b77fbdf7a49653f628 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/123542 Tested-by: Jenkins Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2021-08-19 | tdf#143526 add Korean Numbering test case & fix Hanja number codepoint | DaeHyun Sung
Add Korean numbering test cases:

1. koreanCounting
2. koreanLegal
3. koreanDigital
4. koreanDigital2

Fix the Korean Hanja number code point for zero (0). Following MS Office's numFmt string example https://docs.microsoft.com/en-us/openspecs/office_standards/ms-docx/a1bb5809-e361-4e49-8e16-7f1a67da4121 the Korean Hanja notation for zero is `零 U+96F6` in MS Word 2019 and in that document. So, fix the Korean Hanja number code point for zero (0) to `零 U+96F6`.

Change-Id: I1a5b95640a93e7fbc3a0e724b154587877b198a0 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/120676 Tested-by: Jenkins Reviewed-by: Eike Rathke <erack@redhat.com>
2021-07-29 | Make duplicate generated numbering identifiers unique, tdf#143526 follow-up | Eike Rathke
Change-Id: I28366c4e868e97b70e016b056b73b88b4cc8b812 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/119677 Reviewed-by: Eike Rathke <erack@redhat.com> Tested-by: Jenkins
2021-07-29 | Add NumberingIdentifier unit test, tdf#143526 related | Eike Rathke
Change-Id: I9d4df6f63dc9ebc90e99fecce14b3551c74f7f1a Reviewed-on: https://gerrit.libreoffice.org/c/core/+/119675 Reviewed-by: Eike Rathke <erack@redhat.com> Tested-by: Jenkins
2021-06-14 | We only support ICU version 4.6 or newer, so drop these checks | Mike Kaganski
The minimal ICU version check is in configure.ac. Change-Id: Ib6480cd3290dabb45d87c6dcbcc9b5513d172e21 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/117119 Tested-by: Jenkins Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2021-05-14 | Improve loplugin:stringview | Stephan Bergmann
Issue the "instead of O[U]String, pass [u16]string_view" diagnostic also for operator call arguments. (The "rather than copy, pass subView()" diagnostic is already part of handleSubExprThatCouldBeView, so no need to repeat it explicitly for operator call arguments.) (And many call sites don't even require an explicit [u16]string_view, esp. with the recent ad48b2b02f83eed41fb1eb8d16de7e804156fcf1 "Optimized OString operator += overloads". Just some test code in sal/qa/ that explicitly tests the O[U]String functionality had to be excluded.) Change-Id: I8d55ba5a7fa16a563f5ffe43d245125c88c793bc Reviewed-on: https://gerrit.libreoffice.org/c/core/+/115589 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2021-02-08 | Improve loplugin:cppunitassertequal for CPPUNIT_ASSERT(a && b) | Stephan Bergmann
...by re-enabling the code temporarily #if'ed-out in a528392e71bc70136021be4e3d83732fccbb885e "Fixed/improved loplugin:cppunitassertequals" (and which then triggers lots of other loplugin:cppunitassertequal CPPUNIT_ASSERT -> CPPUNIT_ASSERT_EQUAL warnings). For two css::uno::Reference equality comparisons in cppu/qa/test_any.cxx, it was more straightforward to rewrite them with an explicit call to operator == (which silences loplugin:cppunitassertequal) than to adapt them to CPPUNIT_ASSERT_EQUAL's requirement for arguments of identical types. In sc/qa/unit/ucalc_pivottable.cxx, ScDPItemData needs toString, which has been implemented trivially for now, but might want to combine that with the DEBUG_PIVOT_TABLE-only ScDPItemData::Dump. Change-Id: Iae6d09cf69bd4e52fe4411bba9e50c48e696291c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110546 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2021-01-27 | Improve loplugin:stringliteralvar | Stephan Bergmann
...to also consider O[U]String ctors taking pointer and length Change-Id: Iea5041634bfbf5054a1317701e30b56f72e940fb Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110025 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2020-11-28 | drop custom Indic grapheme rules and rely on contemporary icu defaults | Caolán McNamara
similar to... commit 8578a1c9d167c19f1d8038fac5946b4b3cae305e Date: Thu Nov 26 15:47:26 2020 +0200 tdf#138481: Trust the built-in break iterator character data in ICU Don't use our own char.txt. the char_in.txt hasn't really changed since 2008 and is woefully out of date at this point. we have cppunit tests for the only documented bug that touched char_in.txt, #i111152# and tdf#40292, for tdf#40292 change the test to test what was actually reported as a bug Change-Id: I8e35b102b0a46d2c63e47e055e472892f65022ac Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106763 Tested-by: Jenkins Reviewed-by: Caolán McNamara <caolanm@redhat.com>
2020-11-13 | tdf#117643 Writer: fix apostrophe search regression | László Németh
During text search, an ASCII apostrophe ' (U+0027) in the search term now also matches the typographic apostrophe ’ (U+2019) in the text. There was a UX regression in document editing from commit e6fade1ce133039d28369751b77ac8faff6e40cb (tdf#38395 enable smart apostrophe replacement by default): the Find and Replace window and the Find toolbar don't replace the ASCII apostrophe, so the search term no longer matched the text (which now contains the automatically replaced typographic apostrophes) as it did before that commit. Regex search hasn't been modified, i.e. finding U+2019 still requires a search term that contains U+2019. The typographic apostrophes of a search term match ASCII apostrophes of the text only if the search term also contains an ASCII apostrophe. Note: as a more sophisticated solution, it's possible to add a new default transliteration option for this later. Change-Id: I5121edbef5cf34fdd5b5f9ba3c046a06329a756a Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105717 Tested-by: Jenkins Reviewed-by: László Németh <nemeth@numbertext.org>
2020-11-10 | Reinstate o3tl/cppunittraitshelper.hxx use for C++20 | Stephan Bergmann
...introduced with 5d8f0fad50f90195a11873c70ddab4644f5839ea "Adapt CPPUNIT_ASSERT to C++20 deleted ostream << for sal_Unicode (aka char16_t)" (see there for details) but erroneously removed with 877f40ac3f2add2b6dc37bae280d4d98dd102286 "tdf#42949 Fix new IWYU warnings in directories [h-r]*" Change-Id: Id22a4c0fdfe1471e2455ec3316f2c6c93cc00b22 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105549 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2020-11-10 | new loplugin:reducevarscope | Noel Grandin
Change-Id: Iefe922c2e0d605114d54673d63eccc5e4abd545d Reviewed-on: https://gerrit.libreoffice.org/c/core/+/102143 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-11-10 | tdf#42949 Fix new IWYU warnings in directories [h-r]* | Gabor Kelemen
Found with bin/find-unneeded-includes Only removal proposals are dealt with here. Change-Id: I886b6f446293d3b1cfbf4ae05e8dbd7fabab9f20 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105510 Tested-by: Jenkins Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-06-04 | Upcoming loplugin:elidestringvar: i18npool | Stephan Bergmann
Change-Id: I5644ca7f2ef1b251ce1c262d3001ca48f2ed9edd Reviewed-on: https://gerrit.libreoffice.org/c/core/+/95482 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>