Age | Commit message (Collapse) | Author |
|
Unicode 14, 5 new scripts, 12 new Unicode blocks.
In i18npool/qa/cppunit/test_breakiterator.cxx
TestBreakIterator::testLao() had to be disabled/adapted.
Needs to be investigated, see comments there.
As is, Lao script word break has regressions.
Correct UBLOCK_TANGUT_SUPPLEMENT Unicode range endpoint to
0x18D7F, see
https://www.unicode.org/versions/Unicode14.0.0/erratafixed.html
for which ublock_getCode(0x18D8F) now returned UBLOCK_NO_BLOCK and
thus luckily the assert in svx/source/dialog/charmap.cxx hit.
Change-Id: I4bad16ecfab3f44be365b8f884c57f34af68218e
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/125322
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
...from aa2064c5c5f23f6f4b7bc44e12345b37f66995bc "Improve
loplugin:stringliteralvar", similar to 8b32a3edad52f8ac5e5f0f49b4f4e80954c2fd25
"Fix stack-use-after-scope" (though this case doesn't appear to have caused any
actual issues).
(After manual inspection, there appear to be no further missing `static` at
least in aa2064c5c5f23f6f4b7bc44e12345b37f66995bc "Improve
loplugin:stringliteralvar".)
Change-Id: I2b3d0d8d2af1d65f0c5bef8a858107020a620974
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/124137
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
The scenarios are:
1. Calling sequence's begin() and end() in pairs to pass to algorithms
(both calls use getArray(), which does the COW checks)
2. In addition to #1, calling end() again when checking result of find
algorithms, and/or begin() to calculate result's distance
3. Using non-const sequences in range-based for loops, which internally
do #1
4. Assigning sequence to another sequence variable, and then modifying
one of them
In many cases, the sequences could be made const, or treated as const
for the purposes of the algorithms (using std::as_const, std::cbegin,
and std::cend). Where algorithm modifies the sequence, it was changed
to only call getArray() once. For that, css::uno::toNonConstRange was
introduced, which returns a struct (sublclass of std::pair) with two
iterators [begin, end], that are calculated using one call to begin()
and one call to getLength().
To handle #4, css::uno::Sequence::swap was introduced, that swaps the
internal pointer to uno_Sequence. So when a local Sequence variable
should be assigned to another variable, and the latter will be modified
further, it's now possible to use swap instead, so the two sequences
are kept independent.
The modified places were found by temporarily removing non-const end().
Change-Id: I8fe2787f200eecb70744e8b77fbdf7a49653f628
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/123542
Tested-by: Jenkins
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
add Korean Numbering test cases
1. koreanCounting
2. koreanLegal
3. koreanDigital
4. koreanDigital2
fix Korean Hanja number codepoint for Zero(0)
Following MS Office's numFmt Strng example
https://docs.microsoft.com/en-us/openspecs/office_standards/ms-docx/a1bb5809-e361-4e49-8e16-7f1a67da4121
Korean Hanja notation for Hanja is `零 U+96F6` on MS Word 2019 and that
document.
So, fix the Korean Hanja number code pointfor Zero(0) `零 U+96F6`
Change-Id: I1a5b95640a93e7fbc3a0e724b154587877b198a0
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/120676
Tested-by: Jenkins
Reviewed-by: Eike Rathke <erack@redhat.com>
|
|
Change-Id: I28366c4e868e97b70e016b056b73b88b4cc8b812
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/119677
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
Change-Id: I9d4df6f63dc9ebc90e99fecce14b3551c74f7f1a
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/119675
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
The minimal ICU version check is in configure.ac.
Change-Id: Ib6480cd3290dabb45d87c6dcbcc9b5513d172e21
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/117119
Tested-by: Jenkins
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
Issue the "instead of O[U]String, pass [u16]string_view" diagnostic also for
operator call arguments. (The "rather than copy, pass subView()" diagnostic is
already part of handleSubExprThatCouldBeView, so no need to repeat it explicitly
for operator call arguments.)
(And many call sites don't even require an explicit [u16]string_view, esp. with
the recent ad48b2b02f83eed41fb1eb8d16de7e804156fcf1 "Optimized OString operator
+= overloads". Just some test code in sal/qa/ that explicitly tests the
O[U]String functionality had to be excluded.)
Change-Id: I8d55ba5a7fa16a563f5ffe43d245125c88c793bc
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/115589
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
...by re-enabling the code temporarily #if'ed-out in
a528392e71bc70136021be4e3d83732fccbb885e "Fixed/improved
loplugin:cppunitassertequals" (and which then triggers lots of other
lopglugin:cppunitassertequal CPPUNIT_ASSERT -> CPPUNIT_ASSERT_EQUAL warnings).
For two css::uno::Reference equality comparisons in cppu/qa/test_any.cxx, it
was more straightforward to rewrite them with an explicit call to operator ==
(which silences loplugin:cppunitassertequal) than to adapt them to
CPPUNIT_ASSERT_EQUAL's requirement for arguments of identical types.
In sc/qa/unit/ucalc_pivottable.cxx, ScDPItemData needs toString, which has been
implemented trivially for now, but might want to combine that with the
DEBUG_PIVOT_TABLE-only ScDPItemData::Dump.
Change-Id: Iae6d09cf69bd4e52fe4411bba9e50c48e696291c
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110546
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
...to also consider O[U]String ctors taking pointer and length
Change-Id: Iea5041634bfbf5054a1317701e30b56f72e940fb
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110025
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
similar to...
commit 8578a1c9d167c19f1d8038fac5946b4b3cae305e
Date: Thu Nov 26 15:47:26 2020 +0200
tdf#138481: Trust the built-in break iterator character data in ICU
Don't use our own char.txt.
the char_in.txt hasn't really changed since 2008 and is woefully out of
date at this point.
we have cppunit tests for the only documented bug that touched
char_in.txt, #i111152# and tdf#40292, for tdf#40292 change the test
to test what was actually reported as a bug
Change-Id: I8e35b102b0a46d2c63e47e055e472892f65022ac
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106763
Tested-by: Jenkins
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
|
|
During text search, ASCII apostrophe ' (U+0027)
of the search term matches the typographic
apostrophe ’ (U+2019) of the text, too.
There was a UX regression in document editing from
commit e6fade1ce133039d28369751b77ac8faff6e40cb
(tdf#38395 enable smart apostrophe replacement by default),
because Find and Replace window and Find toolbar
doesn't replace ASCII apostrophe, so the search term
hadn't matched the text (now with the automatically
replaced typographic apostrophes), as before the commit.
Regex search hasn't been modified, i.e. searching U+2019
is still necessary a search term with U+2019.
The typographic apostrophes of a search term only match
ASCII apostrophes of the text, if the search term contain
also an ASCII apostrophe, too.
Note: as a more sophisticated solution, it's possible to
add a new default transliteration option for this later.
Change-Id: I5121edbef5cf34fdd5b5f9ba3c046a06329a756a
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105717
Tested-by: Jenkins
Reviewed-by: László Németh <nemeth@numbertext.org>
|
|
...introduced with 5d8f0fad50f90195a11873c70ddab4644f5839ea "Adapt
CPPUNIT_ASSERT to C++20 deleted ostream << for sal_Unicode (aka char16_t)" (see
there for details) but erroneously removed with
877f40ac3f2add2b6dc37bae280d4d98dd102286 "tdf#42949 Fix new IWYU warnings in
directories [h-r]*"
Change-Id: Id22a4c0fdfe1471e2455ec3316f2c6c93cc00b22
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105549
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Change-Id: Iefe922c2e0d605114d54673d63eccc5e4abd545d
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/102143
Tested-by: Jenkins
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Found with bin/find-unneeded-includes
Only removal proposals are dealt with here.
Change-Id: I886b6f446293d3b1cfbf4ae05e8dbd7fabab9f20
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105510
Tested-by: Jenkins
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
|
|
Change-Id: I5644ca7f2ef1b251ce1c262d3001ca48f2ed9edd
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/95482
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
This is the last padded numbering type that is supported by Word but was
not supported by Writer.
Change-Id: Ica1a0843897c61a4b569105fd21e5bfe7b5012cb
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/90912
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
|
|
This is the actual numbering the customer needed, pad-to-2 and pad-to-3
was added just for compleness (since Word has it and it's related).
Change-Id: I7fdf67488955ab3ee0db169f11fffd21d9cc1e3b
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/90791
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
|
|
This is similar to the existing padded numbering, but that one padded to
2. Another difference is pad-to-2 has more file format support:
pad-to-3 is not supported in DOC and RTF.
Change-Id: Ie2ac2691c58a89e181d24d7002cf873ebab380c4
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/90656
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
|
|
lcl_formatArabicZero() looks a bit over-complicated with its hardcoded
limit of 2. Word supports limits of 3, 4 and 5 as well, so prepare for
handling them in a generic way.
Change-Id: If6e5634b11616f0ac05e1387016e22f4b171bbfb
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/89864
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
|
|
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html> "char8_t
backward compatibility remediation", as implemented now by <https://gcc.gnu.org/
git/?p=gcc.git;a=commit;h=0c5b35933e5b150df0ab487efb2f11ef5685f713> "libstdc++:
P1423R3 char8_t remediation (2/4)" for -std=c++2a, deletes operator << overloads
that would print an integer rather than a (presumably expected) character.
But for simplicity (and to avoid issues with non-printing characters), keep
printing an integer here.
Change-Id: I751b99ee32d418eb488131ffa130d6f7d6d38dc7
Reviewed-on: https://gerrit.libreoffice.org/84348
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Change-Id: I385587a922c555c320a45dcc6d644315b72510e9
Reviewed-on: https://gerrit.libreoffice.org/81278
Tested-by: Jenkins
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
It started out as a wrapper around character literals, but has by now become a
wrapper around arbitrary single characters. Besides updating the documentation,
this change is a mechanical
for i in $(git grep -Fl OUStringLiteral1); do sed -i -e s/OUStringLiteral1/OUStringChar/g "$i"; done
Change-Id: I1b9eaa4b3fbc9025ce4a4bffea3db1c16188b76f
Reviewed-on: https://gerrit.libreoffice.org/80892
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Specifically no "using ..." in a header file. For the 5 places
that actually need it..
Change-Id: I5a9d4efa3b19df51a05e7de0b4a825876290579c
Reviewed-on: https://gerrit.libreoffice.org/74814
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
Found with bin/find-unneeded-includes
Only removal proposals are dealt with here.
Change-Id: Ic331c845f0a8f06c4a8f8f79b6f87e26ca7c3a7d
Reviewed-on: https://gerrit.libreoffice.org/72972
Tested-by: Jenkins
Reviewed-by: Michael Stahl <Michael.Stahl@cib.de>
|
|
Change-Id: Ifce0dc836ea8500105ebcf3302f37ad6968929ec
Reviewed-on: https://gerrit.libreoffice.org/60607
Tested-by: Jenkins
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
idea originally from either tml or moggi, can't remember which
Change-Id: Id78d75035036d3aa1666e33469c6eeb38f9e624d
Reviewed-on: https://gerrit.libreoffice.org/55126
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
BreakIterator_CTL in the non CharacterIteratorMode::SKIPCELL mode did
not handle UTF-16 surrogate pairs at all, causing backspace to delete
lone surrogates which is really bad. Just copied the corresponding code
from BreakIterator_Unicode.
Additionally, BreakIterator_th was not correctly skipping non-Thai text
and always treating one character as Thai.
Change-Id: Ia379327e042ff602fc19a485c4cbd1a3683f9230
Reviewed-on: https://gerrit.libreoffice.org/54631
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Eike Rathke <erack@redhat.com>
|
|
Change-Id: Iafdc3593b7136f24e741dc63e3c46344636154eb
|
|
auto-rewrite with <https://gerrit.libreoffice.org/#/c/47798/> "Enable
loplugin:cstylecast for some more cases" plus
solenv/clang-format/reformat-formatted-files
Change-Id: I5ca5f27425c150f58e5ec3f2392dda43a857fc33
|
|
Korean words are composed of Hangul and are separated
by space or newline. This patch improves line breaking
function in CJK break iterator so that it does not
break Korean words in the middle. It now breaks at the
first character of the last Korean word.
Change-Id: I91b20733c0c5ec4755bf68eb0d7c14c42c1f3556
Reviewed-on: https://gerrit.libreoffice.org/42987
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Eike Rathke <erack@redhat.com>
|
|
Change-Id: I23368c3ce6d29c7b2e758e209e5a8315e82a2818
Reviewed-on: https://gerrit.libreoffice.org/40051
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: Ia9b20a8ca95684cbeb21e3425972c43ba50df3cd
Reviewed-on: https://gerrit.libreoffice.org/39187
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: Ia4e02589d2fe79a27b83200a0e7a528a2c806519
Reviewed-on: https://gerrit.libreoffice.org/38508
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: I7307cc96eac5868ed26e8ace1fc3c1a93e1bfec4
|
|
this modifies codemaker so that, for an UNO enum, we generate code
that effectively looks like:
#ifdef LIBO_INTERNAL_ONLY && HAVE_CX11_CONSTEXPR
enum class XXX {
ONE = 1
};
constexpr auto ONE = XXX_ONE;
#else
...the old normal way..
#endif
which means that for LO internal code, the enums are scoped.
The "constexpr auto" trick acts like an alias so we don't have to
use scoped naming everywhere.
Change-Id: I3054ecb230e8666ce98b4a9cb87b384df5f64fb4
Reviewed-on: https://gerrit.libreoffice.org/34546
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
We are classifying characters in the “Combining Diacritical Marks”
Unicode block with ScriptType::LATIN, but these are combining marks and
can combine with any script and should have been ScriptType::WEAK. Just
removing them from the range in scriptList does the trick as we will
fallback to getting the script classification based on the Unicode
script property.
Change-Id: I3577f4b03360a1c8e094a207f01b6bbb6abbaf30
Reviewed-on: https://gerrit.libreoffice.org/35811
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
|
|
Change-Id: I056fe8fb3e6b87ecae4e07f757c1a9588bbb1c06
|
|
and related css::util::SearchOptions2
The TransliterationModules enum has it's constants spread over multiple
UNO enum/constant-collections - TransliterationModules and
TransliterationModulesExtra, which means that most code simply uses
sal_Int32.
Wrap them up into a better bundle so that only the lowest layer needs to
deal directly with the UNO constants.
Change-Id: I1edeab79fcc7817a4a97c933ef84ab7015bb849b
Reviewed-on: https://gerrit.libreoffice.org/34582
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
Tested-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: I10a8298e5379fa93a5d3616202a7802c0eda1cbb
|
|
Change-Id: I1cf7b7d20a0b567c7363c5a9abc5bf1195b57262
|
|
Change-Id: I2ebe54af7b769189e248b1a3af55ee1b6a66174a
Reviewed-on: https://gerrit.libreoffice.org/29399
Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
Tested-by: Jenkins <ci@libreoffice.org>
|
|
Change-Id: If4bc7dd5af49cca85f474e817cc3cc358c2b48c2
|
|
Change-Id: I76f09a09fd6c3b114ba74737d4a1ba5dad0fd28f
|
|
Change-Id: I833ad2779d0eda6f5183b2dd062dffaa410a7937
|
|
At least '\' (search in Word) and '~' (search in Excel) should be
supported as escape character.
Being able to restrict a match to entire selection instead of substring
speeds up the Calc match whole cell scenario.
Change-Id: Ice242b9cd59009f172b724e03c2cc08feda4cd3c
|
|
Change-Id: Ifa23592f5e72438267926acf89c113a69a7f0ca0
|
|
With simple transliteration, TextSearch::searchForward used to use
whole string to perform the search, then started to create substring
to search. But it left the precautions from
commit c00601dab0f5533b152cd63cec0a89bfec1ba95f by Eike Rathke:
searching for $ may actually return a result set pointing behind the
search string which it does with the ICU regex engine.
The precaution made it to skip reverse mapping if index was 0.
Commit 9aae521b451269007f03527c83645b8b935eb419 by Michael Stahl
didn't consider the case when nStop is 0, and startPos > 0. Then it
tried to get offset[-1].
Anyway, using value of startPos in those conditions seems illogical.
Removed those precautions (and made assertions for that).
Fixed handling zero indexes.
Change-Id: I2066abc51fff7fb7323bc7f6198bdea06439d4f3
Reviewed-on: https://gerrit.libreoffice.org/19840
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Eike Rathke <erack@redhat.com>
|
|
Change-Id: I11822c50fa66d038a3d6f38054ab35c2e613f077
|
|
Change-Id: I2ea407acd763ef2d7dae2d3b8f32525523ac8274
|