Age | Commit message | Author |
|
Don’t skip search results that are in the middle of a grapheme cluster
(AKA cell in LO speak).
It is not clear why it was done like this, as these checks are present
all the way back to the first commit of this file:
commit 36eb193f4809221af42c01c5ac226a97cf74ec21
Author: Rüdiger Timm <rt@openoffice.org>
Date: Tue Apr 8 15:01:00 2003 +0000
INTEGRATION: CWS calc06 (1.1.2); FILE ADDED
2003/03/26 15:54:42 er 1.1.2.1: #i3393# moved from i18n module, cleaned out tools module usage, and added support for regexp
But ignoring such results and only for so-called “complex” scripts seems
arbitrary, and as the linked issue shows, people want to be able to
search for combining marks. Furthermore, it prevents searching for a
base character followed by a combining mark, unless ignoring diacritics
is enabled.
Change-Id: I530788d928861ddfa18dd7b813d0a13f53c0b77b
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/138410
Tested-by: Jenkins
Reviewed-by: خالد حسني <khaled@aliftype.com>
|
|
See also 263961306ede0656ebb7904034a2172615ce81d0
Change-Id: Ib64ec43dba59ffddb34fe7f1a0f0d2e589c3455c
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/133063
Tested-by: René Engelhard <rene@debian.org>
Reviewed-by: Eike Rathke <erack@redhat.com>
|
|
Change-Id: I41b4e64d6d3a9310d819904c8d32c689e6300bcd
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/131296
Tested-by: Jenkins
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
30.6001 represents the average month length in days when January and
February are excluded. According to the link below, it is calculated as
(365-31-28)/10 = 30.6, but 30.6001 was used instead as a workaround for
a floating point bug.
"30.6001, 25 year old hack?"
https://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv011.cgi?read=31650
The value 30.6 is used as i18nutil::monthDaysWithoutJanFeb here
instead of 30.6001. The new value is ~30.60000038 which is > 30.6, so
the calculations should be correct. In order to make sure, a unit test
is added, and part of the values are checked against the values
calculated by this website:
Julian Day and Civil Date Calculator
https://core2.gsfc.nasa.gov/time/julian.html
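The effect of the constant can be sketched with the textbook Julian Day formula. This is only an illustration: kMonthDaysWithoutJanFeb is a local stand-in for i18nutil::monthDaysWithoutJanFeb, its float type is an assumption based on the ~30.60000038 value quoted above, and monthTerm and julianDay are hypothetical helper names. The month term needs floor(factor * (month + 1)) to hit the exact multiples 153, 306 and 459, which works as long as the stored constant is not below 30.6.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for the constant named in the commit; the float type is an assumption.
constexpr float kMonthDaysWithoutJanFeb = 30.6f;

// Days accumulated by the months preceding `month` in a March-based year
// (month runs from 3 to 14, with 13 and 14 meaning January and February).
inline int monthTerm(int month, double factor) {
    return static_cast<int>(std::floor(factor * (month + 1)));
}

// Julian Day at 0h UT for a Gregorian calendar date with a positive year.
inline double julianDay(int year, int month, int day) {
    assert(month >= 1 && month <= 12);
    if (month <= 2) { // treat Jan and Feb as months 13 and 14 of the previous year
        --year;
        month += 12;
    }
    const int century = year / 100;
    const int gregorianCorrection = 2 - century + century / 4;
    return std::floor(365.25 * (year + 4716))
           + monthTerm(month, kMonthDaysWithoutJanFeb)
           + day + gregorianCorrection - 1524.5;
}
```

For example, julianDay(2000, 1, 1) gives 2451544.5 and julianDay(1970, 1, 1) gives 2440587.5, the same results the 30.6001 variant produces.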
Change-Id: I8cc7e046514dc3de652a1c37399e351cb2b614dc
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/125813
Tested-by: Jenkins
Reviewed-by: Eike Rathke <erack@redhat.com>
|
|
Unicode 14, 5 new scripts, 12 new Unicode blocks.
In i18npool/qa/cppunit/test_breakiterator.cxx
TestBreakIterator::testLao() had to be disabled/adapted.
Needs to be investigated, see comments there.
As is, Lao script word break has regressions.
Correct UBLOCK_TANGUT_SUPPLEMENT Unicode range endpoint to
0x18D7F, see
https://www.unicode.org/versions/Unicode14.0.0/erratafixed.html
for which ublock_getCode(0x18D8F) now returned UBLOCK_NO_BLOCK and
thus luckily the assert in svx/source/dialog/charmap.cxx hit.
Change-Id: I4bad16ecfab3f44be365b8f884c57f34af68218e
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/125322
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
...from aa2064c5c5f23f6f4b7bc44e12345b37f66995bc "Improve
loplugin:stringliteralvar", similar to 8b32a3edad52f8ac5e5f0f49b4f4e80954c2fd25
"Fix stack-use-after-scope" (though this case doesn't appear to have caused any
actual issues).
(After manual inspection, there appear to be no further missing `static` at
least in aa2064c5c5f23f6f4b7bc44e12345b37f66995bc "Improve
loplugin:stringliteralvar".)
Change-Id: I2b3d0d8d2af1d65f0c5bef8a858107020a620974
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/124137
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
The scenarios are:
1. Calling sequence's begin() and end() in pairs to pass to algorithms
(both calls use getArray(), which does the COW checks)
2. In addition to #1, calling end() again when checking result of find
algorithms, and/or begin() to calculate result's distance
3. Using non-const sequences in range-based for loops, which internally
do #1
4. Assigning sequence to another sequence variable, and then modifying
one of them
In many cases, the sequences could be made const, or treated as const
for the purposes of the algorithms (using std::as_const, std::cbegin,
and std::cend). Where an algorithm modifies the sequence, it was changed
to only call getArray() once. For that, css::uno::toNonConstRange was
introduced, which returns a struct (subclass of std::pair) with two
iterators [begin, end], that are calculated using one call to begin()
and one call to getLength().
To handle #4, css::uno::Sequence::swap was introduced, that swaps the
internal pointer to uno_Sequence. So when a local Sequence variable
should be assigned to another variable, and the latter will be modified
further, it's now possible to use swap instead, so the two sequences
are kept independent.
The modified places were found by temporarily removing non-const end().
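The scenarios above can be illustrated with a toy copy-on-write container. This is a sketch only: CowSeq, NonConstRange, sortInPlace and the cowChecks counter are hypothetical and merely mimic the idea; they are not css::uno::Sequence or the real css::uno::toNonConstRange.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy copy-on-write sequence; NOT css::uno::Sequence, just the same idea.
class CowSeq {
public:
    explicit CowSeq(std::vector<int> v)
        : data_(std::make_shared<std::vector<int>>(std::move(v))) {}

    std::size_t getLength() const { return data_->size(); }

    // Const access never copies.
    const int* getConstArray() const { return data_->data(); }

    // Non-const access has to make the buffer unique first (the COW check).
    int* getArray() {
        ++cowChecks;
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<int>>(*data_);
        return data_->data();
    }

    // Scenario 1: each of these calls getArray(), so a begin()/end() pair
    // performs the COW check twice.
    int* begin() { return getArray(); }
    int* end() { return getArray() + getLength(); }
    const int* begin() const { return getConstArray(); }
    const int* end() const { return getConstArray() + getLength(); }

    // Scenario 4: exchange the buffers instead of sharing one that would
    // have to be copied on the next modification.
    void swap(CowSeq& other) noexcept { data_.swap(other.data_); }

    int cowChecks = 0; // counts getArray() calls, for demonstration only

private:
    std::shared_ptr<std::vector<int>> data_;
};

// Hypothetical analogue of toNonConstRange: one getArray() call, with the
// end iterator derived from the length.
struct NonConstRange : std::pair<int*, int*> {
    NonConstRange(int* b, int* e) : std::pair<int*, int*>(b, e) {}
    int* begin() const { return first; }
    int* end() const { return second; }
};

inline NonConstRange toNonConstRange(CowSeq& s) {
    int* b = s.getArray();
    return NonConstRange(b, b + s.getLength());
}

// A modifying algorithm that triggers the COW check only once.
inline void sortInPlace(CowSeq& s) {
    auto range = toNonConstRange(s);
    std::sort(range.begin(), range.end());
}
```

In this toy, std::sort(s.begin(), s.end()) bumps cowChecks twice, sortInPlace(s) bumps it once, and iterating a const CowSeq does not bump it at all.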
Change-Id: I8fe2787f200eecb70744e8b77fbdf7a49653f628
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/123542
Tested-by: Jenkins
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
add Korean Numbering test cases
1. koreanCounting
2. koreanLegal
3. koreanDigital
4. koreanDigital2
fix Korean Hanja number code point for Zero (0)
Following MS Office's numFmt String example
https://docs.microsoft.com/en-us/openspecs/office_standards/ms-docx/a1bb5809-e361-4e49-8e16-7f1a67da4121
the Korean Hanja notation for zero is `零 U+96F6` in MS Word 2019 and in
that document.
So, fix the Korean Hanja number code point for Zero (0) to `零 U+96F6`.
Change-Id: I1a5b95640a93e7fbc3a0e724b154587877b198a0
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/120676
Tested-by: Jenkins
Reviewed-by: Eike Rathke <erack@redhat.com>
|
|
Change-Id: I28366c4e868e97b70e016b056b73b88b4cc8b812
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/119677
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
Change-Id: I9d4df6f63dc9ebc90e99fecce14b3551c74f7f1a
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/119675
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
The minimal ICU version check is in configure.ac.
Change-Id: Ib6480cd3290dabb45d87c6dcbcc9b5513d172e21
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/117119
Tested-by: Jenkins
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
Issue the "instead of O[U]String, pass [u16]string_view" diagnostic also for
operator call arguments. (The "rather than copy, pass subView()" diagnostic is
already part of handleSubExprThatCouldBeView, so no need to repeat it explicitly
for operator call arguments.)
(And many call sites don't even require an explicit [u16]string_view, esp. with
the recent ad48b2b02f83eed41fb1eb8d16de7e804156fcf1 "Optimized OString operator
+= overloads". Just some test code in sal/qa/ that explicitly tests the
O[U]String functionality had to be excluded.)
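The kind of call site these diagnostics target can be sketched with standard types. This is an illustration only: std::string and std::string_view stand in for O[U]String and [u16]string_view, and hasTagCopy and hasTagView are hypothetical names.

```cpp
#include <string>
#include <string_view>

// Copying variant: substr() allocates a temporary std::string only to pass
// it to operator ==.
inline bool hasTagCopy(const std::string& s) {
    return s.substr(0, 3) == "tdf";
}

// View variant: the operator call argument is a view into the original
// buffer, so nothing is allocated or copied.
inline bool hasTagView(std::string_view s) {
    return s.substr(0, 3) == "tdf";
}
```

Both functions return the same result for every input; only the view variant avoids the temporary.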
Change-Id: I8d55ba5a7fa16a563f5ffe43d245125c88c793bc
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/115589
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
...by re-enabling the code temporarily #if'ed-out in
a528392e71bc70136021be4e3d83732fccbb885e "Fixed/improved
loplugin:cppunitassertequals" (and which then triggers lots of other
loplugin:cppunitassertequals CPPUNIT_ASSERT -> CPPUNIT_ASSERT_EQUAL warnings).
For two css::uno::Reference equality comparisons in cppu/qa/test_any.cxx, it
was more straightforward to rewrite them with an explicit call to operator ==
(which silences loplugin:cppunitassertequals) than to adapt them to
CPPUNIT_ASSERT_EQUAL's requirement for arguments of identical types.
In sc/qa/unit/ucalc_pivottable.cxx, ScDPItemData needs toString, which has been
implemented trivially for now, but might want to combine that with the
DEBUG_PIVOT_TABLE-only ScDPItemData::Dump.
Change-Id: Iae6d09cf69bd4e52fe4411bba9e50c48e696291c
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110546
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
...to also consider O[U]String ctors taking pointer and length
Change-Id: Iea5041634bfbf5054a1317701e30b56f72e940fb
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110025
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
similar to...
commit 8578a1c9d167c19f1d8038fac5946b4b3cae305e
Date: Thu Nov 26 15:47:26 2020 +0200
tdf#138481: Trust the built-in break iterator character data in ICU
Don't use our own char.txt.
the char_in.txt hasn't really changed since 2008 and is woefully out of
date at this point.
we have cppunit tests for the only documented bug that touched
char_in.txt, #i111152# and tdf#40292, for tdf#40292 change the test
to test what was actually reported as a bug
Change-Id: I8e35b102b0a46d2c63e47e055e472892f65022ac
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106763
Tested-by: Jenkins
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
|
|
During text search, ASCII apostrophe ' (U+0027)
of the search term matches the typographic
apostrophe ’ (U+2019) of the text, too.
There was a UX regression in document editing from
commit e6fade1ce133039d28369751b77ac8faff6e40cb
(tdf#38395 enable smart apostrophe replacement by default),
because the Find and Replace window and the Find toolbar
don't replace the ASCII apostrophe, so the search term
no longer matched the text (which now contains the automatically
replaced typographic apostrophes) as it did before that commit.
Regex search hasn't been modified, i.e. searching for U+2019
still requires a search term with U+2019.
The typographic apostrophes of a search term only match
ASCII apostrophes of the text if the search term also
contains an ASCII apostrophe.
Note: as a more sophisticated solution, it's possible to
add a new default transliteration option for this later.
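The matching rules described above can be sketched as follows. This is a minimal illustration, assuming the behavior comes from normalizing U+2019 to U+0027 in both strings whenever the search term contains an ASCII apostrophe; findWithApostrophes is a hypothetical name and the real search code may work differently.

```cpp
#include <algorithm>
#include <string>

// Returns the index of the first match, or std::u16string::npos.
inline std::u16string::size_type findWithApostrophes(std::u16string text,
                                                     std::u16string term) {
    constexpr char16_t ascii = u'\'';
    constexpr char16_t typographic = u'\u2019';
    // Only a term that contains an ASCII apostrophe relaxes the comparison.
    if (term.find(ascii) != std::u16string::npos) {
        std::replace(text.begin(), text.end(), typographic, ascii);
        std::replace(term.begin(), term.end(), typographic, ascii);
    }
    return text.find(term);
}
```

With this sketch, the term don't finds don’t in the text, while the term don’t (without any ASCII apostrophe) does not find don't.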
Change-Id: I5121edbef5cf34fdd5b5f9ba3c046a06329a756a
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105717
Tested-by: Jenkins
Reviewed-by: László Németh <nemeth@numbertext.org>
|
|
...introduced with 5d8f0fad50f90195a11873c70ddab4644f5839ea "Adapt
CPPUNIT_ASSERT to C++20 deleted ostream << for sal_Unicode (aka char16_t)" (see
there for details) but erroneously removed with
877f40ac3f2add2b6dc37bae280d4d98dd102286 "tdf#42949 Fix new IWYU warnings in
directories [h-r]*"
Change-Id: Id22a4c0fdfe1471e2455ec3316f2c6c93cc00b22
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105549
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Change-Id: Iefe922c2e0d605114d54673d63eccc5e4abd545d
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/102143
Tested-by: Jenkins
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Found with bin/find-unneeded-includes
Only removal proposals are dealt with here.
Change-Id: I886b6f446293d3b1cfbf4ae05e8dbd7fabab9f20
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105510
Tested-by: Jenkins
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
|
|
Change-Id: I5644ca7f2ef1b251ce1c262d3001ca48f2ed9edd
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/95482
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
This is the last padded numbering type that is supported by Word but was
not supported by Writer.
Change-Id: Ica1a0843897c61a4b569105fd21e5bfe7b5012cb
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/90912
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
|
|
This is the actual numbering the customer needed; pad-to-2 and pad-to-3
were added just for completeness (since Word has them and they're related).
Change-Id: I7fdf67488955ab3ee0db169f11fffd21d9cc1e3b
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/90791
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
|
|
This is similar to the existing padded numbering, but that one padded to
2. Another difference is pad-to-2 has more file format support:
pad-to-3 is not supported in DOC and RTF.
Change-Id: Ie2ac2691c58a89e181d24d7002cf873ebab380c4
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/90656
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
|
|
lcl_formatArabicZero() looks a bit over-complicated with its hardcoded
limit of 2. Word supports limits of 3, 4 and 5 as well, so prepare for
handling them in a generic way.
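A generic version of the padding could look roughly like this. It is a sketch only: padWithZeros is a hypothetical name, it uses std::string instead of OUString, and it is not the actual lcl_formatArabicZero() code.

```cpp
#include <cstddef>
#include <string>

// Formats `number` in decimal and left-pads it with '0' up to `width` digits.
// Numbers that already have at least `width` digits are returned unchanged.
inline std::string padWithZeros(unsigned number, std::size_t width) {
    std::string digits = std::to_string(number);
    if (digits.size() < width)
        digits.insert(0, width - digits.size(), '0');
    return digits;
}
```

With the width as a parameter, the limits of 2, 3, 4 and 5 mentioned above all go through the same code path, e.g. padWithZeros(7, 2) yields "07" and padWithZeros(7, 5) yields "00007".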
Change-Id: If6e5634b11616f0ac05e1387016e22f4b171bbfb
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/89864
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
Tested-by: Jenkins
|
|
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html> "char8_t
backward compatibility remediation", as implemented now by
<https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0c5b35933e5b150df0ab487efb2f11ef5685f713>
"libstdc++: P1423R3 char8_t remediation (2/4)" for -std=c++2a, deletes
operator << overloads that would print an integer rather than a (presumably
expected) character.
But for simplicity (and to avoid issues with non-printing characters), keep
printing an integer here.
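Keeping the integer output amounts to an explicit cast before streaming. A minimal sketch, using char16_t as the example character type and the hypothetical helpers printCodeUnit and codeUnitToString:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Streams the numeric value of a UTF-16 code unit. The cast selects the
// unsigned overload, so the deleted character overloads are never considered.
inline std::ostream& printCodeUnit(std::ostream& os, char16_t c) {
    return os << static_cast<unsigned>(c);
}

inline std::string codeUnitToString(char16_t c) {
    std::ostringstream os;
    printCodeUnit(os, c);
    return os.str();
}
```

This prints 65 for u'A', which also avoids any trouble with non-printing characters.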
Change-Id: I751b99ee32d418eb488131ffa130d6f7d6d38dc7
Reviewed-on: https://gerrit.libreoffice.org/84348
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Change-Id: I385587a922c555c320a45dcc6d644315b72510e9
Reviewed-on: https://gerrit.libreoffice.org/81278
Tested-by: Jenkins
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
It started out as a wrapper around character literals, but has by now become a
wrapper around arbitrary single characters. Besides updating the documentation,
this change is a mechanical
for i in $(git grep -Fl OUStringLiteral1); do sed -i -e s/OUStringLiteral1/OUStringChar/g "$i"; done
Change-Id: I1b9eaa4b3fbc9025ce4a4bffea3db1c16188b76f
Reviewed-on: https://gerrit.libreoffice.org/80892
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Specifically no "using ..." in a header file. For the 5 places
that actually need it..
Change-Id: I5a9d4efa3b19df51a05e7de0b4a825876290579c
Reviewed-on: https://gerrit.libreoffice.org/74814
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
Found with bin/find-unneeded-includes
Only removal proposals are dealt with here.
Change-Id: Ic331c845f0a8f06c4a8f8f79b6f87e26ca7c3a7d
Reviewed-on: https://gerrit.libreoffice.org/72972
Tested-by: Jenkins
Reviewed-by: Michael Stahl <Michael.Stahl@cib.de>
|
|
Change-Id: Ifce0dc836ea8500105ebcf3302f37ad6968929ec
Reviewed-on: https://gerrit.libreoffice.org/60607
Tested-by: Jenkins
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
idea originally from either tml or moggi, can't remember which
Change-Id: Id78d75035036d3aa1666e33469c6eeb38f9e624d
Reviewed-on: https://gerrit.libreoffice.org/55126
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
BreakIterator_CTL in the non CharacterIteratorMode::SKIPCELL mode did
not handle UTF-16 surrogate pairs at all, causing backspace to delete
lone surrogates which is really bad. Just copied the corresponding code
from BreakIterator_Unicode.
Additionally, BreakIterator_th was not correctly skipping non-Thai text
and always treating one character as Thai.
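The surrogate handling when stepping backwards can be sketched like this. It is an illustration under assumptions: previousCodePointStart is a hypothetical helper, not the code copied from BreakIterator_Unicode, and pos is assumed to be at most text.size().

```cpp
#include <cstddef>
#include <string_view>

constexpr bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
constexpr bool isLowSurrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Returns the index at which the code point ending just before `pos` starts,
// so that a backspace at `pos` removes a whole surrogate pair, never half.
inline std::size_t previousCodePointStart(std::u16string_view text, std::size_t pos) {
    if (pos == 0)
        return 0;
    std::size_t start = pos - 1;
    if (start > 0 && isLowSurrogate(text[start]) && isHighSurrogate(text[start - 1]))
        --start; // step over the high surrogate as well
    return start;
}
```

For a text consisting of 'a', one supplementary-plane character (two code units) and 'b', stepping back from index 3 lands on index 1 rather than on the lone low surrogate at index 2.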
Change-Id: Ia379327e042ff602fc19a485c4cbd1a3683f9230
Reviewed-on: https://gerrit.libreoffice.org/54631
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Eike Rathke <erack@redhat.com>
|
|
Change-Id: Iafdc3593b7136f24e741dc63e3c46344636154eb
|
|
auto-rewrite with <https://gerrit.libreoffice.org/#/c/47798/> "Enable
loplugin:cstylecast for some more cases" plus
solenv/clang-format/reformat-formatted-files
Change-Id: I5ca5f27425c150f58e5ec3f2392dda43a857fc33
|
|
Korean words are composed of Hangul and are separated
by space or newline. This patch improves line breaking
function in CJK break iterator so that it does not
break Korean words in the middle. It now breaks at the
first character of the last Korean word.
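The adjustment can be sketched as moving a proposed break position back to the start of the Hangul run it falls into. This is an illustration only: adjustKoreanBreak is a hypothetical name, only the Hangul Syllables block (U+AC00 to U+D7A3) is checked, and the fallback for a word that starts at index 0 is an assumption rather than documented behavior.

```cpp
#include <cstddef>
#include <string_view>

constexpr bool isHangulSyllable(char16_t c) { return c >= 0xAC00 && c <= 0xD7A3; }

// `pos` is the index of the first character that would go to the next line.
inline std::size_t adjustKoreanBreak(std::u16string_view text, std::size_t pos) {
    if (pos == 0 || pos >= text.size())
        return pos;
    // Already at a word boundary: nothing to adjust.
    if (!isHangulSyllable(text[pos]) || !isHangulSyllable(text[pos - 1]))
        return pos;
    std::size_t start = pos;
    while (start > 0 && isHangulSyllable(text[start - 1]))
        --start; // walk back to the first character of the word
    // If the word begins at index 0, moving the break would leave an empty
    // line, so keep the original position (an assumption of this sketch).
    return start > 0 ? start : pos;
}
```

For "한국어 문서", a break proposed at index 5 (inside 문서) is moved to index 4, the first character of that word.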
Change-Id: I91b20733c0c5ec4755bf68eb0d7c14c42c1f3556
Reviewed-on: https://gerrit.libreoffice.org/42987
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Eike Rathke <erack@redhat.com>
|
|
Change-Id: I23368c3ce6d29c7b2e758e209e5a8315e82a2818
Reviewed-on: https://gerrit.libreoffice.org/40051
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: Ia9b20a8ca95684cbeb21e3425972c43ba50df3cd
Reviewed-on: https://gerrit.libreoffice.org/39187
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: Ia4e02589d2fe79a27b83200a0e7a528a2c806519
Reviewed-on: https://gerrit.libreoffice.org/38508
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: I7307cc96eac5868ed26e8ace1fc3c1a93e1bfec4
|
|
this modifies codemaker so that, for an UNO enum, we generate code
that effectively looks like:
#if defined LIBO_INTERNAL_ONLY && HAVE_CX11_CONSTEXPR
enum class XXX {
    ONE = 1
};
constexpr auto XXX_ONE = XXX::ONE;
#else
...the old normal way..
#endif
which means that for LO internal code, the enums are scoped.
The "constexpr auto" trick acts like an alias so we don't have to
use scoped naming everywhere.
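A compilable sketch of the alias trick, assuming the alias is meant to keep the old unscoped spelling XXX_ONE available; toUnderlying is a hypothetical helper and the real generated headers contain more than this:

```cpp
#include <type_traits>

// Placeholder enum mirroring the XXX example above.
enum class XXX { ONE = 1 };

// Alias so that existing code spelling the constant as XXX_ONE keeps compiling.
constexpr auto XXX_ONE = XXX::ONE;

// The alias still has the enum type, so the scoped-enum type checks apply.
static_assert(std::is_same_v<decltype(XXX_ONE), const XXX>,
              "the alias keeps the enum type");
static_assert(!std::is_convertible_v<XXX, int>,
              "scoped enums do not convert to int implicitly");

// Conversions to the underlying value have to be spelled out.
inline int toUnderlying(XXX value) { return static_cast<int>(value); }
```

Code can therefore keep writing XXX_ONE while accidental use as a plain integer is rejected at compile time.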
Change-Id: I3054ecb230e8666ce98b4a9cb87b384df5f64fb4
Reviewed-on: https://gerrit.libreoffice.org/34546
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
We are classifying characters in the “Combining Diacritical Marks”
Unicode block with ScriptType::LATIN, but these are combining marks and
can combine with any script and should have been ScriptType::WEAK. Just
removing them from the range in scriptList does the trick as we will
fallback to getting the script classification based on the Unicode
script property.
Change-Id: I3577f4b03360a1c8e094a207f01b6bbb6abbaf30
Reviewed-on: https://gerrit.libreoffice.org/35811
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
|
|
Change-Id: I056fe8fb3e6b87ecae4e07f757c1a9588bbb1c06
|
|
and related css::util::SearchOptions2
The TransliterationModules enum has its constants spread over multiple
UNO enum/constant-collections - TransliterationModules and
TransliterationModulesExtra, which means that most code simply uses
sal_Int32.
Wrap them up into a better bundle so that only the lowest layer needs to
deal directly with the UNO constants.
Change-Id: I1edeab79fcc7817a4a97c933ef84ab7015bb849b
Reviewed-on: https://gerrit.libreoffice.org/34582
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
Tested-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: I10a8298e5379fa93a5d3616202a7802c0eda1cbb
|
|
Change-Id: I1cf7b7d20a0b567c7363c5a9abc5bf1195b57262
|
|
Change-Id: I2ebe54af7b769189e248b1a3af55ee1b6a66174a
Reviewed-on: https://gerrit.libreoffice.org/29399
Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
Tested-by: Jenkins <ci@libreoffice.org>
|
|
Change-Id: If4bc7dd5af49cca85f474e817cc3cc358c2b48c2
|
|
Change-Id: I76f09a09fd6c3b114ba74737d4a1ba5dad0fd28f
|
|
Change-Id: I833ad2779d0eda6f5183b2dd062dffaa410a7937
|
|
At least '\' (search in Word) and '~' (search in Excel) should be
supported as escape characters.
Being able to restrict a match to entire selection instead of substring
speeds up the Calc match whole cell scenario.
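A configurable escape character can be sketched as below, assuming the search in question is a '*' and '?' wildcard search. wildcardMatch is a hypothetical function, not the LibreOffice implementation. It always matches the entire text, which corresponds to the whole-cell case, and a lone escape at the end of the pattern matches nothing.

```cpp
#include <cstddef>
#include <string_view>

// '*' matches any run of characters, '?' matches one character, and `escape`
// makes the following pattern character literal.
inline bool wildcardMatch(std::string_view pattern, std::string_view text, char escape) {
    constexpr std::size_t none = std::string_view::npos;
    std::size_t p = 0, t = 0;
    std::size_t starP = none, starT = 0; // last '*' seen, for backtracking
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            starP = p++; // remember the star and first try an empty run
            starT = t;
        } else if (p + 1 < pattern.size() && pattern[p] == escape
                   && pattern[p + 1] == text[t]) {
            p += 2; // escaped character matched literally
            ++t;
        } else if (p < pattern.size() && pattern[p] != escape
                   && (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (starP != none) {
            p = starP + 1; // let the last star swallow one more character
            t = ++starT;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*')
        ++p; // trailing stars match the empty remainder
    return p == pattern.size();
}
```

The same function covers both conventions by passing '\\' for the Word style and '~' for the Excel style.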
Change-Id: Ice242b9cd59009f172b724e03c2cc08feda4cd3c
|