Age | Commit message (Collapse) | Author |
|
Change-Id: Ia544298334364ece3b3963a4adc00c5e01189b91
Reviewed-on: https://gerrit.libreoffice.org/44654
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mark Page <aptitude@btconnect.com>
|
|
...at least in com_GCC_class.mk (com_MSC_class.mk will be addressed in a follow-
up commit), after the recent loplugin:includeform clean-up.
Two static libraries built from external sources needed adjustment, two
compilerplugin tests needed adjustment (which wasn't found by
loplugin:includeform, by design), and one more adjustment in
sal/textenc/generate/.
Change-Id: Idad5ae355a02ae130369a9a45b5f5925ab48ffef
Reviewed-on: https://gerrit.libreoffice.org/44174
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Change-Id: I539ca8b9dee5edc5fc2282a2b9b0ffd78bad8b11
|
|
Change-Id: Ie75875974f054ff79bd64b1c261e79e2b78eb7fc
Reviewed-on: https://gerrit.libreoffice.org/43540
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
(probably because the encoding is RTL_TEXTENCODING_DONTKNOW; that may
still happen in various places, so can't use anything stronger than
SAL_WARN here)
Change-Id: I75993c9ad629104055c171389ed7708649747d9b
|
|
Change-Id: I4de6d9d781b2f2313d8fd338b34dcb31434efe91
Reviewed-on: https://gerrit.libreoffice.org/41638
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
|
|
<https://msdn.microsoft.com/en-us/library/windows/desktop/
dd374130(v=vs.85).aspx> "WideCharToMultiByte function" suggests that there now
is CP_SYMBOL, "Windows 2000: Symbol code page (42)." And a little test program
on Windows indicates that our RTL_TEXTENCODING_SYMBOL is working the same way as
CP_SYMBOL, where MultiByteToWideChar maps 00..1F to U+0000..1F and 20..FF to
U+F020..F0FF.
At least CppunitTest_writerfilter_rtftok, when testing
writerfilter/qa/cppunittests/rtftok/data/pass/EDB-18940-1.rtf, goes into case
RTF_FCHARSET in RTFDocumentImpl::dispatchValue
(writerfilter/source/rtftok/rtfdispatchvalue.cxx) with nParam matching
aRTFEncodings[2] (i.e., a mapping from charset 2 to codepage 42, see
writerfilter/source/rtftok/rtfcharsets.cxx), then passes 42 into
rtl_getTextEncodingFromWindowsCodePage and obtains an unhelpful
RTL_TEXTENCODING_DONTKNOW.
testFdo72031 (sw/qa/extras/rtfexport/rtfexport2.cxx, CppunitTest_sw_rtfexport2)
needed to be adapted, as the circled plus from the Symbol font is now internally
represented as U+F0C5, not (somewhat bogusly) as U+00C5 (aka LATIN CAPTIAL
LETTER A WITH RING ABOVE). But, when displayed with the Symobl font, the glyph
that is actually shown remains the circled plus.
Turns out changing rtl_getTextEncodingFromWindowsCodePage would start to make
CppunitTest_sw_rtfimport fail:
Sep 20 15:49:24 <sberg> vmiklos, with
<https://gerrit.libreoffice.org/#/c/42477/>, testN823675
(sw/qa/extras/rtfimport/rtfimport.cxx) fails, the aFont.Name is not "Symbol";
sw/qa/extras/rtfimport/data/n823675.rtf contains a \fonttbl that specifies
\f3 to have \fcharset2 (i.e., symbol font) and fontname "Symbol". However,
RTFDocumentImpl::checkUnicode
(writerfilter/source/rtftok/rtfdocumentimpl.cxx)
converts m_aHexBuffer (containing "Symbol;") with nCurrentEncoding apparently
being the encoding specified by \fcharset2 (i.e., now RTL_TEXTENCODING_SYMBOL
instead of old RTL_TEXTENCODING_DONTKNOW), so the resulting OUString is
garbage
(instead of the byte-for-byte conversion to Unicode "Symbol;" that
RTL_TEXTENCODING_DONTKNOW would do there); do you know whether such \fonttbl
fontnames should actually be interpreted in the given \fcharset?
Sep 20 15:49:24 <IZBot> gerrit: »Map Windows code page 42 to
RTL_TEXTENCODING_SYMBOL« by Stephan Bergmann for master [NEW]
Sep 20 15:51:15 <vmiklos> sberg: let me check if the spec covers that
Sep 20 15:54:29 <mst_> sberg: i think the name is typically encoded in the
font's encoding but probably they have to make a (likely undocumented)
exception for symbol encoding
Sep 20 15:57:46 <vmiklos> sberg: the spec only says that \fcharset is about
the encoding of the content using that font, i don't see it described what
would be the encoding of the font name itself
Sep 20 15:58:51 <vmiklos> sberg: i'm not sure about if that encoding should or
should not affect the encoding of the font name in general, but indeed at
least for 2 (symbol encoding) you're right, Word doesn't encoding the font
name with that encoding, either.
Sep 20 15:59:30 <sberg> vmiklos, mst_, at the top of page 14 of
Word2007RTFSpec9.docx I see "Note that runs of text marked with a particular
font index (see \fN in the Font Table section) use the codepage for that font
as given by \cpgN or implied by \fcharsetN, unless they use Unicode RTF
described in the following section." Would that match what mst_ says?
Sep 20 15:59:33 <vmiklos> so if it helps you case to handle at as e.g. ascii,
just for that encoding, i think there would be no problem with that.
Sep 20 16:00:07 <vmiklos> sberg: that still talks about the content using the
font, not the strings (font names) in the font table itself, i think.
Sep 20 16:01:17 <sberg> vmiklos, what's the control word to select such a
font, also \fN? I don't see any such in n823675.rtf
Sep 20 16:02:16 <mikekaganski> loircbot: e.g. \af3
Sep 20 16:02:31 <mikekaganski> sberg: ^
Sep 20 16:02:47 <mst_> 04d5a280beeeb6e056df68395dc9c3b3a674361b
Sep 20 16:02:50 <IZBot> core - related: fdo#77979: writerfilter RTF import:
read encoded font name -
http://cgit.freedesktop.org/libreoffice/core/commit/?id=04d5a280beeeb6e056df68395dc9c3b3a674361b
Sep 20 16:02:52 <mst_> sberg: ^
Sep 20 16:04:05 <sberg> mst_, thanks; so there's likely an (implicit?)
exception for \fcharset2, as you say
Sep 20 16:04:33 <mst_> that's most plausible, our own font code is full of
exceptions for "symbol fonts" too
Sep 20 16:05:19 <sberg> mikekaganski, ENOCONTEXT
Sep 20 16:05:36 <mikekaganski> sberg: [17:01:16] sberg: vmiklos, what's the
control word to select such a font, also \fN? I don't see any such in
n823675.rtf
Sep 20 16:06:32 <sberg> mikekaganski, so you say selection is done with \af3
instead of \f3?
Sep 20 16:06:40 <mikekaganski> sberg: yes, in that case
Sep 20 16:07:34 <mst_> i think there are several different keywords that apply
fonts, but can't remember the whole list
Sep 20 16:08:10 <mst_> \fN shoudl be one of them though
Sep 20 16:22:18 <sberg> vmiklos, so who generated that
sw/qa/extras/rtfimport/data/n823675.rtf, was it manually created and lacks a
\cpgN before "Symbol"?
Sep 20 16:29:17 <sberg> vmiklos, (after further reading of the RTF spec):
disregard the "and lacks a \cpgN before 'Symbol'" part of my above question
Sep 20 16:30:27 <mst_> sberg: i suggest not reading too much about encoding in
RTF, it gets pretty lovecraftian pretty fast...
Sep 20 16:32:58 <vmiklos> sberg: given how short that bugdoc is, i'm pretty
sure i cut it down manually to something readable from a multi-MB real bugdoc
Sep 20 16:33:07 <sberg> mst_, do you have a recommendation how I could get
that "don't use symbol font encoding to read a symbol font's name" into
writerfilter/source/rtftok/rtfdocumentimpl.cxx?
RTFDocumentImpl::checkUnicode lacks the context to tell whether it is using
m_aStates.top().nCurrentEncoding to convert a fontname, and the caller of
checkUnicode (at the end of RTFDocumentImpl::resolveChars in this case)
appears to lack the context, too
Sep 20 16:33:12 <mst_> various Old Ones from The Time Before Unicode and their
Backward Compatibility Tentacles etc.
Sep 20 16:34:59 <sberg> vmiklos, anyway, that "so there's likely an
(implicit?) exception for \fcharset2" hypothesis sounds sane, so we should
probably implement it (if only you or mst_ can give me a good hint how...)
Sep 20 16:35:13 <vmiklos> sberg: looking for a code pointer
Sep 20 16:36:05 <mst_> sberg: m_aStates.top().eDestination ==
Destination::FONTENTRY should be the relevant check?
Sep 20 16:36:17 <vmiklos> sberg: RTFDocumentImpl::text() is where the text is
taken, Destination::FONTENTRY is the state on the parser stack which is a
font entry in the font table. so to detect "your case" during decoding a byte
array into a string, m_aStates.top().eDestination == Destination::FONTENTRY
is what you want
Sep 20 16:36:35 <vmiklos> ah good, two independent matching hints are
promising ;)
Sep 20 16:37:35 <sberg> mst_, vmiklos, ah; but what also looks dodgy is that
checkUnicode operates there on "Symbol;" including the closing ";" of the
full <fontinfo>, not just the <fontname> part of the <fontinfo>
Sep 20 16:39:24 <vmiklos> sberg: i think we already assume that the only
"token" in the font entry destination that is not bound to a control world
(\foo) is the font name
Sep 20 16:40:52 <vmiklos> sberg:
writerfilter/source/rtftok/rtfdocumentimpl.cxx:1237 is where we simply strip
away the trailing semicolon, there is no further separation between the font
name and other character content inside the destination (apart from the
control words and their arguments)
Sep 20 16:42:18 <sberg> vmiklos, OK, thanks; I'll just pretend I haven't seen
those dodgy details :)
...so I'm switching to (somewhat arbitrarily) RTL_TEXTENCODING_MS_1252 there now
Change-Id: Iebd1bcecb7fa71c489798154d3356062b052775e
Reviewed-on: https://gerrit.libreoffice.org/42477
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
Tested-by: Stephan Bergmann <sbergman@redhat.com>
|
|
...so that at least some typos of using OUSTRING_TO_OSTRING_CVTFLAGS (0x566)
instead of OSTRING_TO_OUSTRING_CVTFLAGS (0x333) can be found. (Unfortunately,
in the other direction, 0x333 is a valid combination of
RTL_UNICODETOTEXT_FLAGS_*.)
Change-Id: I7cfb3677b103ae90de88833cc93b0a5384607e15
Reviewed-on: https://gerrit.libreoffice.org/42288
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
There are apparently various places that want to check for a Unicode scalar
value rather than for a Unicode code point. Changed those uses of
rtl::isUnicodeCodePoint where that was obvious. (For changing
svtools/source/svrtf/svparser.cxx see 8e0fb74dc01927b60d8b868548ef8fe1d7a80ce3
"Revert 'svtools: HTML import: don't put lone surrogates in OUString'".) Other
uses of rtl::isUnicodeCodePoint might also want to use rtl::isUnicodeScalarValue
instead.
As a side effect, this change also introduces rtl::isSurrogate, which is useful
in a few places as well.
Change-Id: I9245f4f98b83877145a4d392f0ddb7c5d824a535
|
|
Change-Id: Ia37347108f9fe7094f055a5c6f2ec9511c3aff1d
|
|
Consider non-shortest forms, surrogates, and representations of values larger
than 0x10FFFF (which can even cover five or six bytes, for historical reasons)
as "invalid" (they used to be considered as "undefined" instead).
This is in response to fc670f637d4271246691904fd649358ce2e7be59 "svtools: HTML
import: don't put lone surrogates in OUString" (which can now be reverted again
in a follow-up commit). My fear would have been that some places in the code
rely on the original, relaxed handling, but at least 'make check' still
succeeded for me.
Change-Id: I017e6c04ed3c577c3694b417167f853987a1d1ce
|
|
Change-Id: Iaf3ed48d0eb0e5a57770af057c565a7310bb96d4
Reviewed-on: https://gerrit.libreoffice.org/40761
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
|
|
Change-Id: I795059109e23800987cda6f04c58ab18c488ad07
Reviewed-on: https://gerrit.libreoffice.org/41242
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Julien Nabet <serval2412@yahoo.fr>
|
|
Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3
Reviewed-on: https://gerrit.libreoffice.org/41194
Reviewed-by: Michael Stahl <mstahl@redhat.com>
Tested-by: Michael Stahl <mstahl@redhat.com>
|
|
I have kept the old mispelled constant for backwards compatibility
Change-Id: I128a2eec76d00cc5ef058cd6a0c35a7474d2411e
Reviewed-on: https://gerrit.libreoffice.org/39995
Reviewed-by: Chris Sherlock <chris.sherlock79@gmail.com>
Tested-by: Chris Sherlock <chris.sherlock79@gmail.com>
|
|
Change-Id: I5710b51e53779c222cec0bf08cd34bda330fec4b
Reviewed-on: https://gerrit.libreoffice.org/39737
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
Tested-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: Ic883a07b30069ca6342d7521c8ad890f4326f0ec
Reviewed-on: https://gerrit.libreoffice.org/39549
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
...as had been done in sal/textenc/tcvtlat1.tab, in
5034e8217c9844293dc94e5dff0bdc865ad7a91a "Translate German comments and debug
strings (leftovers in dirs sal to sc)". What those comments actually want to
say is that, despite being the ASCII control character region, those encodings
contain some graphic characters there.
Change-Id: Iefad9c45341891c3e91e6b9fb265d3070d1220ad
|
|
Translates leftovers found using a custom regex and manually checking
the rest of the affected file.
Additionally:
- A few spelling fixes
Change-Id: I2b83bd11adf520b90bb29c8ea624990759dad3c6
Reviewed-on: https://gerrit.libreoffice.org/39427
Reviewed-by: Michael Stahl <mstahl@redhat.com>
Tested-by: Michael Stahl <mstahl@redhat.com>
|
|
Change-Id: I0fee8bcddaeea48335e3be05761d2ad2c45020e2
Reviewed-on: https://gerrit.libreoffice.org/39238
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: If00e371c3cd3ae616309a172c875faed016e391b
Reviewed-on: https://gerrit.libreoffice.org/38773
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
Tested-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: I7b2680898dbfc49185fb949349d81f4ac615a470
Reviewed-on: https://gerrit.libreoffice.org/38593
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
with command
> git grep -l osl/diagnose.h *.cxx |
xargs grep -L -w 'OSL_\w*' |
xargs sed -i '/#include *\(<\|\"\)osl\/diagnose.h\(>\|\"\).*/d'
headers need more work
Change-Id: I906519ebbd47a04703b4fa5943b2f7abea7a97ab
Reviewed-on: https://gerrit.libreoffice.org/37350
Tested-by: Jochen Nitschke <j.nitschke+logerrit@ok.de>
Reviewed-by: Michael Stahl <mstahl@redhat.com>
|
|
run it against sal,cppu,cppuhelper
I had to run this multiple times to catch all the cases in each module,
and it requires some hand-tweaking of the resulting output - clang-tidy
is not very good about cleaning up trailing spaces, and aligning
things nicely.
Change-Id: I00336345f5f036e12422b98d66526509380c497a
Reviewed-on: https://gerrit.libreoffice.org/36194
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
Tested-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
the start value is out by one row
Change-Id: I77ed154358516ccd47a090cf7ed45bb609bc81a3
Reviewed-on: https://gerrit.libreoffice.org/34992
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
|
|
Change-Id: I5fc62060e7d01c6b492a0e91323f753cc676bf71
Reviewed-on: https://gerrit.libreoffice.org/35639
Reviewed-by: Julien Nabet <serval2412@yahoo.fr>
Tested-by: Julien Nabet <serval2412@yahoo.fr>
|
|
Change-Id: Iedca07be5300c68e180e0c71d2d6eb0052f5cced
Reviewed-on: https://gerrit.libreoffice.org/34801
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
sjis 82 and 83 correctly denote the last index available
while 84 reports one past the last available index
0x84BE is the last value to map, not 0x84BF
Change-Id: Idcadc2d554ee59586f6e2f2775301fe69c94d55a
|
|
... in modules reportdesign to sdext
Replace with C++11 delete copy-constructur and
copy-assignment.
Remove boost/noncopyable.hpp includes.
Make some overloaded ctors explicit
(most in sd slidesorter).
Add deleted copy-assignment in sc/inc/chart2uno.hxx.
Change-Id: I21d4209f0ddb00063ca827474516a05ab4bb2f9a
Reviewed-on: https://gerrit.libreoffice.org/23970
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noelgrandin@gmail.com>
|
|
Including no keywords from extern "C" blocks
Change-Id: I87f2ed75888b51ec9e0cb75566bf7c2351b479b4
Reviewed-on: https://gerrit.libreoffice.org/23675
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
...and fix its documentation, and use it throughout the code base.
Change-Id: I349bc2009b1b0aa7115ea90bc6ecd0a812f63698
|
|
Change-Id: I1bc6c87fcd6e5e96362623be94c59be216a3b2b8
|
|
Change-Id: Ia41f4f0ca30ae3346d0720271478ec5bcdab797b
Reviewed-on: https://gerrit.libreoffice.org/18967
Reviewed-by: Samuel Mehrbrodt <Samuel.Mehrbrodt@cib.de>
Tested-by: Samuel Mehrbrodt <Samuel.Mehrbrodt@cib.de>
|
|
Most important perl scripts already use a portable method
but fix these nevertheless.
(cherry picked from commit 93496a6c1e5b2ae1756dc9b8043e2267209387da)
Change-Id: Id8b5e974356701e66f9dd8a6bd7a58fd4f5633ae
|
|
This addresses some cppcheck warnings.
Change-Id: Ie492fb9c106b37c3fe7b0105236ad6315f4f159e
Reviewed-on: https://gerrit.libreoffice.org/17921
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Thorsten Behrens <Thorsten.Behrens@CIB.de>
|
|
This may reduce some degree of dependency on boost.
Done by running a script like:
git grep -l '#include *.boost/scoped_array.hpp.' \
| xargs sed -i -e 's@#include *.boost/scoped_array.hpp.@#include <memory>@'
git grep -l '\(boost::\)\?scoped_array<\([^<>]*\)>' \
| xargs sed -i -e 's/\(boost::\)\?scoped_array<\([^<>]*\)>/std::unique_ptr<\2[]>/'
... and then killing duplicate or unnecessary includes,
while changing manually
m_xOutlineStylesCandidates in xmloff/source/text/txtimp.cxx,
extensions/source/ole/unoconversionutilities.hxx, and
extensions/source/ole/oleobjw.cxx.
Change-Id: I3955ed3ad99b94499a7bd0e6e3a09078771f9bfd
Reviewed-on: https://gerrit.libreoffice.org/16289
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
|
|
Change-Id: Ibd373cddb1e25f05528e627349953b5f7d115330
|
|
Change-Id: Iad01e628b692a6dcf882696908de0ef1f24c04c4
|
|
Change-Id: Ie225e9816a626e2581b8309f59519ab99e1af04a
|
|
Change-Id: Ib5920745f9f464493b7b134f871e090726d17d0c
|
|
rtl/string.hxx and rtl/ustring.hxx both unnecessarily #include <sal/log.hxx>
(and don't make use of it themselves), but many other files happen to depend on
it. Cleaned up some, but something like
grep -FwL sal/log.hxx $(git grep -Elw \
'SAL_INFO|SAL_INFO_IF|SAL_WARN|SAL_WARN_IF') -- \*.cxx)
shows lots more files that potentially need fixing before the include can be
removed from rtl/string.hxx and rtl/ustring.hxx.
Change-Id: Ibf033363e83d37851776f392dc0b077381cd8b90
|
|
ie.
void f(void);
becomes
void f();
I used the following command to make the changes:
git grep -lP '\(\s*void\s*\)' -- *.cxx \
| xargs perl -pi -w -e 's/(\w+)\s*\(\s*void\s*\)/$1\(\)/g;'
and ran it for both .cxx and .hxx files.
Change-Id: I314a1b56e9c14d10726e32841736b0ad5eef8ddd
|
|
Change-Id: I7d3c36e564b3a5286ebd32527575472313119587
|
|
...in accordance with
<http://www.openoffice.org/udk/cpp/man/spec/textconversion.html>
Change-Id: I62013f89c421722770123db8a5794e63d3572e6b
|
|
Change-Id: I5e370445affbcd32b05588111f74590bf24f39d6
|
|
and we can include a few less headers
Change-Id: Id742849ff4c1c37a2b861aa3d6ab823f00ea87f8
|
|
...by adding some further SAL_DLLPUBLIC_RTTI type annotations (cf.
b4f6b26b5a1a78fecfa95ec2eb7ac8b80495d8aa "SAL_DLLPUBLIC_RTTI for proper RTTI
visibility for LLVM") and by making sure relevant function types do not use
incomplete types in their parameter and return types (which would make the RTTI
hidden).
Change-Id: Id7aadcbc0704b9759968ae36266fc9ce11a2e340
|
|
Change-Id: Ie54d340478412e62b87d66e287fd8a3963e97898
|
|
Change-Id: I0ad9681a8b31d78cefce5b66040415154a1c7a99
|
|
This addresses some cppcheck warnings.
Change-Id: Id5f90757571e76a2c05a4cbd37020e1f6a6b2033
Reviewed-on: https://gerrit.libreoffice.org/13544
Reviewed-by: Noel Grandin <noelgrandin@gmail.com>
Tested-by: Noel Grandin <noelgrandin@gmail.com>
|