Age | Commit message | Author |
|
Change-Id: I879a52820d78d9151ef64dd21612379f617f66e2
Reviewed-on: https://gerrit.libreoffice.org/42726
Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com>
Tested-by: Tamás Zolnai <tamas.zolnai@collabora.com>
|
|
<https://msdn.microsoft.com/en-us/library/windows/desktop/dd374130(v=vs.85).aspx>
"WideCharToMultiByte function" suggests that there now
is CP_SYMBOL, "Windows 2000: Symbol code page (42)." And a little test program
on Windows indicates that our RTL_TEXTENCODING_SYMBOL is working the same way as
CP_SYMBOL, where MultiByteToWideChar maps 00..1F to U+0000..1F and 20..FF to
U+F020..F0FF.
At least CppunitTest_writerfilter_rtftok, when testing
writerfilter/qa/cppunittests/rtftok/data/pass/EDB-18940-1.rtf, goes into case
RTF_FCHARSET in RTFDocumentImpl::dispatchValue
(writerfilter/source/rtftok/rtfdispatchvalue.cxx) with nParam matching
aRTFEncodings[2] (i.e., a mapping from charset 2 to codepage 42, see
writerfilter/source/rtftok/rtfcharsets.cxx), then passes 42 into
rtl_getTextEncodingFromWindowsCodePage and obtains an unhelpful
RTL_TEXTENCODING_DONTKNOW.
testFdo72031 (sw/qa/extras/rtfexport/rtfexport2.cxx, CppunitTest_sw_rtfexport2)
needed to be adapted, as the circled plus from the Symbol font is now internally
represented as U+F0C5, not (somewhat bogusly) as U+00C5 (aka LATIN CAPITAL
LETTER A WITH RING ABOVE). But, when displayed with the Symbol font, the glyph
that is actually shown remains the circled plus.
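A minimal sketch of the byte-to-code-point mapping described above (00..1F unchanged, 20..FF moved to U+F020..F0FF). The helper name symbolByteToCodePoint is hypothetical and is not part of sal; it only illustrates why byte C5 now ends up as U+F0C5 rather than U+00C5.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the sal implementation): maps one byte of
// Symbol-encoded text to a code point as the message describes it.
constexpr char32_t symbolByteToCodePoint(std::uint8_t b)
{
    // 00..1F pass through unchanged; 20..FF land at U+F020..U+F0FF.
    return b < 0x20 ? char32_t(b) : char32_t(0xF000 + b);
}

// The circled plus from the Symbol font sits at byte C5.
static_assert(symbolByteToCodePoint(0xC5) == 0xF0C5, "C5 maps to U+F0C5");
```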
Turns out changing rtl_getTextEncodingFromWindowsCodePage would start to make
CppunitTest_sw_rtfimport fail:
Sep 20 15:49:24 <sberg> vmiklos, with
<https://gerrit.libreoffice.org/#/c/42477/>, testN823675
(sw/qa/extras/rtfimport/rtfimport.cxx) fails, the aFont.Name is not "Symbol";
sw/qa/extras/rtfimport/data/n823675.rtf contains a \fonttbl that specifies
\f3 to have \fcharset2 (i.e., symbol font) and fontname "Symbol". However,
RTFDocumentImpl::checkUnicode
(writerfilter/source/rtftok/rtfdocumentimpl.cxx)
converts m_aHexBuffer (containing "Symbol;") with nCurrentEncoding apparently
being the encoding specified by \fcharset2 (i.e., now RTL_TEXTENCODING_SYMBOL
instead of old RTL_TEXTENCODING_DONTKNOW), so the resulting OUString is
garbage
(instead of the byte-for-byte conversion to Unicode "Symbol;" that
RTL_TEXTENCODING_DONTKNOW would do there); do you know whether such \fonttbl
fontnames should actually be interpreted in the given \fcharset?
Sep 20 15:49:24 <IZBot> gerrit: »Map Windows code page 42 to
RTL_TEXTENCODING_SYMBOL« by Stephan Bergmann for master [NEW]
Sep 20 15:51:15 <vmiklos> sberg: let me check if the spec covers that
Sep 20 15:54:29 <mst_> sberg: i think the name is typically encoded in the
font's encoding but probably they have to make a (likely undocumented)
exception for symbol encoding
Sep 20 15:57:46 <vmiklos> sberg: the spec only says that \fcharset is about
the encoding of the content using that font, i don't see it described what
would be the encoding of the font name itself
Sep 20 15:58:51 <vmiklos> sberg: i'm not sure about if that encoding should or
should not affect the encoding of the font name in general, but indeed at
least for 2 (symbol encoding) you're right, Word doesn't encoding the font
name with that encoding, either.
Sep 20 15:59:30 <sberg> vmiklos, mst_, at the top of page 14 of
Word2007RTFSpec9.docx I see "Note that runs of text marked with a particular
font index (see \fN in the Font Table section) use the codepage for that font
as given by \cpgN or implied by \fcharsetN, unless they use Unicode RTF
described in the following section." Would that match what mst_ says?
Sep 20 15:59:33 <vmiklos> so if it helps you case to handle at as e.g. ascii,
just for that encoding, i think there would be no problem with that.
Sep 20 16:00:07 <vmiklos> sberg: that still talks about the content using the
font, not the strings (font names) in the font table itself, i think.
Sep 20 16:01:17 <sberg> vmiklos, what's the control word to select such a
font, also \fN? I don't see any such in n823675.rtf
Sep 20 16:02:16 <mikekaganski> loircbot: e.g. \af3
Sep 20 16:02:31 <mikekaganski> sberg: ^
Sep 20 16:02:47 <mst_> 04d5a280beeeb6e056df68395dc9c3b3a674361b
Sep 20 16:02:50 <IZBot> core - related: fdo#77979: writerfilter RTF import:
read encoded font name -
http://cgit.freedesktop.org/libreoffice/core/commit/?id=04d5a280beeeb6e056df68395dc9c3b3a674361b
Sep 20 16:02:52 <mst_> sberg: ^
Sep 20 16:04:05 <sberg> mst_, thanks; so there's likely an (implicit?)
exception for \fcharset2, as you say
Sep 20 16:04:33 <mst_> that's most plausible, our own font code is full of
exceptions for "symbol fonts" too
Sep 20 16:05:19 <sberg> mikekaganski, ENOCONTEXT
Sep 20 16:05:36 <mikekaganski> sberg: [17:01:16] sberg: vmiklos, what's the
control word to select such a font, also \fN? I don't see any such in
n823675.rtf
Sep 20 16:06:32 <sberg> mikekaganski, so you say selection is done with \af3
instead of \f3?
Sep 20 16:06:40 <mikekaganski> sberg: yes, in that case
Sep 20 16:07:34 <mst_> i think there are several different keywords that apply
fonts, but can't remember the whole list
Sep 20 16:08:10 <mst_> \fN shoudl be one of them though
Sep 20 16:22:18 <sberg> vmiklos, so who generated that
sw/qa/extras/rtfimport/data/n823675.rtf, was it manually created and lacks a
\cpgN before "Symbol"?
Sep 20 16:29:17 <sberg> vmiklos, (after further reading of the RTF spec):
disregard the "and lacks a \cpgN before 'Symbol'" part of my above question
Sep 20 16:30:27 <mst_> sberg: i suggest not reading too much about encoding in
RTF, it gets pretty lovecraftian pretty fast...
Sep 20 16:32:58 <vmiklos> sberg: given how short that bugdoc is, i'm pretty
sure i cut it down manually to something readable from a multi-MB real bugdoc
Sep 20 16:33:07 <sberg> mst_, do you have a recommendation how I could get
that "don't use symbol font encoding to read a symbol font's name" into
writerfilter/source/rtftok/rtfdocumentimpl.cxx?
RTFDocumentImpl::checkUnicode lacks the context to tell whether it is using
m_aStates.top().nCurrentEncoding to convert a fontname, and the caller of
checkUnicode (at the end of RTFDocumentImpl::resolveChars in this case)
appears to lack the context, too
Sep 20 16:33:12 <mst_> various Old Ones from The Time Before Unicode and their
Backward Compatibility Tentacles etc.
Sep 20 16:34:59 <sberg> vmiklos, anyway, that "so there's likely an
(implicit?) exception for \fcharset2" hypothesis sounds sane, so we should
probably implement it (if only you or mst_ can give me a good hint how...)
Sep 20 16:35:13 <vmiklos> sberg: looking for a code pointer
Sep 20 16:36:05 <mst_> sberg: m_aStates.top().eDestination ==
Destination::FONTENTRY should be the relevant check?
Sep 20 16:36:17 <vmiklos> sberg: RTFDocumentImpl::text() is where the text is
taken, Destination::FONTENTRY is the state on the parser stack which is a
font entry in the font table. so to detect "your case" during decoding a byte
array into a string, m_aStates.top().eDestination == Destination::FONTENTRY
is what you want
Sep 20 16:36:35 <vmiklos> ah good, two independent matching hints are
promising ;)
Sep 20 16:37:35 <sberg> mst_, vmiklos, ah; but what also looks dodgy is that
checkUnicode operates there on "Symbol;" including the closing ";" of the
full <fontinfo>, not just the <fontname> part of the <fontinfo>
Sep 20 16:39:24 <vmiklos> sberg: i think we already assume that the only
"token" in the font entry destination that is not bound to a control world
(\foo) is the font name
Sep 20 16:40:52 <vmiklos> sberg:
writerfilter/source/rtftok/rtfdocumentimpl.cxx:1237 is where we simply strip
away the trailing semicolon, there is no further separation between the font
name and other character content inside the destination (apart from the
control words and their arguments)
Sep 20 16:42:18 <sberg> vmiklos, OK, thanks; I'll just pretend I haven't seen
those dodgy details :)
...so I'm switching to (somewhat arbitrarily) RTL_TEXTENCODING_MS_1252 there now
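Roughly, the idea from the discussion above can be sketched as follows. The types here are hypothetical stand-ins for the parser state, and restricting the exception to the symbol encoding is an assumption based on the "exception for \fcharset2" hypothesis, not a copy of the actual change.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real parser types; names are illustrative.
enum class Destination { NORMAL, FONTENTRY };
enum class Encoding { DontKnow, Symbol, Ms1252 };

struct ParserState
{
    Destination eDestination;
    Encoding nCurrentEncoding;
};

// Pick the encoding used to turn buffered bytes into text.
Encoding encodingForBufferedText(const ParserState& state)
{
    // Assumption: inside a font table entry the only free text is the font
    // name, which is not written in the symbol encoding, so fall back to
    // (somewhat arbitrarily) Windows-1252 there.
    if (state.eDestination == Destination::FONTENTRY
        && state.nCurrentEncoding == Encoding::Symbol)
    {
        return Encoding::Ms1252;
    }
    return state.nCurrentEncoding;
}
```

With this, a font name such as "Symbol;" inside the font table is decoded byte-for-byte, while ordinary text runs using the same font still go through the symbol encoding.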
Change-Id: Iebd1bcecb7fa71c489798154d3356062b052775e
Reviewed-on: https://gerrit.libreoffice.org/42477
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
Tested-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Change-Id: I69d4157aaf6570cecd51ea59df20556914942e06
Reviewed-on: https://gerrit.libreoffice.org/42565
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
Change-Id: I95b90128e93f0d88ed73601bcc5a7ca9279d4cf1
Reviewed-on: https://gerrit.libreoffice.org/42560
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
Change-Id: Iadb0ebb66809c192fb817b8c7cf2f8cdb4d0b874
Reviewed-on: https://gerrit.libreoffice.org/42419
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
...after 0083b33650c2f584ceff6eeaf9ef6993bfe0ae9b "sal: -Werror,-Wsign-compare
(32-bit)" cast the check against SAL_MAX_UINT32 to unsigned, anyway. (And the
check for == 0 is already handled in the platform-generic part below.)
Change-Id: I0d0354cb9368bffef5d3aa835f865524d106a6f3
|
|
Change-Id: I204716eea112a1c99f6ac4df0d138c4c7d8b68e3
|
|
...that had been removed too eagerly with
1f543b817a7e8bdef9482c4c61bc1672bf04e39f "osl/w32: don't use 8-bit string
functions"
Change-Id: I9d16cc5ff9b779457d8d70c7f206d5e684342c63
|
|
Change-Id: I538fe5b41156366e0e87b3a93e58a3947afd18f5
Reviewed-on: https://gerrit.libreoffice.org/42398
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
https://msdn.microsoft.com/en-us/aa383745
Change-Id: I83528dc8e6a5e119e7aa816219d35f1ea3723b96
Reviewed-on: https://gerrit.libreoffice.org/42338
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
This uses the system TID as the LO thread identifier for Linux,
iOS and macOS, just as the Windows backend already does. While at
it, use pthread functions on Linux, FreeBSD and macOS to set the
thread name.
We already depend on macOS 10.6 for dispatch support and Linux
supports pthread_setname_np since glibc 2.12, which is included in
our baseline. SYS_gettid is available since Linux 2.4.11.
I just copied the FreeBSD info from stackoverflow, while at it.
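For illustration, a Linux-only sketch of the two calls mentioned above. The wrapper names currentSystemThreadId and setCurrentThreadName are hypothetical and this is not the osl code; it assumes glibc, where g++ enables the GNU extensions needed for pthread_setname_np.

```cpp
#include <cassert>
#include <cstring>
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

// Linux only: returns the kernel thread ID (TID) of the calling thread.
long currentSystemThreadId()
{
    return syscall(SYS_gettid);
}

// Linux/glibc only: names the calling thread. The kernel limits thread
// names to 15 characters plus the terminating NUL.
bool setCurrentThreadName(const char* name)
{
    return pthread_setname_np(pthread_self(), name) == 0;
}
```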
Change-Id: I39cdd09e952c0a2286d39f938c64b2d2d2f1ef91
Reviewed-on: https://gerrit.libreoffice.org/42071
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
Tested-by: Jenkins <ci@libreoffice.org>
|
|
Was missing in commit bc9a2ba677ce3fcd46c2bbef6e8faeacb14292c1.
Change-Id: I9b270d2363aee847afccb1a5d8f4c3d0af16fc27
|
|
Change-Id: I1f09d7a0f6c0c87b8b672d6bffcaa397ed4ff6e6
Reviewed-on: https://gerrit.libreoffice.org/42317
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
...so that at least some typos of using OUSTRING_TO_OSTRING_CVTFLAGS (0x566)
instead of OSTRING_TO_OUSTRING_CVTFLAGS (0x333) can be found. (Unfortunately,
in the other direction, 0x333 is a valid combination of
RTL_UNICODETOTEXT_FLAGS_*.)
Change-Id: I7cfb3677b103ae90de88833cc93b0a5384607e15
Reviewed-on: https://gerrit.libreoffice.org/42288
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
We don't support Windows versions that don't have Unicode support.
In all currently-supported Windows versions, all FooA() functions
are actually wrappers for FooW() variants. We don't need to fall back
to A functions, and also don't need to do useless Unicode<->8-bit
conversions that also may result in data loss.
Change-Id: Ie21337c150ec0b9b4386c27d46f6596c14c4dd9f
Reviewed-on: https://gerrit.libreoffice.org/42281
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
Change-Id: If088efdcbf41807ac8fad2410953abb685c8ea01
Reviewed-on: https://gerrit.libreoffice.org/42274
Reviewed-by: Michael Stahl <mstahl@redhat.com>
Tested-by: Jenkins <ci@libreoffice.org>
|
|
There are apparently various places that want to check for a Unicode scalar
value rather than for a Unicode code point. Changed those uses of
rtl::isUnicodeCodePoint where that was obvious. (For changing
svtools/source/svrtf/svparser.cxx see 8e0fb74dc01927b60d8b868548ef8fe1d7a80ce3
"Revert 'svtools: HTML import: don't put lone surrogates in OUString'".) Other
uses of rtl::isUnicodeCodePoint might also want to use rtl::isUnicodeScalarValue
instead.
As a side effect, this change also introduces rtl::isSurrogate, which is useful
in a few places as well.
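The distinction can be sketched as below. The functions live in a made-up namespace sketch so as not to be mistaken for the real rtl:: declarations; the ranges are the standard Unicode ones (code points up to 0x10FFFF, surrogates D800..DFFF).

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Any value in the Unicode code space, surrogates included.
constexpr bool isUnicodeCodePoint(std::uint32_t c) { return c <= 0x10FFFF; }

// High and low surrogates, D800..DFFF.
constexpr bool isSurrogate(std::uint32_t c) { return c >= 0xD800 && c <= 0xDFFF; }

// A scalar value is a code point that is not a surrogate.
constexpr bool isUnicodeScalarValue(std::uint32_t c)
{
    return isUnicodeCodePoint(c) && !isSurrogate(c);
}

}
```

A lone surrogate such as 0xD800 is therefore a code point but not a scalar value, which is the case the callers mentioned above care about.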
Change-Id: I9245f4f98b83877145a4d392f0ddb7c5d824a535
|
|
Change-Id: Ide7591b4012f23e8a3faa705537937efcac435a2
|
|
Change-Id: Ia37347108f9fe7094f055a5c6f2ec9511c3aff1d
|
|
Consider non-shortest forms, surrogates, and representations of values larger
than 0x10FFFF (which can even cover five or six bytes, for historical reasons)
as "invalid" (they used to be considered as "undefined" instead).
This is in response to fc670f637d4271246691904fd649358ce2e7be59 "svtools: HTML
import: don't put lone surrogates in OUString" (which can now be reverted again
in a follow-up commit). My fear would have been that some places in the code
rely on the original, relaxed handling, but at least 'make check' still
succeeded for me.
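To make the three rejected categories concrete, here is a standalone well-formedness check. It is only a sketch under the assumptions stated in the comments, with a hypothetical function name, and is not the sal converter, which reports errors through conversion flags rather than a bool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true if the bytes are well-formed UTF-8. Non-shortest forms,
// surrogates, and values above 0x10FFFF (including the historical
// five- and six-byte forms) are all rejected.
bool isWellFormedUtf8(const std::string& bytes)
{
    std::size_t i = 0;
    while (i < bytes.size())
    {
        const std::uint8_t lead = static_cast<std::uint8_t>(bytes[i]);
        std::size_t extra;     // number of continuation bytes expected
        std::uint32_t value;   // code point being assembled
        std::uint32_t minimum; // smallest value allowed for this length
        if (lead < 0x80) { extra = 0; value = lead; minimum = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; value = lead & 0x1F; minimum = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; value = lead & 0x0F; minimum = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; value = lead & 0x07; minimum = 0x10000; }
        else return false; // stray continuation byte, or a 5/6-byte lead

        if (bytes.size() - i - 1 < extra) return false; // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k)
        {
            const std::uint8_t c = static_cast<std::uint8_t>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false; // not a continuation byte
            value = (value << 6) | (c & 0x3F);
        }
        if (value < minimum) return false;                      // non-shortest form
        if (value >= 0xD800 && value <= 0xDFFF) return false;   // surrogate
        if (value > 0x10FFFF) return false;                     // beyond Unicode
        i += extra + 1;
    }
    return true;
}
```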
Change-Id: I017e6c04ed3c577c3694b417167f853987a1d1ce
|
|
Change-Id: Id78e9c0ca29ff2e52591f3d446431ac23c20ab7a
Reviewed-on: https://gerrit.libreoffice.org/41926
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
|
|
Change-Id: I132d3c66f0562e2c37a02eaf4c168d06c2b473eb
Reviewed-on: https://gerrit.libreoffice.org/41874
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: I04be0f4fe8c98909b37586080096ee05341f956f
|
|
...as got broken with 3d5be8cd31bcf6fce8772133298d2ae076361362 "osl: give
warning on socket error (win32), move Flag definition (unx)"
Change-Id: Ib68540596b0bc2cda3e809e765c7d41ca45dda71
|
|
Change-Id: Ifc0b88963bcd28e5709accdf892b2cb16b2b55eb
|
|
Change-Id: Ie58189e254f31d77cb4adafe599c48e64ef6a1a3
Reviewed-on: https://gerrit.libreoffice.org/41611
Reviewed-by: Chris Sherlock <chris.sherlock79@gmail.com>
Tested-by: Chris Sherlock <chris.sherlock79@gmail.com>
|
|
Change-Id: I34b773f32a055dfe85ec9c42e72a9f51ee8fea10
Reviewed-on: https://gerrit.libreoffice.org/41610
Reviewed-by: Chris Sherlock <chris.sherlock79@gmail.com>
Tested-by: Chris Sherlock <chris.sherlock79@gmail.com>
|
|
Change-Id: I0354cf98ba3804505970e881dfff45a4d9a227da
Reviewed-on: https://gerrit.libreoffice.org/41609
Reviewed-by: Chris Sherlock <chris.sherlock79@gmail.com>
Tested-by: Chris Sherlock <chris.sherlock79@gmail.com>
|
|
Change-Id: I6979493a629874ab38629b655c47c299b24abdcd
Reviewed-on: https://gerrit.libreoffice.org/41608
Reviewed-by: Chris Sherlock <chris.sherlock79@gmail.com>
Tested-by: Chris Sherlock <chris.sherlock79@gmail.com>
|
|
Change-Id: Iaf3ed48d0eb0e5a57770af057c565a7310bb96d4
Reviewed-on: https://gerrit.libreoffice.org/40761
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
|
|
Change-Id: I61b006051e708636f0bba83b16de36f4571b8da7
|
|
Change-Id: I4e8ae42e2e0c285d34098bebd637ad6d4abaf6a0
|
|
Change-Id: I795059109e23800987cda6f04c58ab18c488ad07
Reviewed-on: https://gerrit.libreoffice.org/41242
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Julien Nabet <serval2412@yahoo.fr>
|
|
Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3
Reviewed-on: https://gerrit.libreoffice.org/41194
Reviewed-by: Michael Stahl <mstahl@redhat.com>
Tested-by: Michael Stahl <mstahl@redhat.com>
|
|
Each such precondition violation for that URE API function would already result
in osl_File_E_INVAL anyway.
Change-Id: I279949ae8f341e6272bb4574da712fd693461acd
|
|
Change-Id: I3d13e1c0bb6aa4a7aacc463198747c1368ebc9b4
Reviewed-on: https://gerrit.libreoffice.org/38114
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: I899a8126c9d971601fea6c77eca165718aea0ac5
Reviewed-on: https://gerrit.libreoffice.org/41237
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: Ia470f643e3eefeccc14183133603db260460bd53
Reviewed-on: https://gerrit.libreoffice.org/41212
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
...introduced with aeb3853a21435f00f225d751e56184e875bc46ed "osl: (Win32) check
allocated pipe succeeded, otherwise needs to fail"
Change-Id: Ieeb0b1755e74f583d1b52447eb84f7512eb07914
|
|
To enable finding the source of the duplicate calls, I add new SAL
API (only for internal use) to retrieve and symbolise stack
backtraces.
The theory is that it is relatively cheap to just store a backtrace,
but quite expensive to symbolise it to strings. Note that the
backtrace() library we use on Linux does not do a particularly
good job, but it gives enough information that developers can use
the addr2line tool to get more precise info.
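The store-now, symbolise-later split can be sketched with glibc's backtrace() and backtrace_symbols() from <execinfo.h>. RawBacktrace, captureBacktrace and symbolise are hypothetical names for illustration, not the new SAL API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <execinfo.h>
#include <string>
#include <vector>

// Hypothetical holder for raw return addresses.
struct RawBacktrace
{
    std::vector<void*> frames;
};

// Cheap step: just record up to maxDepth return addresses.
RawBacktrace captureBacktrace(int maxDepth)
{
    RawBacktrace bt;
    if (maxDepth <= 0)
        return bt;
    bt.frames.resize(static_cast<std::size_t>(maxDepth));
    const int n = backtrace(bt.frames.data(), maxDepth);
    bt.frames.resize(static_cast<std::size_t>(n));
    return bt;
}

// Expensive step: turn the stored addresses into strings, only on demand.
std::vector<std::string> symbolise(const RawBacktrace& bt)
{
    std::vector<std::string> lines;
    if (bt.frames.empty())
        return lines;
    char** symbols
        = backtrace_symbols(bt.frames.data(), static_cast<int>(bt.frames.size()));
    if (symbols == nullptr)
        return lines;
    for (std::size_t i = 0; i < bt.frames.size(); ++i)
        lines.emplace_back(symbols[i]);
    std::free(symbols); // one allocation holds all the strings
    return lines;
}
```

A caller would capture at every registration and call symbolise() only when the assert actually fires; the addresses in the resulting strings can then be fed to addr2line.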
Explanation of fixes in the code that triggered the assert:
In SwFrameHolder, we need to only call StartListening() if the
pFrame member is actually changing. We also need to call
EndListening() on the old values when pFrame changes.
In SwNavigationPI, there is already a StartListening() call in
the only place we assign to m_pCreateView.
In ImpEditEngine, we need to ignore duplicates, because it is
doing a ref-counting thing. By storing duplicates on the listener
list, it doesn't need to keep track of which stylesheets its
child nodes are using. Given that it therefore will see
duplicate events, there are probably some performance optimisation
opportunities here.
In MasterPageObserver::Implementation::RegisterDocument, we
seem to be getting called multiple times with the same
SdDrawDocument, so just check if we've been registered already
before calling StartListening().
In SvxShape::impl_initFromSdrObject, do the same thing we do
elsewhere in this class, i.e. only call StartListening()
if the model has changed.
Change-Id: I7eae5c774e1e8d56f0ad7bec80e4df0017b287ac
Reviewed-on: https://gerrit.libreoffice.org/41045
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: Ief8bd59c903625ba65b75114b7b52c3b7ecbd331
Reviewed-on: https://gerrit.libreoffice.org/41019
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
|
|
Change-Id: I840f0638e02819398bae901d799a1882e0045d8e
|
|
Change-Id: Id88149b2ebbf869474192cc22b862093db21aeb6
|
|
As such the code handles it perfectly fine, and returns an error from
the function, but.
Change-Id: I356b8140381d3ccd21ff0a7f5c666552571b12f4
|
|
Change-Id: Ia94797159a617ff7c9c2d875e1f51892d5b698b2
|
|
Change-Id: I4fb364ceb34e0851f2d04c403333bf428e8cfa98
Reviewed-on: https://gerrit.libreoffice.org/40305
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Caolán McNamara <caolanm@redhat.com>
Tested-by: Caolán McNamara <caolanm@redhat.com>
|
|
drop down to the C API so we can truly pass ownership of the handle to
xNoAcquirePipe
Change-Id: I12acbec81726ae4a451b501bea5492a5865c0cc4
|
|
Change-Id: Ib23bbd9403f44fd7aa3635a3febb6533b1f9edad
|
|
Change-Id: I8918cd97f9ab89f0a2f7f95cd59b706ca5a55e2b
|
|
Change-Id: Ic9ddcaa7c699830216e157bd9dfc09d30b50b3e6
|