path: root/writerfilter
Age | Commit message | Author
13 days | Related: tdf#161652 sw, RTF paste: avoid duplicated numbering styles | Miklos Vajna
Open the DOCX bugdoc from <https://bugs.documentfoundation.org/show_bug.cgi?id=161652#c6>, select-all, copy, open a second document, paste as RTF, paste as RTF again: the number of char styles in the document changes from 27 to 55, then to 73. That is surprising, since paragraph styles work by first creating them, then a 2nd paste just refers to them. It seems what happens is that paragraph styles are handled in StyleSheetTable::ApplyStyleSheetsImpl(), and we have an explicit "When pasting, don't update existing styles" code there; on the other hand ListDef::GetStyleName() solves the style name collision by appending enough "a" characters at the end of the style name till there is no collision. Fix the inconsistency by adapting the list style behavior to match what paragraph styles do: if we don't open an RTF file but paste into an existing document, then try to use the existing style on collision. Fixing this on the RTF producing side would be less effective, also one could argue that in case a numbering uses e.g. 3 levels, then it still makes sense to emit the definition for all levels to help the editor once numbering levels are increased in the pasted content. (cherry picked from commit 8bc8f073e81d125a8e8f1adce966ddb2c7d6bacb) Conflicts: sw/CppunitTest_sw_writerfilter_dmapper.mk writerfilter/qa/cppunittests/dmapper/NumberingManager.cxx writerfilter/qa/cppunittests/dmapper/data/clipboard-bullets.rtf writerfilter/source/dmapper/NumberingManager.cxx Change-Id: Ia211cb4300c3c41dae8c815ddfaf30cc2f0de8b5 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169703 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
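The two collision strategies described above can be sketched as follows. This is a simplified illustration, not the actual ListDef::GetStyleName() code: the function name, its parameters and the use of a std::set for the existing style names are assumptions made for the example.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper, not the real ListDef::GetStyleName() signature.
// existingNames: style names already present in the target document.
// isNewDoc: true when opening a file, false when pasting into a document.
std::string resolveListStyleName(std::string name,
                                 const std::set<std::string>& existingNames,
                                 bool isNewDoc)
{
    if (!isNewDoc && existingNames.count(name) != 0)
    {
        // Paste: reuse the existing style instead of creating a duplicate.
        return name;
    }
    // File open: keep appending "a" until the name is unique.
    while (existingNames.count(name) != 0)
        name += "a";
    return name;
}
```

With this shape, a repeated paste resolves to the same name each time, so the style count stays constant after the first paste.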
13 days | speed up complex doc with lots of footers/headers | Noel Grandin
setting the header/footer property values is extremely expensive, so check before setting them. Shaves 30% off the load time of a large DOCX Change-Id: I7ac61434b8b4f59e199620dfcc11680164efe203 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169532 Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk> Tested-by: Jenkins (cherry picked from commit ccb557636ea87a40d5fcc370b7434029c0153588) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169548 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
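A minimal sketch of the "check before setting" idea, assuming a property container whose setter is costly. PropertyBag and setIfChanged are hypothetical names for illustration and do not correspond to the UNO interfaces the real change uses; a missing property is treated as false here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a property set with an expensive setter.
struct PropertyBag
{
    std::map<std::string, bool> values;
    int expensiveSetCalls = 0;

    // Cheap read; a property that was never set reads as false.
    bool get(const std::string& name) const
    {
        const auto it = values.find(name);
        return it != values.end() && it->second;
    }

    // Models the costly write path.
    void set(const std::string& name, bool value)
    {
        ++expensiveSetCalls;
        values[name] = value;
    }
};

// Only pay for the setter when the value would actually change.
void setIfChanged(PropertyBag& bag, const std::string& name, bool value)
{
    if (bag.get(name) != value)
        bag.set(name, value);
}
```

The saving depends on reads being much cheaper than writes, which is the situation the commit message describes.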
2024-06-27 | tdf#161779 DOCX import, drawingML: fix handling of translation for lines | Miklos Vajna
Open the bugdoc, it has a line with a non-zero horizontal offset from the anchor paragraph, it shows up as a horizontal line, while it should be vertical. Checking how the ODT import and the DOCX import work for lines, one obvious difference is that the ODT import at SdXMLLineShapeContext::startFastElement() only considers the size / scaling for the individual points, everything else goes to the transform matrix of the containing shape, set in SdXMLShapeContext::SetTransformation(). The drawingML import is way more complex, but it effectively tries to not set any transformation on the shape and just transforms the points of the line instead. Fix the problem by changing Shape::createAndInsert() to also not put any scaling to the transform matrix, to not transform the points of the line and finally to apply the transform matrix to lines as well. Do this only for toplevel Writer lines, that's enough to fix the bugdoc and group shapes / Calc shapes need more investigation, so leave those unchanged for now. Tests which were failing while working on this change:
- CppunitTest_sc_shapetest's testTdf144242_Line_noSwapWH: do this for Writer shapes only, for now
- CppunitTest_sw_ooxmlimport's lineRotation: this is already broken partially, now looks perfect
- CppunitTest_sw_ooxmlimport's testTdf85232 / group shape: this points out that lines in group shapes are some additional complexity, so leave that case unchanged, for now
- CppunitTest_sw_ooxmlexport3's testArrowPosition: manual testing shows this is still OK
- CppunitTest_sw_writerfilter_dmapper's testTdf141540GroupLinePosSize: manual testing shows this is still OK
(cherry picked from commit 6c09c85ec384e88c89bff0817e7fe9889d7ed68e) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport3.cxx Change-Id: I246430148e3b3c927e010f360fa317e8429c82d2 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169615 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2024-06-26 | Related: tdf#161652 sw, RTF paste: only keep used paragraph styles | Miklos Vajna
When pasting from old enough Impress that doesn't have commit afb4ea67463d9f0200dc6216cfd932aec0984c82 (tdf#161652 editeng, RTF copy: only write used paragraph styles, 2024-06-20), it still happened that we got many more styles from an Impress slide's paragraph in Writer than just the style of that paragraph itself. The problem is that if we want to avoid problems with bad user input, that has to be handled on the RTF paste / import side, not on the producing side. Fix the problem by filtering out unused paragraph styles also on the RTF import (paste) side, in the !IsNewDoc() case, which is the clipboard case (not RTF file open). With this, we attempt to filter out not needed paragraph styles both on the import and export side. (cherry picked from commit f58e3e3402c87755a2dd3cb83f29d00c40b94f1a) Change-Id: Ic2c63e5f45245bb4296ec0d1a95558c459667e29 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169511 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
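The import-side filtering could look roughly like the sketch below. It assumes the set of style names referenced by the pasted paragraphs is known before styles are applied; keepUsedStyles and its parameters are hypothetical and do not mirror the actual writerfilter code.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical filter: on paste (not file open), drop paragraph styles
// that no pasted paragraph refers to. On file open, keep everything.
std::vector<std::string> keepUsedStyles(const std::vector<std::string>& defined,
                                        const std::set<std::string>& used,
                                        bool isPaste)
{
    if (!isPaste)
        return defined;

    std::vector<std::string> kept;
    for (const std::string& name : defined)
    {
        if (used.count(name) != 0)
            kept.push_back(name);
    }
    return kept;
}
```

Filtering on the consuming side means the result does not depend on how careful the producer of the RTF was.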
2024-06-24 | tdf#161631 writerfilter: move another member to SubstreamContext | Michael Stahl
The problem is that the bugdoc contains a table in the footer, which causes m_bDummyParaAddedForTableInSection to be set, which erroneously causes the last paragraph in the body to be removed. (regression from commit 86ad08f9d25110e91e92a0badf6de75e785b3644) Change-Id: I148785c54c37dc25f7d239b5898aec9fb5455f40 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169191 Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> Tested-by: Jenkins (cherry picked from commit ef77086255821d61838a7e26fee9baadaca0b9e0) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169209
2024-06-24 | writerfilter: import undocumented STYLEREF field heading switch | Michael Stahl
forum-mso-de-86231.docx contains a funny field that uses undocumented switch: StyleRef \2 \n Word can evaluate it and find the paragraph with style "Heading 2". Translate it to "2" in DomainMapper, which is also evaluated by Word. (regression from commit d4fdafa103bfea94a279d7069ddc50ba92f67d01) Change-Id: I587e6df1ea72642278d93723ed6692ff5011ed57 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/168495 Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> Tested-by: Jenkins (cherry picked from commit aac625cf1cc502de5d55c0b30afb962147ccf3e1) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/168553 Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-06-18 | tdf#161570 DOCX import: fix lost numbering in paragraph style | Miklos Vajna
Regression from commit ca71482237d31703454062b8b2f544a8bacd2831 (tdf#153083 writerfilter: import locale-dependent TOC \t style names, 2, 2023-01-31), open the doc, apply 'Level 2 Heading' on the first para, then switch back to 'Signature' again using e.g. the sidebar, the numbering of the first paragraph is gone. This was initially a wider problem, but since commit ab1697cb4c17fd7a2fbf8d374ac95fc03b4d00be (tdf#160402 filter,writerfilter: import locale-dependent STYLEREF names, 2024-05-06), the problem only affects built-in styles. There were two remaining problems: 1) the separator for the TOC field can contain whitespace, which resulted in a style named ' Signature' and 2) the style was always cloned, even if the name was not localized. Fix the problem by first trimming the style name in DomainMapper_Impl::handleToc() and then only cloning in DomainMapper_Impl::ConvertTOCStyleName() if we see that the style name is localized. A localized style name can be observed when opening e.g. sw/qa/extras/ooxmlexport/data/custom-styles-TOC-semicolon.docx that has Intensives Zitat vs Intense Quote. One remaining question is why the numbering is lost when the cloning happens, that's not addressed here, as the cloning should not happen for this document in the first place. (cherry picked from commit 4e5dd2c0774242e44ac6edf2bd5ada220541b06b) Change-Id: Ibc2ea20cc3c9ec6bec9bdcdabce1469a0457317a Reviewed-on: https://gerrit.libreoffice.org/c/core/+/169073 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
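The trimming step can be illustrated with a small splitter. This is a sketch under assumptions: the helper names are invented, the separator is passed in by the caller, and the example input below is made up rather than taken from the bugdoc.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Strip leading and trailing spaces and tabs.
std::string trim(const std::string& text)
{
    const char* whitespace = " \t";
    const std::size_t first = text.find_first_not_of(whitespace);
    if (first == std::string::npos)
        return std::string();
    const std::size_t last = text.find_last_not_of(whitespace);
    return text.substr(first, last - first + 1);
}

// Split a separator-delimited style list and trim every entry, so that
// "; Signature" yields "Signature" rather than " Signature".
std::vector<std::string> splitStyleList(const std::string& arg, char separator)
{
    std::vector<std::string> parts;
    std::size_t start = 0;
    for (;;)
    {
        const std::size_t pos = arg.find(separator, start);
        const std::size_t len
            = (pos == std::string::npos) ? std::string::npos : pos - start;
        parts.push_back(trim(arg.substr(start, len)));
        if (pos == std::string::npos)
            break;
        start = pos + 1;
    }
    return parts;
}
```

Without the trim, a lookup for the style by name would miss the existing style and end up creating a new one with a leading space.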
2024-06-12 | tdf#161443 DOCX import, table style: handle para border in table cell paras | Miklos Vajna
Open the bugdoc, the in-table paragraphs have some top and bottom paragraph borders in Word, not in Writer -- because the cell and paragraph UNO object both have a property named TopBorder as mentioned in commit 39c54c0ef837e0e23a676a4d1fa5da667e18939c (tdf#161443 DOCX import, table style: fix para border leaking into cell border, 2024-06-07). The previous fix avoided the problem that the unwanted border affects the cell, but re-routing the property to affect the in-table paragraph was not done. Fix the problem by adding 4 new meta-properties with a "Para" prefix for all 4 border locations (top/left/bottom/right), this way the paragraph borders defined in a table style can affect the in-table paragraphs, but not the table cells. Apart from the border itself, this also affected the border spacing, which means that the position of all text inside and below the table is now also correct. Unfortunately this also means we need to move away from the constexpr frozen container that is only suitable for a limited number of items: sw/source/writerfilter/dmapper/PropertyIds.cxx:394:6: error: ‘constexpr’ evaluation operation count exceeds limit of 33554432 (use ‘-fconstexpr-ops-limit=’ to increase the limit) Returning to std::unordered_map is good enough for our needs. (cherry picked from commit 013300c751d7a9ede12c1bf1c784254d1c6c5433) Change-Id: I478f274800a1d0b200f10226438ab4cfd4957b74 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/168696 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com> Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
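A rough sketch of an id-to-name lookup backed by std::unordered_map instead of a compile-time container. The enum values and the idea that both ids resolve to the same UNO name are assumptions for illustration; the real PropertyIds table is far larger and may be organized differently.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical ids; separate ids let the importer route a border either
// to the cell or to the paragraphs inside the cell.
enum PropertyId
{
    PROP_TOP_BORDER,      // applied to the table cell
    PROP_PARA_TOP_BORDER  // meta-property, applied to in-cell paragraphs
};

// Lookup table built once at first use, at runtime, so its size is not
// bounded by the compiler's constexpr evaluation limit.
const std::string& getPropertyName(PropertyId id)
{
    static const std::unordered_map<int, std::string> names{
        { PROP_TOP_BORDER, "TopBorder" },
        { PROP_PARA_TOP_BORDER, "TopBorder" },
    };
    return names.at(id);
}
```

The function-local static keeps initialization lazy and thread-safe under C++11 rules, at the cost of a hash lookup per call.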
2024-06-11 | tdf#161443 DOCX import, table style: fix para border leaking into cell border | Miklos Vajna
Open the document with a single table, notice that start of the text in A1 cell is missing. Seems what happens is that the cell has some positive border distance, then the para has the same negative left margin, so in total there is 0 left margin for the text, which makes this readable in Word. On our side, we map the paragraph border from the table style to the LeftBorderDistance property, then throw this on the cell object, which gives us 0 border distance, so the negative para left margin results in an unwanted shift of text towards the left: the start of the text is hidden by clipping to make sure the painted text is inside the cell frame. (Both paragraphs and cells have a LeftBorderDistance property, by accident.) Given that a visible paragraph border from table style is not actually imported, first just do the minimal fix and make sure we don't import paragraph borders from table style at all: this solves the problem of unwanted 0 cell border distances and the full text is now readable. In case the paragraph border in the table style would be actually visible, that would be useful to route to the paragraphs in the cell, that's not yet done here. (cherry picked from commit 39c54c0ef837e0e23a676a4d1fa5da667e18939c) Change-Id: I79907a2487c48659effcc55253b9d9881550284d Reviewed-on: https://gerrit.libreoffice.org/c/core/+/168660 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com> Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
2024-06-03 | tdf#161318 sw clearing break: fix this at section end | Miklos Vajna
Regression from commit 19bca24486315cc35f873486e6a2dd18394d0614 (tdf#126287: docx import: use defered linebreak, 2022-02-07), the bugdoc has a single paragraph in the first section, containing a clearing break, which is lost. This leads to overlapping text as the text is shifted up. Seems the intention was to avoid a line break at the very end of the document, as that can lead to an empty page with "next page" section breaks, with non-clearing line breaks. Fix the problem by only doing this for non-clearing line breaks: that keeps the old use-case working, but the new, clearing line break then shifts down the text, so no text overlap happens. Switching from appendTextPortion() to HandleLineBreak() helps because HandleLineBreak() does exactly appendTextPortion("\n") in the non-clearing case, but knows about the stream stack's line break clear status. (cherry picked from commit e00479404af5058b982c447e485af995d552e372) Conflicts: writerfilter/qa/cppunittests/dmapper/data/clearing-break-sect-end.docx writerfilter/source/dmapper/DomainMapper.cxx Change-Id: I38868eeeac55e20e86b668e9baf7e0d6a4976608 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/168361 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com> Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
2024-05-29 | tdf#160984 sw continuous endnotes: DOCX: import <w:endnotePr> pos == sectEnd | Miklos Vajna
Word can have per-section endnotes, but whether endnotes are collected at the end of the section or at the document end is a per-document setting. The DOC import already handles this in wwSectionManager::InsertSection() when it constructs an SwFormatEndAtTextEnd with FTNEND_ATTXTEND. Fix the problem by doing the same in writerfilter: in case settings.xml wants at-section-end endnotes, set EndnoteIsCollectAtTextEnd to true when applying section properties. The export side still needs doing. (cherry picked from commit 2d2dd56e0b2dc708f1f758d7fc9a1263ff09b83c) Conflicts: writerfilter/source/dmapper/PropertyMap.cxx Change-Id: Ibad9c2d62a2945ee42877c849482feee60a50178 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/168179 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-05-23 | tdf#160984 sw continuous endnotes: enable DOCX import | Miklos Vajna
This was working for DOC already. For DOCX, this was already enabled once with commit f9982c24066d6dd2f938cc20176af0f196bc018f (tdf#58521 DOCX import: enable ContinuousEndnotes compat flag, 2021-07-13), but then it was reverted later with commit eeda1b35a6e87d5349545464da33d997c52f15e3 (Revert "tdf#58521 DOCX import: enable ContinuousEndnotes compat flag", 2021-08-10), because of tdf#143456. Enable it again, now that the section-based layout seems good enough to handle larger number of endnotes, e.g. the 48 endnotes from tdf#143456. (cherry picked from commit 1ae5ea3f78cca11ba18f2dd1a06f875263336a3b) Change-Id: Id221f31f9208e84db2c358546d4d6ceea991b6b3 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167924 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com> Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
2024-05-21 | writerfilter: avoid infinite loop when resolving embeddings on docx | Jaume Pujantell
If a docx file contains a loop on the .rels files for headers and/or footers the code would enter an infinite recursion while looking for embeddings. Change-Id: I2122fd6b1677812f561e96a9511a61b0e938e94a Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167784 Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Tested-by: Jenkins Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167885 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
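One common way to guard such a traversal is a visited set, sketched below. The graph representation and the function name are assumptions for the example; the commit message does not say which mechanism the actual fix uses.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each part name maps to the parts its .rels file
// points at.
using RelGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first walk that visits every reachable part exactly once, so a
// cycle between header and footer relations terminates.
void collectParts(const RelGraph& rels, const std::string& part,
                  std::set<std::string>& visited)
{
    if (!visited.insert(part).second)
        return; // already seen: stop instead of recursing forever

    const auto it = rels.find(part);
    if (it == rels.end())
        return;

    for (const std::string& target : it->second)
        collectParts(rels, target, visited);
}
```

The recursion depth is bounded by the number of distinct parts, since each part is entered at most once.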
2024-05-21 | writerfilter: fix parsing of invalid STYLEREF field | Michael Stahl
forum-mso-en-3309.docx contains a funny field that doesn't follow the grammar in the OOXML spec: STYLEREF \t "Heading 1" \* MERGEFORMAT Word can evaluate it and find the paragraph, so make the parser a bit more flexible, by adding known switches that don't have arguments, so that any argument following these becomes a field argument, for now only for STYLEREF. (regression from commit d4fdafa103bfea94a279d7069ddc50ba92f67d01) Change-Id: Ic42cd2be58fd65a817946e21a9661d357b02a99a Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167697 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 5ae1379fcdd00228e683ae90991e275f570cd92d) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167733 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com> (cherry picked from commit d632d86579467941ce8b3dda1dbd46c83a92877a) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167882 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
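The parser change can be pictured with a toy field parser that takes a set of switches known to have no argument. Everything here is a simplified assumption: the tokenizer, the ParsedField struct and the contents of the no-argument set are invented for illustration and are not the writerfilter field parser.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ParsedField
{
    std::string command;
    std::vector<std::string> arguments;
    std::map<std::string, std::string> switches; // switch -> its argument
};

// Split on spaces, treating "quoted text" as a single token.
std::vector<std::string> tokenize(const std::string& text)
{
    std::vector<std::string> tokens;
    std::string current;
    bool inQuotes = false;
    bool hasToken = false;
    for (char c : text)
    {
        if (c == '"')
        {
            inQuotes = !inQuotes;
            hasToken = true;
        }
        else if (c == ' ' && !inQuotes)
        {
            if (hasToken)
                tokens.push_back(current);
            current.clear();
            hasToken = false;
        }
        else
        {
            current += c;
            hasToken = true;
        }
    }
    if (hasToken)
        tokens.push_back(current);
    return tokens;
}

// A switch listed in noArgSwitches never consumes the following token,
// so that token becomes a field argument instead.
ParsedField parseField(const std::string& text,
                       const std::set<std::string>& noArgSwitches)
{
    ParsedField field;
    const std::vector<std::string> tokens = tokenize(text);
    for (std::size_t i = 0; i < tokens.size(); ++i)
    {
        const std::string& token = tokens[i];
        if (i == 0)
            field.command = token;
        else if (!token.empty() && token[0] == '\\')
        {
            if (noArgSwitches.count(token) != 0 || i + 1 >= tokens.size())
                field.switches[token] = "";
            else
                field.switches[token] = tokens[++i];
        }
        else
            field.arguments.push_back(token);
    }
    return field;
}
```

With `\t` in the no-argument set, "Heading 1" in the field from the commit message is parsed as the field argument; with an empty set it would be swallowed as the argument of `\t`.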
2024-05-14 | tdf#160402 writerfilter: extend StyleMap with all Word styles | Michael Stahl
There doesn't appear to be an accurate and complete documentation of all the Word built-in style names, but fortunately Word writes them all into styles.xml in a w:latentStyles element anyway. It turned out that a lot of the Writer built-in style names here were obsoleted by renaming and did not match any more. (cherry picked from commit 331da18872d8dd526b0e91854450223ee8c0bf0c) Change-Id: Ic69785a34524f667b83a06a267715b2c8b0165d0 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167615 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-05-13 | tdf#160402 writerfilter,sw: STYLEREF field can refer to character style | Michael Stahl
Adapt SwGetRefFieldType::FindAnchor() to search for SwTextCharFormat, and ApplyClonedTOCStylesToXText() to replace "CharStyleName". Works for the "Intensive Hervorhebung" field in bugdoc. (cherry picked from commit 74f859b5525da0760a70ab660bd912dabfd608ca) Change-Id: Iee126eeb4cc2ff1c570941e3beefd93527c56fee Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167564 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-05-10 | tdf#160402 filter,writerfilter: import locale-dependent STYLEREF names | Michael Stahl
* Handle STYLEREF style the same way as TOC style on import.
* Word 2013 does not uppercase the first letter ("Überschrift 1" -> "berschrift1") and there doesn't appear to be any justification why the code does that.
* The style that's applied is actually the display style name so convert from source's m_sConvertedStyleName.
* Change the logic in DomainMapper_Impl::ConvertTOCStyleName() to explicitly check and clone only known built-in Word styles, which are the ones that are translated.
* This requires some refactoring, and to add the built-in styles in the bugdoc to the "StyleNameMap", which is probably still incomplete...
* Exporting to DOCX appears to work without changes.
* Somehow this causes the testFDO77715 to have an outlinelevel of 1 on the TOC paragraphs now, but that turns out to be a bugfix.
(cherry picked from commit e33f2b3d550d5d26a5eca135d88e0c16b2db11d8) Change-Id: I73bc1d1819e5cecbba2fef9cd6d290682a02a638 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167429 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-05-03 | tdf#160833 DOCX import: use the DoNotMirrorRtlDrawObjs compat flag | Miklos Vajna
The bugdoc has a shape which should be on the right page margin, but it was on the left page margin. Use the new compat flag to have a layout that matches Word. This way we don't need to unmap the tweaked position at export time (a limitation that the DOC filter has). (cherry picked from commit 01dcc9a652ecfc65fb674b492afa6f58b0a846db) Conflicts: sw/Module_sw.mk Change-Id: I38dfae370f275d9f0897198e7b0569f2d91dd352 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/167033 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-04-15 | tdf#88214 sw: text formatting: adapt empty line at end of para to Word | Michael Stahl
For an empty line at the end of an empty paragraph, Writer already uses any existing text attribute in the paragraph, see for example testEmptyTrailingSpans. For an empty line at the end of a non-empty paragraph, Writer text formatting uses only paragraph attributes, ignoring any text attributes, whereas the UI will display the attributes from the text attributes (such as font height) if you move the cursor there. Word uses text attributes also in this case, so adapt the inconsistent Writer behaviour: text formatting now uses text attributes too. Apparently this can be achieved by calling SeekAndChgBefore() instead of SeekAndChg(). Add another compat flag "ApplyTextAttrToEmptyLineAtEndOfParagraph" to preserve the formatting of existing ODF documents. Adapt test document fdo74110.docx, it has a line break with "Angsana New" font. Change-Id: I0863d3077e419404194b47110e4ad2bdda3d11c4 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165887 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 2b47fae7e3e23ee7c733708500cb0482ad7f8af1) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165906 Reviewed-by: Thorsten Behrens <thorsten.behrens@allotropia.de>
2024-04-15 | tdf#158556 speedup docx load | Noel Grandin
If we want to know if an XText is a header/footer object, no need to loop over the draw objects, we can just query its service name. Reduces load time from 167s to 97s Change-Id: I10158c11dd24c4945b3ea6cfed4916717bd4f2f8 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165269 Tested-by: Noel Grandin <noel.grandin@collabora.co.uk> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk> (cherry picked from commit ab29c857c669bcca3d8eea8a5a9e6ad5eae622d7) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165618 Tested-by: Jenkins Reviewed-by: Xisco Fauli <xiscofauli@libreoffice.org>
2024-04-15 | cid#1546354 COPY_INSTEAD_OF_MOVE | Caolán McNamara
and cid#1546319 COPY_INSTEAD_OF_MOVE cid#1546286 COPY_INSTEAD_OF_MOVE cid#1546283 COPY_INSTEAD_OF_MOVE cid#1546191 COPY_INSTEAD_OF_MOVE cid#1545953 COPY_INSTEAD_OF_MOVE cid#1545874 COPY_INSTEAD_OF_MOVE cid#1545857 COPY_INSTEAD_OF_MOVE cid#1545781 COPY_INSTEAD_OF_MOVE cid#1545765 COPY_INSTEAD_OF_MOVE cid#1545546 COPY_INSTEAD_OF_MOVE cid#1545338 COPY_INSTEAD_OF_MOVE cid#1545190 COPY_INSTEAD_OF_MOVE cid#1545272 COPY_INSTEAD_OF_MOVE cid#1545242 COPY_INSTEAD_OF_MOVE cid#1545229 COPY_INSTEAD_OF_MOVE Change-Id: I88813d9dbd87ce10375db8198028f8b70e23f0fa Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162027 Tested-by: Caolán McNamara <caolan.mcnamara@collabora.com> Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-03-27 | Revert "tdf#159730 add compatibility option in RTF import" | Oliver Specht
This reverts commit 3b04e74503ec6d07dc4befdb756e6abdc3de4e58. Reason for revert: The compatibility option is the wrong approach. This results in wrong line calculation as seen in tdf#159730#c6. The problem that really needs to be fixed is the 9pt attribute of the hidden line breaks in the first paragraph that are used to calculate the height of the first paragraph. Only the 1pt font attribute of the remaining visible space should define the line height. Change-Id: I6e0a1a499adaf2df9f68afbcfd6afcd6677e8f76 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165006 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 44e4ada23dfc8655ec7ddccfd027f02d22684d60) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165118 Reviewed-by: Xisco Fauli <xiscofauli@libreoffice.org>
2024-03-26 | Related: tdf#160139 RTF paste: don't turn off headers/footers | Miklos Vajna
Regression from commit d918beda2ab42668014b0dd42996b6ccc97e8c3a (tdf#158814 DOCX import: fix unwanted header with type="first" & no titlePg, 2024-02-05), pasting shape text into the body text of Writer turned off the header, which was not intentional. The original use-case was DOCX/RTF import, and the paste case was just not considered. Fix the problem by leaving the paste alone: we already omit a number of actions in this case (e.g. not overwrite styles), don't turn off headers, either. Note that the original problem is wider: we would probably need to track what page styles are created and only touch those, or something similar. Change-Id: If08fa7956e98766d5807332c5c0baa25b46afe38 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165191 Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Tested-by: Jenkins (cherry picked from commit c900850742efd4e1fb7c79c13c1b9a17fcd4981d) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165185 Reviewed-by: Xisco Fauli <xiscofauli@libreoffice.org> (cherry picked from commit 2e31c7cd2976be3d43b0845e50d0bb4ca7e50179) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/165309 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
2024-03-17 | tdf#142133: partially revert 576611895e5 | Xisco Fauli
if 'Internet Link' character style doesn't exist then apply the hyperlink style. This also reverts 023285158bde72dcd73b965ce205cf8550e7a5e2 "tdf#128504 save DOCX as ODT: don't color not highlighted hyperlinks" which is no longer necessary. Change-Id: Id100af5fddb10745af9d56c0ba75cb2366ecbe55 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164576 Tested-by: Jenkins Reviewed-by: Xisco Fauli <xiscofauli@libreoffice.org> (cherry picked from commit 03ca7031f3bf4c2a3e841b18c8f9e00004046098) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164509 Reviewed-by: Stéphane Guillou <stephane.guillou@libreoffice.org>
2024-03-11 | tdf#160077 writerfilter: shape vertRelation is FRAME for layoutinCell | Justin Luth
When layoutInCell is active, then the offset must be applied against the paragraph instead of the page or page margins. There were only two unit tests that matched this, and both were off-sheet positioned. -tdf151704_thinColumnHeight.docx -tdf92157.docx make CppunitTest_sw_ooxmlexport21 \ CPPUNIT_TEST_NAME=testTdf160077_layoutInCell Change-Id: I28241136c0c0be12d3f2dd876550ecdf91b0009c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164514 Tested-by: Jenkins Reviewed-by: Justin Luth <jluth@mail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164585 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
2024-03-11 | tdf#160049 dml shape import: use margins with left/right HoriOrientRel | Justin Luth
make CppunitTest_sw_ooxmlexport21 \ CPPUNIT_TEST_NAME=testTdf160049_anchorMargin Change-Id: I3e2df2037cabfedbb6df6b8c8257e90baeaab96e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164445 Tested-by: Jenkins Reviewed-by: Justin Luth <jluth@mail.com> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164584 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-03-11 | tdf#160049 dml import: use margins with left/right HoriOrientRelation | Justin Luth
I'm really surprised this wasn't found much earlier. Even DOC format isn't handling this. compat15 gets rid of this inconsistency. Surprisingly, there were no interesting unit tests matching this. make CppunitTest_sw_ooxmlexport21 \ CPPUNIT_TEST_NAME=testTdf160049_anchorMarginVML make CppunitTest_sw_ooxmlexport21 \ CPPUNIT_TEST_NAME=testTdf160049_anchorMargin15 Change-Id: Ic5c316569ad3640ba0e786d39a6e5c006c74d665 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164443 Tested-by: Jenkins Reviewed-by: Justin Luth <jluth@mail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164582 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
2024-03-07 | tdf#153196 writerfilter: fix page style for even/odd section break | Michael Stahl
This is a bit of a special case, where the first section starts with an evenPage break (\sbkeven), which causes a Left-only page style to be created. In completeCopyHeaderFooter(), the HeaderTextFirst and FooterTextFirst are copied from the source style to the Left-only page style, but then they also need to be deleted from the source style, because the Left-only page style is the one that is used for the first page of the section, and the source style is used for the subsequent pages. Additionally, when there is *only* a "first" header/footer, and no previous section has one to inherit, Word will not display a header/footer at all on subsequent pages; a PageStyle will always have a header/footer if it has a HeaderTextFirst/FooterTextFirst. In this case, delete the header/footer from the source style. Unfortunately exporting this doesn't work ideally, a spurious evenPage footer will be created, both due to the FooterShare being automatically reset for no obvious reason in ItemSetToPageDesc(), and setProperty("FooterIsShared", true) "stashing" the left footer since commit b802ab694a8a7357d4657f3e11b571144fa7c7bf. (presumably regression from commit b32881b6723072c8d1a652ea147d12e75766d504) Change-Id: Ie4f9c49605df690e9705e14777c0e4bcb0dfad8e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163668 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 340f8ea4ae7f11b4d3a95499188a29fe801867cf) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163673 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-03-07 | tdf#158597 writerfilter,sw: fix toggle properties in ListAutoFormat | Michael Stahl
... for DOCX import. These can be set both via paragraph style and via character style in the w:pPr/w:rPr, so use the applyToggleAttributes(). Adding a test for this requires adding the "CharStyleName" property to GetAutoCharStylePropertyMap(). Change-Id: I9701d5ac82ec3e7757650c08861791dc398a1a77 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163386 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit ff9be3fd30ead41359734f9281b034a988d71196) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163449 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-03-07 | tdf#158360 - sw, ooxml import - fix insertTextPortion crash | Balazs Varga
Before this patch da8dead8e9282010893cbd12519e107baf03cd1a SvxUnoTextBase::insertTextPortion returned an empty XTextRange in case of texts in comment. (SwTextAPIObject) Let's use finishParagraphInsert which also gives back an empty XTextRange. regression from commit: da8dead8e9282010893cbd12519e107baf03cd1a (tdf#73537 - sc: show author and creation date in calc comments) Change-Id: I0b33e5b3cae32718a62a7be04b9a88562f85652c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163670 Tested-by: Jenkins Reviewed-by: Balazs Varga <balazs.varga.extern@allotropia.de> (cherry picked from commit 7cf3d5e3073dc5cffc64b6d9b32513e90087a3d4) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163626 Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-03-04 | Related: tdf#158986 sw floattable, RTF import: use more setNeedPar() | Miklos Vajna
See <https://gerrit.libreoffice.org/c/core/+/163844/1#message-ea0bfde78fa24ad83e5c153ecaddbf897a89f547>, this keeps the bug fixed but is a better version, as pointed out by Michael S: > there was a bug where dispatching PAR here caused a deferred page > break to be lost, which was fixed by calling setNeedPar(true) instead. (cherry picked from commit c98ff922831f56253af2a050b8e07cfc89b7a387) Change-Id: Ibe6e4c14286d40d3066ce9cb7fac9f6847fb81dd Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164095 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit ab51c235672dd6da2feafbe3f26625ee14889bf9) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164362 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-03-04Related: tdf#150408 RTF filter: handle legal numberingMiklos Vajna
The bugdoc's 2nd para started with 'Sect I.01', while Word rendered this as 'Sect 1.01'. The reason for this difference is that there is an "is legal" boolean property on the numbering that we ignored from RTF during import/export. Fix the problem by extending RTFDocumentImpl::dispatchTableSprmValue() for the numbering table import + RtfAttributeOutput::NumberingLevel() for the export. The import default for this value was also wrong, given that the default is to enable it when the control word is present. (cherry picked from commit e8487bedb20a429565b4a0e4bd2d6806cc603b7f) Conflicts: sw/qa/extras/rtfexport/rtfexport3.cxx Change-Id: I4dcd23768000ba29d4df314b475b412bb371545e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164327 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
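The difference between 'Sect I.01' and 'Sect 1.01' can be illustrated with a minimal standalone sketch. It is not the LibreOffice implementation: `NumFormat`, `formatNumber()` and `formatLabel()` are hypothetical names, and the rule that zero-padded levels keep their padding under "is legal" is an assumption chosen to match the rendering quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical number formats for one list level (not LibreOffice types).
enum class NumFormat { Arabic, ArabicZeroPadded2, UpperRoman };

// Convert a positive number to upper-case Roman numerals.
std::string toUpperRoman(int n)
{
    static const int values[] = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    static const char* const symbols[]
        = {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
    std::string out;
    for (std::size_t i = 0; i < 13; ++i)
    {
        while (n >= values[i])
        {
            out += symbols[i];
            n -= values[i];
        }
    }
    return out;
}

// Render a single level's number in the requested format.
std::string formatNumber(int n, NumFormat fmt)
{
    switch (fmt)
    {
        case NumFormat::UpperRoman:
            return toUpperRoman(n);
        case NumFormat::ArabicZeroPadded2:
            return (n < 10 ? "0" : "") + std::to_string(n);
        case NumFormat::Arabic:
            break;
    }
    return std::to_string(n);
}

// Build the label for the deepest level in `numbers`, joining levels with '.'.
// Assumption of this sketch: with isLegal set, Roman levels are shown as
// plain Arabic numbers, while zero-padded Arabic levels keep their padding.
std::string formatLabel(const std::vector<int>& numbers,
                        const std::vector<NumFormat>& formats, bool isLegal)
{
    assert(numbers.size() == formats.size());
    std::string label;
    for (std::size_t i = 0; i < numbers.size(); ++i)
    {
        NumFormat fmt = formats[i];
        if (isLegal && fmt == NumFormat::UpperRoman)
            fmt = NumFormat::Arabic;
        if (i > 0)
            label += '.';
        label += formatNumber(numbers[i], fmt);
    }
    return label;
}
```

Calling `formatLabel({1, 1}, {NumFormat::UpperRoman, NumFormat::ArabicZeroPadded2}, false)` gives the label that was rendered before the fix, while passing `true` as the last argument gives the label Word shows.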
2024-03-01tdf#158986 sw floattable: fix RTF import of table followed by \sectMiklos Vajna
The bugdoc had a floating table, immediately followed by a section break. The floating table went to the second page, should be on the first page. The trouble is that RTF's section break is just a special symbol, so we can have a section break right after a floating table. This is in contrast with DOCX where a non-last section break is a paragraph property, so it's guaranteed to have at least a paragraph start after a floating table and before a section break, which can nicely serve as an anchor point for the floating table. Fix the problem similar to what the OOXML tokenizer did in a similar case in commit 01ad8ec4bb5425446e95dbada81de435646824b4 (sw floattable: fix lost tables around a floating table from DOCX, 2023-06-05), by injecting a paragraph before the section break. Handling this at a tokenizer level seems to be the right place, since the DOCX version of the same document was already imported OK. Change-Id: Ic945c472c08ba872a5c46e2b8f75e919678aa0a0 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163929 Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Tested-by: Jenkins (cherry picked from commit b7c4c4d45f44a26283678f3dc32982b3a728c614) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163844 Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 0ed1604de9448e0898a6894d9a8262272aea4766) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164212 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
2024-02-26related tdf#159824 RTF import/export gradient angleJustin Luth
The fillAngle is important for obvious visual reasons, but also significantly because a negative angle means that the start/end colors should be swapped (which is the normal case since LO's 0 degree angle == -180 VML/RTF angle). There were no existing unit tests with a "fillAngle" specified, or with a non-180 angle (0 VML/RTF angle) in LO. make CppunitTest_sw_rtfexport8 \ CPPUNIT_TEST_NAME=testTdf159824_gradientAngle1 make CppunitTest_sw_rtfexport8 \ CPPUNIT_TEST_NAME=testTdf159824_gradientAngle2 make CppunitTest_sw_rtfexport8 \ CPPUNIT_TEST_NAME=testTdf159824_axialGradient Change-Id: I4bb2c47bd2a79833d11bedac72ba2152b65b7c73 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163714 Tested-by: Jenkins Reviewed-by: Justin Luth <jluth@mail.com> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163870 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-02-26tdf#126533 docx import: page background vml fillJustin Luth
This patch imports bitmaps/tiled textures (primarily), but also somewhat for gradients (because of a gradient2 -> gradient mismatch somewhere) and somewhat for patterns (because patterns are not well imported in general). Note that the imported fill likely will NOT match MSO, because their background CHANGES BASED ON THE ZOOM LEVEL. For example, my primary testing file (A6 landscape) has a logo which is only 25% visible in Word 2003 at 100%, but shows 90% of the logo at 200%, and many tiles of logos when exported as PDF. The same is true for gradients etc. Changing background on zoom is an absolutely bizarre implementation, and naturally LO could only accidentally look identical (and should never try to do so). make CppunitTest_sw_ooxmlexport21 \ CPPUNIT_TEST_NAME=testTdf126533_noPageBitmap make CppunitTest_sw_ooxmlexport21 \ CPPUNIT_TEST_NAME=testTdf126533_pageGradient This is slightly ugly, but I don't know how to make a COPY of the XPropertySet UNO junk. All I have is references, and dispose deletes everything, even the references. I took some inspiration from RTF which just disposes the shape after grabbing the background color. Thus, just change the page style known to exist and be used, and then simply remove the fill if it isn't needed in the end. Any new page styles can just copy the default page style fill. Change-Id: Id3ea002c685642ff4c289982d0108247a6e9bb8d Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162958 Tested-by: Jenkins Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163861 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Justin Luth <jluth@mail.com>
2024-02-21tdf#159730 add compatibility option in RTF importOliver Specht
Set IgnoreTabsAndBlanksForLineCalculation in RTF import to improve formatting. Change-Id: If0129f748c48400f1dd882672b5779f62e685ecd Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163429 Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> Tested-by: Jenkins (cherry picked from commit 3b04e74503ec6d07dc4befdb756e6abdc3de4e58) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163614
2024-02-21tdf#159815 DOCX import: fix redlined to-char image followed by inline SDTMiklos Vajna
The 2nd para of the bugdoc has a leading word, then a redlined anchored shape, finally a redlined checkbox content control. The Writer import result's content control starts at para start, while it should only start after the anchored shape. What happens is that writerfilter::dmapper::DomainMapper_Impl::PushSdt() gets called to remember the SDT start, then DomainMapper_Impl::applyToggleAttributes() deletes some content since commit 8726cf692299ea262a7455adcf6ec25451c7869d (tdf#142700 DOCX: fix lost track changes of images, 2021-07-08), which invalidates the SDT start position, finally PopSdt gets called, but that fails because our start position is no longer valid. This used to abort the import process. Since commit 1b0f67018fa1d514ebca59e081efdd24c1d7811b (docx import: correct redline content-controls, 2024-02-20), the import finishes, but the start of the SDT is incorrect: it's at the para start, while it should start in the middle of the paragraph. The direct reason for the invalidation is a call to SwXTextRange::Impl::Notify(), which is from a content deletion from applyToggleAttributes(). Fix the problem by not deleting content when we're past PushSdt() and we haven't called PopSdt(): extract the redlined anchored shape code into a new DomainMapper_Impl::MergeAtContentImageRedlineWithNext() and call that also in DomainMapper_Impl::PushSdt() early, so it won't be called when we're in the middle of an SDT. This way createTextCursorByRange() should not fail, so warn in case it still fails in DomainMapper_Impl::PopSdt(). (cherry picked from commit 833abb4a197561c34ec59cceb9d7d8a46f6b17ce) Change-Id: Ic4198804a92088ec268203d44c0da2d6997754b7 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163694 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-02-21docx import: correct redline content-controlsAshod Nakashian
When inserting and deleting content-controls with change-tracking enabled, we hit a few corner-cases that we need to handle more smartly. First, we shouldn't redline the controls themselves, just the placeholder text. Second, we have to take special care to create valid XML structure with the redline tags. Includes unit-test that reproduces the issues and verifies that both saving and loading work as expected. (cherry picked from commit 1b0f67018fa1d514ebca59e081efdd24c1d7811b) Change-Id: I6af4d0d2c3f0661e7990d5414cc93effc96f0469 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163681 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-02-16tdf#155663 writerfilter: RTF import: don't lose \piccrop*Michael Stahl
For DOCX the a:srcRect is imported in oox module in BlipFillContext and set on the XShape; obviously that doesn't work for RTF. The crop was already taken into account in RTFDocumentImpl::resolvePict(), but only to set the size of the picture; to actually set a crop effect, set shape's "GraphicsCrop" property in dmapper::GraphicImport::lcl_attribute(). Change-Id: Ib12853724744542a09b0073fefc42ad32bb2ff19 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163310 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 47f50af3f057bac1739b7d17d781c0b1d05faa95) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163330 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-02-16tdf#158983 writerfilter: RTF import: fix page breaks and shape anchorsMichael Stahl
Somehow, not sure why, the added import of \wrapdefault in commit 86c0f58b6f9f392865196606173d1b98a6897f32 caused the page break in fdo55504-1.rtf to get lost. The first problem is that there is a \sbknone before the first \sect - this should not have any effect before \sect because \sbk* affect the *previous* section break, but it's not an option to simply ignore it (even if it works for this bugdoc) because it may be that there is no \sectd after \sect and then it will have an effect for the later section. The problem was in handling \page: here the premature \sbknone caused a sectBreak() which ate the page break; ignore it here by checking m_bHadSect. The second problem then was that now all but the first shape were anchored on page 2. This was because RTFDocumentImpl::beforePopState() for \shape of the 1st shape called parBreak() and that set the bIsFirstParaInSection flag. This flag prevented DomainMapper::lcl_utext() in the "if (m_pImpl->isBreakDeferred(PAGE_BREAK)) if (GetSplitPgBreakAndParaMark())" branch from inserting another paragraph break that is necessary to preserve the already inserted shapes anchored to the 2nd paragraph on page 1. (This is how it works for the equivalent DOCX document, with settings.xml edited to add w:splitPgBreakAndParaMark and remove "compatibilityMode" etc. because Word 2013 doesn't set these correctly.) The consequence is that when the second SwTextNode is converted to a text frame, all the shape anchors move to the next paragraph, the one with the RES_BREAK on it. Fix this by limiting the parBreak() handling in RTFDocumentImpl::beforePopState() to when the shape is a SwGrfNode, which is the scenario in the commit 0d9132c5046e15540abc20e45d64080708626441 "fdo#47036 fix RTF import of shapes inside text frames at the start of the doc" - the testFdo47036 fails if the block is removed completely. 
This caused 2 test failures, but both cases look the same as in Word 2013 now: Test name: (anonymous namespace)::testTdf158826_extraCR::Load_Verify_Reload_Verify An uncaught exception of type com.sun.star.uno.RuntimeException - unsatisfied query for interface of type com.sun.star.text.XTextTable! rtfexport2.cxx:537:Assertion Test name: (anonymous namespace)::testFdo47495::Load_Verify_Reload_Verify equality assertion failed - Expected: 2 - Actual : 1 Change-Id: I43fa9431721650a6d748d1f4bda9aeaa7a9c6b45 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163200 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 582ef812702413dbe7fb0f132bca3e3e4c2e1d40) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163181 Reviewed-by: Xisco Fauli <xiscofauli@libreoffice.org>
2024-02-14tdf#159453 sw floattable: fix unexpected overlap of in-header fly and body textMiklos Vajna
Regression from commit e2076cf7a92694bc94bdc9f3173c2bddbe881a89 (tdf#155682 sw floattable: fix DOCX with big pictures causes endless loop, 2023-10-25), the bugdoc's body text was wrapping around the floating table from the header, while the expectation was that the top of the body frame is below the bottom of the header frame. It seems IsFollowingTextFlow is only needed when the relation of the floating table is not "page", and this bugdoc has an explicit vertical relation of page. Solve the problem by limiting the IsFollowingTextFlow=true request for the floating table to the VertOrientRelation=page case, which fixes the bugdoc and keeps the old use-case working. The doc model for the new bugdoc now matches the WW8 import result. (cherry picked from commit f74a6ef94ac957e4c146fc9923d30ce6bd31b5ce) Change-Id: Ia3da65cd52d70b357e448a26a50ffb92a39795e6 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163352 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2024-02-09tdf#158586 writerfilter: RTF import: fix assert on ooo113308-1.rtfMichael Stahl
warn:legacy.osl:::writerfilter/source/dmapper/DomainMapper_Impl.cxx:1278: section stack already empty DomainMapper_Impl.cxx:9817: void writerfilter::dmapper::DomainMapper_Impl::substream(): Assertion `m_aContextStack.size() == contextSize' failed. Before substream(), there is one CONTEXT_SECTION, after there is an additional CONTEXT_PARAGRAPH. The first OSL_ENSURE is because RTFDocumentImpl::tableBreak() calls endParagraphGroup() but in the substream, startParagraphGroup() hadn't been called; fixing this also makes the assert failure go away. This worked previously because sectBreak() called endParagraphGroup() after reading the header substreams, but it seems dubious that a paragraph group started in the body should be used in the substream. (regression from commit 57abad5cf990111fd7de011809d4421dc0550193) Change-Id: I98864bca03b59099c17080c0a7582de2b77d41e1 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163096 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 6446d3e12440be39e6b55f8749038061a1b240da) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163137 Reviewed-by: Christian Lohmaier <lohmaier+LibreOffice@googlemail.com>
2024-02-09tdf#159478 read field comment in default encodingOliver Specht
If a symbol font was applied inside a field, the command string was wrongly converted as symbol text. This is fixed by using a default RTL_TEXTENCODING_MS_1252 encoding. Change-Id: I11326ef3c79d6d74c720a2b4ac4987ee6716d912 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162844 Tested-by: Jenkins Tested-by: Gabor Kelemen <gabor.kelemen.extern@allotropia.de> Reviewed-by: Thorsten Behrens <thorsten.behrens@allotropia.de> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163120
2024-02-09tdf#159254 import paper bin/paper source from rtf/docx filesOliver Specht
Imports \binfsxn and \binsxn from RTF and w:paperSrc from docx files and applies paper tray to the page style if the printer supports the imported tray value. Works only on Windows. Change-Id: Ie1170c58f7114f0dbf6bdd2721d4e077886cbe16 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162236 Tested-by: Jenkins Tested-by: Gabor Kelemen <gabor.kelemen.extern@allotropia.de> Reviewed-by: Thorsten Behrens <thorsten.behrens@allotropia.de> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163119
2024-02-07writerfilter: fix missing paragraph break on tdf136445-1.rtfMichael Stahl
This causes an assert: crossrefbookmark.cxx:44: sw::mark::CrossRefBookmark::CrossRefBookmark(): Assertion `IDocumentMarkAccess::IsLegalPaMForCrossRefHeadingBookmark(rPaM) && "<CrossRefBookmark::CrossRefBookmark(..)>" "- creation of cross-reference bookmark with an illegal PaM that does not expand over exactly one whole paragraph."' failed. The problem is that there is an annotation at the end of a paragraph, and reading the annotation changes various members of DomainMapper_Impl, in particular m_bParaSectpr and m_bParaChanged, causing "bRemove" in DomainMapper::lcl_utext() to be erroneously true, removing the just inserted paragraph break again. Move all flags that are evaluated for bRemove to SubstreamContext. This causes one test failure, but it turns out that the new result is the same as in Word 2013. Test name: (anonymous namespace)::testTdf108947::TestBody equality assertion failed - Expected: Header Page 2 ? - Actual : Header Page 2 ? (regression from commit 15b886f460919ea3dce425a621dc017c2992a96b) Change-Id: I44a7a8928ab04c600d4d3c43bc4e4deeafe57d89 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162932 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 86ad08f9d25110e91e92a0badf6de75e785b3644) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162936 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-02-07writerfilter: move members to SubstreamContextMichael Stahl
writerfilter: move m_bParaHadField to SubstreamContext Change-Id: Ie15e35d304a423bfa3d7b7ead71015d5ec1228d4 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162839 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 4913812baeabd44b46302e54b73a227e760c688a) writerfilter: use SubstreamContext for all substreams <vmiklos> possibly just nobody needed that so far. could be some more general SubstreamContext, i don't see an obvious problem reusing that at more places. Change-Id: If0749155452f65f8dfc4ac2b10f91bb8e48a6b2b Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162840 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 95b01848b18283fd2f903c982108ccdb8efee022) writerfilter: move m_bFirstParagraphInCell to SubstreamContext This is a change to set it for all substreams. Change-Id: I44ed9a5485000f40f8ccfe3ec885ef8f05f5aab2 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162841 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 30323c813977eb0127251848fecd2532dce75749) writerfilter: replace members w/ SubstreamContext::eSubstreamType This should not change any behaviour. Change-Id: Ic970f0e1b6401119d875c9e811589b9c210e0c34 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162842 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 992f7114ab8645fb5b7a22b5f974a95fe7be7712) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162933 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-02-07tdf#158586 writerfilter: RTF import: fix \page \sect \sbknone w/ headerMichael Stahl
The problem was not fixed yet for the less-minimized bugzilla attachment where the sections contain headers and footers. What happened there is that first \page caused a deferred page break, then \sect and sectBreak() delayed-read the header substream and the \par in the header resets all the deferred break flags. Add the deferred break to an already existing Context class, and remove the direct members in DomainMapper_Impl in favor of always using the m_StreamStateStack. Probably this problem cannot occur for DOCX import, because it imports header/footer eagerly where the reference element is, and sectPr is before any runs that contain breaks in the same paragraph element. Change-Id: Iba971955e9cf0c398d416518e72d99307d3e1cfd Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162833 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 17e2c7226a73675d69febf0915aaeae61ad8e9f1) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162823 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
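Why keeping the deferred break on a per-substream state stack helps can be shown with a small standalone sketch. `SubstreamState` and `ImportState` are hypothetical, simplified stand-ins and not the actual DomainMapper_Impl members; the sketch only assumes that a paragraph end clears the deferred break of whichever stream is currently on top of the stack.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for per-substream import state.
struct SubstreamState
{
    bool deferredPageBreak = false;
};

class ImportState
{
public:
    // The bottom entry of the stack represents the body text.
    ImportState() { m_stack.emplace_back(); }

    void deferPageBreak() { m_stack.back().deferredPageBreak = true; }
    bool isPageBreakDeferred() const { return m_stack.back().deferredPageBreak; }

    // A paragraph end resets the deferred break of the current stream only.
    void endParagraph() { m_stack.back().deferredPageBreak = false; }

    // Entering a header/footer substream starts with fresh flags.
    void pushSubstream() { m_stack.emplace_back(); }

    // Leaving the substream restores the flags of the enclosing stream.
    void popSubstream()
    {
        assert(m_stack.size() > 1 && "cannot pop the body state");
        m_stack.pop_back();
    }

private:
    std::vector<SubstreamState> m_stack;
};
```

With this layout, a sequence like `deferPageBreak()`, `pushSubstream()`, `endParagraph()`, `popSubstream()` leaves the body's deferred page break intact, whereas a single shared flag would have been cleared by the paragraph end inside the header.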
2024-02-07tdf#158586 writerfilter: RTF import: handle \sect in frame as \parMichael Stahl
This fixes the test testTdf158586_0 and testTdf158586_0B to look like in Word; the case appears a bit esoteric, hopefully Word won't actually create such documents? But Word will round-trip such bugdoc to a DOCX where the first w:p contains all of w:framePr and w:sectPr and w:br... Change-Id: I6ec09478a774e1e9c785e9482618c1afc388df0e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162778 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit dbe78489e98d565b72a703524308523135ffdd67) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162822 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-02-07tdf#158586 writerfilter: RTF import: fix \page \sect \sbknoneMichael Stahl
The problem is that \page is actually completely ignored in the bugdoc testTdf158586_1. If you delete the \sbknone then there is a page break but it's caused by \sect; the \page is still ignored. It is ignored because, first, the \page results in a deferred break in DomainMapper, then for \sect, the synthetic \par is dispatched and that moves the break from deferred to the top paragraph properties context, then sectBreak() calls endParagraphGroup() which just removes the top paragraph properties context. Remove the dispatchSymbol(RTFKeyword::PAR) for \sect, instead set a flag so that RTFDocumentImpl::sectBreak() does it. Add a new flag m_bParAtEndOfSection so that RTFDocumentImpl::parBreak() can suppress the startParagraphGroup(), so the deferred break remains present. This also fixes testTdf158586_lostFrame. (regression from commit 15b886f460919ea3dce425a621dc017c2992a96b) Change-Id: I82a00899a9448069832a0b2f98a96df00da75518 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162770 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> (cherry picked from commit 57abad5cf990111fd7de011809d4421dc0550193) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/162821 Reviewed-by: Caolán McNamara <caolan.mcnamara@collabora.com>
2024-02-07tdf#158409: RTF import: use current run props for fieldsVasily Melenchuk
Fields import was not using the current run properties, causing a fallback to the used style or the default style, which is wrong for RTF. Change-Id: I0189c6122b703a23ff910ee38da78aa05ac4d9f8 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/160387 Tested-by: Jenkins Reviewed-by: Vasily Melenchuk <vasily.melenchuk@cib.de> Signed-off-by: Xisco Fauli <xiscofauli@libreoffice.org> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/163057