path: root/writerfilter
Age | Commit message | Author
2020-02-24 | sw: add DOCX import for semi-transparent text | Miklos Vajna
This is one of the text effects properties, which is already grab-bagged. That is done in a generic way, so the easiest is to just read the value from the grab-bag, rather than from the dmapper tokens directly. (cherry picked from commit 3a749d7278bbe65cfc063e64460df8af6bc2af47) Conflicts: writerfilter/CppunitTest_writerfilter_dmapper.mk writerfilter/source/dmapper/PropertyIds.cxx writerfilter/source/dmapper/PropertyIds.hxx Change-Id: Id74a3eb0abddf745a9e4e59625bf9345f7df9dfe Reviewed-on: https://gerrit.libreoffice.org/c/core/+/89220 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-02-18 | DOCX import: don't touch the ZOrder of inline, in-shape objects | Miklos Vajna
1) This is not needed: Word only supports inline "anchoring" in textboxes. 2) If the textbox has a certain ZOrder, we don't want the inline shapes to be behind the textbox. 3) This allows restoring the old assert in sw_ooxmlexport7 that was changed in commit 99847d6b3005c5444ed5a46ca578c0e40149d77c (DOCX import: fix ZOrder of inline vs anchored shapes, 2020-02-12). (cherry picked from commit 70ae12fe0b9e33633fc62cf805c261ef51fb4b59) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport7.cxx Change-Id: I817e4fb70cb789e8eb116219050fc1aeaec76667 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/88944 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-02-13 | DOCX import: fix ZOrder of inline vs anchored shapes | Miklos Vajna
Shapes which are anchored but are not in the background should be always on top of as-char anchored shapes in OOXML terms. Writer supports a custom ZOrder even for as-char shapes, so make sure that they are always behind anchored shapes. To avoid unnecessary work, make sure that when there are multiple inline shapes, we don't pointlessly reorder them (the old vs new style of the sorting controls exactly this, what happens when two shapes have the same ZOrder, and all inline shapes have a 0 ZOrder). Adapt a few tests that used ZOrder indexes to access shapes, but the intention was to just refer to a shape: fix the index and migrate to shape names where possible. (cherry picked from commit 99847d6b3005c5444ed5a46ca578c0e40149d77c) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport13.cxx sw/qa/extras/ooxmlexport/ooxmlexport4.cxx sw/qa/extras/ooxmlexport/ooxmlexport7.cxx Change-Id: I670e4dc2acbd2a0c6d47fe964cb5e3f2300e6848 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/88590 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
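The tie-breaking behaviour described in this commit can be illustrated with a stable sort. This is a minimal sketch in Python, not the writerfilter code: `Shape` and `order_shapes` are hypothetical names, and it assumes that every inline (as-char) shape carries ZOrder 0 while anchored foreground shapes carry a positive ZOrder.

```python
from typing import NamedTuple


class Shape(NamedTuple):
    name: str      # hypothetical shape name
    z_order: int   # assumed: 0 for inline shapes, > 0 for anchored ones
    inline: bool


def order_shapes(shapes):
    # sorted() is stable: shapes with an equal z_order keep their
    # relative document order, so inline shapes are never reshuffled.
    return sorted(shapes, key=lambda s: s.z_order)


shapes = [
    Shape("inline-a", 0, True),
    Shape("anchored", 5, False),
    Shape("inline-b", 0, True),
]
ordered = [s.name for s in order_shapes(shapes)]
```

With a stable sort the two inline shapes stay in document order and end up behind the anchored shape. An unstable sort would be free to swap them, which is the pointless reordering the commit avoids.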
2020-02-11 | DOCX import: fix margins of inline shape with effects, imported as Draw shape | Miklos Vajna
Effects have an extent, and unhandled effects (like this blurred shadow) need to take space in the margin of the shape to make sure they use the correct amount of space in the layout. This was working in general, but not in case the importer decided to import the shape as Draw shape + the shape was inline. (cherry picked from commit bf25e69f8f657d5e3bcdd0bd54c5fa0d66ec85fe) Conflicts: writerfilter/qa/cppunittests/dmapper/GraphicImport.cxx Change-Id: I9d0531d9393d8c2cd274e6f54bbbfe8024bf270f Reviewed-on: https://gerrit.libreoffice.org/c/core/+/88467 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-02-06 | DOCX import: don't give up on floating tables in headers completely | Miklos Vajna
This reverts commit 213d6390a2cc59d174173f4359c161625a9c4bdc (tdf#108272 DOCX table-only header: fix SAX parser error, 2020-02-03), except its testcase, and replaces it with a better fix that does not import all floating-table-in-header as non-floating tables. See the new testcase, which is 1 page in Word; it was 3 pages in Writer, and with the better fix it's now 1 page in Writer as well. (cherry picked from commit 3d6a5de8f4579187e5949b212e4625773bb20e6f) Conflicts: writerfilter/CppunitTest_writerfilter_dmapper.mk Change-Id: Ica3500120f12222d7cf766d55c17d78164865026 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/88086 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-02-06 | tdf#108272 DOCX table-only header: fix SAX parser error | László Németh
Floating tables in table-only headers are imported as non-floating ones after a SAX parser error. Now we import them as non-floating ones from the beginning to avoid the parser error. Change-Id: I0a816a7af642f402a25ed53d9766b1e8b82db789 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/87874 Tested-by: Jenkins Reviewed-by: László Németh <nemeth@numbertext.org> (cherry picked from commit 213d6390a2cc59d174173f4359c161625a9c4bdc) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/88085 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-02-04 | tdf#130214: workaround invalid state resulting from error on import | Mike Kaganski
Obviously the real error is somewhere else, which results in tdf#126435, and produces unexpected state with missing text append context on stack. This is just a hack to avoid crash. Change-Id: I420ac3b74f5efb9688dc764ac2ad0dcc974ba0e1 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/87595 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/87952 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-01-16 | DOCX import: improve support for headers/footers from cont sect break | Miklos Vajna
A continuous section break in a DOCX file can set new headers/footers which start appearing from the next page. In general, this is not something Writer supports, but commit 08f13ab85b5c65b5dc8adfa15918fb3e426fcc3c (tdf#112202 writerfilter,sw: fix loss of headers, 2019-12-16) improved the situation significantly here. Build on top of that and add support for a few more cases: 1) When checking for the last paragraph of the previous section, detect when that last paragraph is so small so in practice that's not the last one. 2) Handle when the text node in question has no explicit break type set, only a SwFormatPageDesc (which implicitly causes a page break). 3) When setting the page style to show the correct header/footer, don't set the page style on the previous paragraph directly, rather update the "next style" of the already set page style. (This is safe, as we never reuse the same Writer page style for multiple Word sections.) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport14.cxx (cherry picked from commit 9a50e8f2bcde95233e4b38707c521ada90fcd6af) Change-Id: Iede196b864af5123c5637f82432ed6e0f7e7241a Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86915 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-01-16 | tdf#112201 writerfilter: try to apply continuous section page style... | Michael Stahl
... on the last node of the previous section. This works for this particular document, but it's quite dubious that it will work in the general case; feel free to revert this if it causes problems. Change-Id: Ia03d41a1127df505c4e9da7131323b70d88a285f Reviewed-on: https://gerrit.libreoffice.org/85294 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@cib.de> (cherry picked from commit 3f680aef65a158cfbc98c8afd1c3628d7f4f7b83) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86913 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-01-16 | tdf#112201 writerfilter: continuous sections: | Michael Stahl
always replace break with follow-page-style, not first-page-style. It looks like Word ignores <w:titlePg> on continuous section breaks, unless it's the first section, which was already handled by code above. Change-Id: If7c0fe96a1789f64f1943ece453db3dbc284ca48 Reviewed-on: https://gerrit.libreoffice.org/85293 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@cib.de> (cherry picked from commit d7e9daf2d21d3bcafaa6aae4aed6c9df5e0999c4) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86912 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-01-16 | tdf#112202 writerfilter,sw: fix loss of headers | Michael Stahl
There are several problems here:
* CloseSectionGroup() is not only called for actual sections in the document but also at the end of every special text like comment, footnote, etc; only actual sections can set page styles. Writer comments use editengine so cannot even contain sections.
* With continuous section breaks, headers and footers are inherited from the previous section unless defined by the current section; SwXText::copyText() did not copy the content of the header on page 4 to page 5 correctly because it used an SwXTextCursor to create the selection, which cannot select the table at the start of the header.
* For continuous section breaks, WW8 import filter has a heuristic to find the first page break in the section and set the PageDescName property on that node to apply the page style with the headers of the new section; do something similar in writerfilter SectionPropertyMap::CloseSectionGroup()
Change-Id: I3ebe3d299f83197cbf8f10de46c34de98677626c Reviewed-on: https://gerrit.libreoffice.org/85213 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@cib.de> (cherry picked from commit 08f13ab85b5c65b5dc8adfa15918fb3e426fcc3c) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86911 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-01-16 | tdf#121670 ooxmlimport: no columns in page styles, only sections | Justin Luth
LIKELY TO EXPOSE SECTION EXPORT/IMPORT PROBLEMS. That is already happening somewhat because support for forms/protected sections was added in LO6.2. By making this change, it will help to expose problems faster, with the hope that they can still be fixed for 6.2. Columns in page styles are very problematic, because they don't let you override the number of columns (except to put sub-columns inside one of the existing columns). So, always attempt to insert a column into its own section, and never into the page style itself. I'm rather excited that this didn't cause any unit test failures. I've made a lot of section fixes over the years (and some this week which are required for the unit test to work). This change seems very natural, and gets rid of a regression-prone hack. I found all of the existing unit tests with columns and tested them. About 10 files - all look fine including complex files tdf81345.docx and tdf104061_tableSectionColumns.docx Change-Id: If02f1bfd91b1cf8210665244d0782ff926cc2869 Reviewed-on: https://gerrit.libreoffice.org/65557 Tested-by: Jenkins Reviewed-by: Justin Luth <justin_luth@sil.org> (cherry picked from commit 14087d3e5fed9b56384432d9aeac608a5e8d86cf) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86910 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-01-10 | tdf#128207: DOCX import: fix chart positioning | Bakos Attila
Embedded graphic objects previously got 0 values for vertical and horizontal positioning, resulting in overlapping, hidden charts, but now they are positioned according to the values in the document. (cherry picked from commit d9c535ead688e9f156dbcf43948df08a69e218be) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport14.cxx [ Miklos: reworked the testcase to use the UNO API, the other way would only work on master. ] Change-Id: Ia5403ac65ff7192d61072e8a9d8a7f80c7178b9b Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86557 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-01-09 | tdf#129205 DOCX import: handle the <w:shd w:val="nil" ...> paragraph property | Miklos Vajna
Reading the spec, "nil" is the opposite of "clear": i.e. if the (background) color is red and the fill (color) is green, then "clear" means green. And you would expect "nil" means red, but it's just nothing in Word. Fix the problem by doing the same: don't set any paragraph property for the "nil" case and keep doing it for the common "clear" case. (cherry picked from commit fbe7612d654be9dfe1ea6f2e67900eb4eec4202a) Change-Id: I30af8a7fb55fb9bab2d12e120069a479fc7ab0a9 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86098 Tested-by: Jenkins Reviewed-by: Adolfo Jayme Barrientos <fitojb@ubuntu.com> (cherry picked from commit dfd75ec6d4dcfec57607a8cf7c7a509c33bf2caa) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86344 Reviewed-by: Andras Timar <andras.timar@collabora.com> (cherry picked from commit 62eee51aeaee380139126e21ac550e6e367164ab) Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86479 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
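The "nil" versus "clear" decision described in this commit can be sketched as a small function. This is an illustrative Python sketch under stated assumptions, not the dmapper code: `paragraph_background` is a hypothetical name, colours are plain hex strings, and shading patterns other than "nil" and "clear" are deliberately left out.

```python
def paragraph_background(shd_val, fill):
    """Return the background colour to set, or None to set no property."""
    if shd_val == "nil":
        # Word renders nothing for "nil", so no paragraph property is set.
        return None
    if shd_val == "clear":
        # The common case: the fill colour becomes the background.
        return fill
    # Other shading patterns mix colour and fill; out of scope here.
    raise NotImplementedError(shd_val)


nil_result = paragraph_background("nil", "00FF00")
clear_result = paragraph_background("clear", "00FF00")
```

Returning `None` rather than a default colour mirrors the point of the fix: for "nil" the importer sets no property at all, so whatever the paragraph would otherwise inherit stays in effect.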
2020-01-09 | DOCX import: fix lost page break when footer ends with a table | Miklos Vajna
Regression from commit 7d3778e0ef9f54f3c8988f1b84d58e7002d6c625 (bnc#816593 DOCX import: ignore page breaks in tables, 2013-09-02), the page break was ignored because the preceding footer ended with a table (no empty paragraph at the end of the footer stream). Fix the problem by saving/loading the table state around header/footers, that way the page break is not ignored. Adjust testTdf102466 to test the page number from Word. (cherry picked from commit a86a2a1c1ceb7203857d4317913c5b1bb9feb4aa) Conflicts: writerfilter/source/dmapper/DomainMapper_Impl.hxx Change-Id: Ia4c22452ee2c37f7f941dfd922db04c851644d0c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/86451 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-01-08 | tdf#129353, tdf#129402: fix node creation on index import | Mike Kaganski
ToC, bibliography, and index sections import code changed to closely follow what Word does, make sure that pre-rendered entries don't get imported as standalone paragraphs outside of the index sections, and paragraph count is accurate (no missing or added paragraphs as much as possible).

In Word, an index may start and end in the middle of a paragraph:

    <w:p>
      <w:r> <w:t>Some text before index</w:t> </w:r>
      <w:r> <w:fldChar w:fldCharType="begin"/> </w:r>
      <w:r> <w:instrText> TOC ...</w:instrText> </w:r>
      <w:r> <w:fldChar w:fldCharType="separate"/> </w:r>
      <w:r> <w:t>First pre-rendered index entry</w:t> </w:r>
    </w:p>
    ...
    <w:p>
      <w:r> <w:t>Last pre-rendered index entry</w:t> </w:r>
      <w:r> <w:fldChar w:fldCharType="end"/> </w:r>
      <w:r> <w:t>Some text after index</w:t> </w:r>
    </w:p>

However, normally it looks like either no runs preceding the index, or no runs of pre-rendered contents will be present. When no Std elements are used, the typical situation is that there's a normal paragraph (possibly with some user text), which ends with the index start marker, without any pre-rendered contents in the same paragraph; and all pre-rendered contents go in the following paragraphs. Such an index normally ends with the index end marker in the *first* run of a paragraph, which then might have normal text runs. When Stds are used, then no leading/trailing out-of-index runs in paragraphs with marks are usually present; and in this case, when paragraphs with index marks don't contain pre-rendered entries, they are still treated as part of the index.

In Writer, indexes are node sections (and so cannot be inline with other paragraph contents). When there was some paragraph content already before the start-of-index mark, the paragraph is assumed to end before the index; in this case, when the current <w:p> element ends, the importer decides whether a separate starting paragraph is needed, depending on whether there were any runs after the mark.
When there was no text runs before the starting mark, then the paragraph is treated as leading paragraph of the index. This allows to not miss empty paragraphs before index; and not have two paragraphs where there was one in Word. Only in cases when user had manually typed text both in and outside of the index in the same paragraph in Word, we would have the paragraph split into two in Writer. For end marks, the behaviour depends on whether it's inside Std. When inside, the ending paragraph starting with index end mark is considered part of the index. For out-of-Std case, it's considered normal paragraph (and measures are taken to make sure it's not dropped even if empty, because sometimes such paragraphs don't have other content, and have section settings, which is usually treated by Writer as "drop this paragraph" sign). A special problem is multi-column index. It's wrapped into a continuous section by Word; and in Writer, we also wrap it into a section. It would be possibly useful to detect somehow if this section is part of index definition, and in this case, drop the section and put its properties into the Writer's index section. That would avoid an explicit section in the imported document. This is TODO, for someone who figures how to detect reliably if the section belongs to index definition. See comment in DomainMapper_Impl::appendTextSectionAfter. By the way, current export code is wrong, producing an index that is single-column in Word; this change doesn't touch that. Several existing tests needed to be fixed, which used to test wrong results. Change-Id: I9597c8ab13f31ded9abcc24054d3478d3e3a3b40 Reviewed-on: https://gerrit.libreoffice.org/85089 Tested-by: Jenkins Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/c/core/+/85278 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
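The first decision described in this commit, whether any text precedes the start-of-index mark within the same paragraph, can be sketched against the XML quoted above. This is a minimal Python sketch using the standard library, not the importer's logic: `text_before_field_start` is a hypothetical helper, and the sample paragraph is wrapped with the WordprocessingML namespace declaration that the quoted fragment omits.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the "w" prefix in DOCX files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def text_before_field_start(paragraph):
    """True if a non-empty text run occurs before the first field 'begin' mark."""
    for run in paragraph.findall(f"{{{W}}}r"):
        fld = run.find(f"{{{W}}}fldChar")
        if fld is not None and fld.get(f"{{{W}}}fldCharType") == "begin":
            return False  # reached the mark without seeing any text
        text = run.find(f"{{{W}}}t")
        if text is not None and text.text:
            return True
    return False


SAMPLE = f"""<w:p xmlns:w="{W}">
  <w:r><w:t>Some text before index</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:instrText> TOC ...</w:instrText></w:r>
</w:p>"""

paragraph = ET.fromstring(SAMPLE)
needs_split = text_before_field_start(paragraph)
```

In the commit's terms, a `True` result corresponds to the case where the paragraph is assumed to end before the index, and `False` to the case where the paragraph is treated as the leading paragraph of the index. The sketch ignores Std wrappers and the end-mark handling entirely.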
2019-12-20 | DOCX table import: fix interaction of 1-cell rows and "inside" vertical borders | Miklos Vajna
The interesting part of the bugdoc was:
- table style wants visible borders
- table direct formatting clears left and right borders
- 1st row of the table has 1 cell (2 cells in fact, but they are merged)
Fix the "inside" vertical border handling, so that the first cell gets these vertical borders as a right border only in case there are multiple cells.
(cherry picked from commit fd92740a86ab8e71e77d947d1d7dabc51a8d0794) Change-Id: Id847109ecfa95d1745abe62ddf36c4936b730855 Reviewed-on: https://gerrit.libreoffice.org/85578 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
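The rule in this commit reduces to one condition on the number of cells in the row. A minimal Python sketch, assuming a hypothetical `first_cell_right_border` helper and a cell count taken after merged cells have been collapsed:

```python
def first_cell_right_border(inside_v_border, cell_count):
    """Right border for the first cell of a row, or None for no border.

    inside_v_border: the table's "inside" vertical border (any value).
    cell_count: number of cells in the row after merging (assumption).
    """
    # An "inside" vertical border only exists between two cells, so a
    # single-cell row must not receive it as a right border.
    return inside_v_border if cell_count > 1 else None


single = first_cell_right_border("single-line", 1)
multi = first_cell_right_border("single-line", 3)
```

For the bugdoc's first row (one cell after merging) this yields no right border, so the cleared left and right table borders from the direct formatting stay cleared.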
2019-12-17 | Related: tdf#115719 DOCX import: fix increased spacing vs left-aligned objects | Miklos Vajna
Commit 8b73bafbc18acb4dd8911d2f2de8158d98eb6144 (tdf#115719 DOCX import: increase paragraph spacing for anchored objects, 2018-02-14) added an import-time tweak for a problem that has been confirmed to be a Word layout bug in the meantime (and the tweak makes Writer behave the same way if the document has been created by an affected Word version for layout compatibility). Later, commit 4883da6fd25e4645a3b30cb58212a2f666dae75a (Related: tdf#124600 DOCX import: ignore left wrap on left-aligned shapes, 2018-02-14) fixed left spacing of anchored objects aligned to the left, to be in sync with what the DOC import does. This broke the previous fix in case the shapes are left-aligned. Fix the problem by tracking what is the in-file-format and logical left margin, so the final doc model has the value necessary for correct horizontal positioning and the importer has the value that's necessary for correct vertical positioning. (cherry picked from commit 814cb2433da6bd608e935fa5531d2a2b92867985) Conflicts: writerfilter/source/dmapper/PropertyMap.cxx Change-Id: I8f16cbe7bad40e243111c902bdc1ab0e8141d6b9 Reviewed-on: https://gerrit.libreoffice.org/85265 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2019-10-31 | tdf#126544 writerfilter: check parent style exists before assigning | Justin Luth
If you set the parent style to a style that is not yet created, then it silently fails, and thus inherits from nothing! Change-Id: Ibb85235643dd5b1eb9b0bd43f701580f24b2b7fa Reviewed-on: https://gerrit.libreoffice.org/76805 Tested-by: Jenkins Reviewed-by: Justin Luth <justin_luth@sil.org> (cherry picked from commit b47a8f091ad8f9048a6b7962e9cde5d04ea0d665) Reviewed-on: https://gerrit.libreoffice.org/81749 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
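The failure mode described in this commit (a parent assignment that silently does nothing when the parent has not been created yet) can be sketched as follows. This is a Python illustration with hypothetical names (`set_parent`, `import_styles`). Retrying deferred assignments after all styles exist is an assumption made for this sketch; the commit message only states that existence is checked first.

```python
def set_parent(styles, name, parent):
    """Assign `parent` to style `name` only if the parent already exists."""
    if parent not in styles:
        return False  # assigning now would silently fail
    styles[name]["parent"] = parent
    return True


def import_styles(definitions):
    """definitions: list of (name, parent_or_None) in file order."""
    styles = {}
    deferred = []
    for name, parent in definitions:
        styles[name] = {"parent": None}
        if parent is not None and not set_parent(styles, name, parent):
            # Assumption of this sketch: remember the pair and retry later.
            deferred.append((name, parent))
    for name, parent in deferred:
        set_parent(styles, name, parent)
    return styles


# "Heading 1" is defined before its parent "Heading".
styles = import_styles([("Heading 1", "Heading"), ("Heading", None)])
```

Without the existence check, "Heading 1" would end up inheriting from nothing simply because of the order in which the two styles appear.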
2019-10-04 | Related: tdf#124600 DOCX import: ignore left wrap on left-aligned shapes | Miklos Vajna
The DOC import does the same in SwWW8ImplReader::AdjustLRWrapForWordMargins(). This fixes one sub-problem of the bugdoc, so now the shape anchored in the header has a correct position. This made it necessary to re-visit the tdf#115719 testcases, which are minimal reproducers. The original document had from-left alignment (instead of align-to-left), but this did not matter before. Bring the test documents closer to the original large document, so the tests still pass and don't depend on LO mis-handling the above mentioned left-aligned situation. (The interesting property of tdf115719.docx, where Word 2010 and Word 2013 handle the document differently, is preserved after this change.) (cherry picked from commit 4883da6fd25e4645a3b30cb58212a2f666dae75a) Change-Id: I973c13df47b0867e2c4756f0c448495257b7c9d5 Reviewed-on: https://gerrit.libreoffice.org/80041 Tested-by: Jenkins Reviewed-by: Xisco Faulí <xiscofauli@libreoffice.org>
2019-10-04 | Revert "tdf#117988 writerfilter: IgnoreTabsAndBlanksForLineCalculation" | Justin Luth
This reverts LO 6.2 commit 49ddaad2f3ba4e17e1e41e94824fb94468d2b680. tdf#127617 proves it simply was not the correct solution. I replaced the unit test document with one that clearly demonstrates that spaces/tabs should NOT be used in line height calculations. Example document tested with Office 2003, 2010, 2016. Change-Id: I2833384a017526d665adef0cae968bc4aef0dd94 Reviewed-on: https://gerrit.libreoffice.org/79473 Reviewed-by: Justin Luth <justin_luth@sil.org> Tested-by: Justin Luth <justin_luth@sil.org> (cherry picked from commit 202bee1a819de7b1e8c75dd863c4154f66419400) Reviewed-on: https://gerrit.libreoffice.org/79484 Tested-by: Jenkins Reviewed-by: Xisco Faulí <xiscofauli@libreoffice.org> (cherry picked from commit b0c5bc47d0d170df1384dd48cee9291ce6044083) Reviewed-on: https://gerrit.libreoffice.org/79527
2019-09-22 | tdf#123636 writerfilter: handle deferred breaks on frames | Justin Luth
and... related tdf#123636 writerfilter: split newline also if PAGE_BREAK ...but only with old MSWord compat flag SplitPgBreakAndParaMark. All of the other cases (COLUMN_BREAK and non-empty runs) split the paragraph, so why not here? This document shows it is needed, but only for SplitPgBreakAndParaMark documents. Note: Word 2003 doesn't display "modern" docx well in this regard. It adds extra paragraphs where it shouldn't. There are already example unit tests that ensure that extra paragraphs aren't written for SplitPgBreakAndParaMark == false. The actual bug's document is not related to the compatibility flag. That will be handled in separate commit. Reviewed-on: https://gerrit.libreoffice.org/70835 Tested-by: Jenkins Reviewed-by: Justin Luth <justin_luth@sil.org> (cherry picked from commit 89e44da1ab450f6e2f4106103efd169227683f20) tdf#123636 writerfilter: handle deferred breaks on frames ...similar to handling breaks before shapes in lcl_startShape. Three different examples found to create/split a paragraph. Which one to use? (addDummy, m_bIsSplitPara, and lcl_startCharacterGroup). SplitPara is not good because the paragraph properties probably should not be copied to the dummy paragraph (like numbering for example). Slightly modified the lcl_startChar example to ensure that the dummy paragraph doesn't steal a part of the properties, but is only default properties plus page-break. This doesn't export very well, so roundtripping is very poor. Research Note: There exists a compat flag showBreaksInFrames (Display Page/Column Breaks Present in Frames) "This element specifies whether applications should honor the presence of page and/or column breaks which are present within the contents of paragraphs which have been defined as frames using the framePr element." --Currently, LO does nothing with this flag. Probably too exotic and irrelevant (word 2003 era?). No existing unit tests found that have isSet(pg_brk) frames. 
Reviewed-on: https://gerrit.libreoffice.org/71255 Tested-by: Jenkins Reviewed-by: Justin Luth <justin_luth@sil.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.com> (cherry picked from commit f6f53f76e15f5eecc5b6ce56e471c53cebfea8ad) Change-Id: I29f815355401c7af8b347a3ed9d0298bc9b27b93
2019-08-29 | tdf#126723 writerfilter::finishParagraph - me, not previous | Justin Luth
In LO 6.2 commit 480ac84f2f5049fb4337b36f12fd6796e005761b the existing m_xPreviousParagraph was conveniently used to apply the changed properties. I never did like that choice, but despite looking at it, I failed to see that it is set in an inside loop, which means that it was NOT NECESSARILY reset to the current paragraph. So I'm happy to have proof that we should not use m_xPreviousParagraph. Reviewed-on: https://gerrit.libreoffice.org/77185 Tested-by: Jenkins Reviewed-by: Justin Luth <justin_luth@sil.org> Reviewed-by: László Németh <nemeth@numbertext.org> (cherry picked from commit d03c92b93d6ba1808a6641b4aa8cb4aae38058bf) Change-Id: I5c7f1b0f097711d65ae0d0be1f0fbc40c8b96e9d Reviewed-on: https://gerrit.libreoffice.org/77249 Tested-by: Jenkins Reviewed-by: Michael Stahl <Michael.Stahl@cib.de> Reviewed-on: https://gerrit.libreoffice.org/77906 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Andras Timar <andras.timar@collabora.com>
2019-08-23 | tdf#119809: FILESAVE DOCX The combo box ActiveX control is lost | Tamás Zolnai
The problem was with the empty combobox. The implementation before this commit imported a combobox only when the combobox had any item. Reviewed-on: https://gerrit.libreoffice.org/78024 Tested-by: Jenkins Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 3ceefe9abff98fc24ffb5e8e405f4999faddc351) Change-Id: I945098277d1ed34c65b43f0f6ad8eb361cf41b53
2019-08-21 | tdf#123702 RTF/DOCX default 1" left/right page margins | László Németh
since MSO 2007: now 1440 twips = 2540 1/100 mm (it was 1800 twips = 3175 1/100 mm). Changing the default value fixes the layout of the documents based on RTF templates without explicit margins. Change-Id: I0395fb7cdd6ba176f266c8f0a9309ba48a047da3 Reviewed-on: https://gerrit.libreoffice.org/76812 Tested-by: Jenkins Reviewed-by: László Németh <nemeth@numbertext.org> (cherry picked from commit 2550b380e8f81528aa2dde5790c3b607c068ee1a) Reviewed-on: https://gerrit.libreoffice.org/77000 Reviewed-by: Samuel Mehrbrodt <Samuel.Mehrbrodt@cib.de> Tested-by: Samuel Mehrbrodt <Samuel.Mehrbrodt@cib.de> Reviewed-on: https://gerrit.libreoffice.org/77895 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
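The two figures quoted in this commit follow from the unit definitions: 1 inch is 1440 twips and 2540 hundredths of a millimetre. A short Python check, with `twips_to_mm100` as a hypothetical helper name:

```python
def twips_to_mm100(twips):
    """Convert twips (1/1440 inch) to 1/100 mm, rounding half up."""
    # 1 inch = 1440 twips = 2540 hundredths of a millimetre.
    return (twips * 2540 + 720) // 1440


new_default = twips_to_mm100(1440)  # 1 inch
old_default = twips_to_mm100(1800)  # 1.25 inch
```

Integer arithmetic is used so that the result does not depend on floating-point rounding; both defaults mentioned in the commit convert exactly.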
2019-08-15 | sw btlr writing mode: RTF filter of Writer textframes | Miklos Vajna
Both import and export needed fixing. (cherry picked from commit be92468ae3595d4384510a9cc0e15629c7cb0692) Conflicts: sw/qa/extras/rtfexport/rtfexport4.cxx Change-Id: Ie1728c3e67d8637e3720748d7f61a69c058eafe3 Reviewed-on: https://gerrit.libreoffice.org/77506 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2019-08-09 | sw btlr writing mode: RTF filter of Writer tbrl textframes | Miklos Vajna
Fix both import and export. Values 1 and 3 seem to be the same. Accept both on import, but write 3, as DOCX only has a single value and Word uses 3 when doing DOCX->RTF conversion. Reviewed-on: https://gerrit.libreoffice.org/76823 Tested-by: Jenkins Reviewed-by: Miklos Vajna <vmiklos@collabora.com> (cherry picked from commit 8f36d40426fa83bf7923a818377cc50048199dfd) Conflicts: sw/qa/extras/rtfexport/rtfexport4.cxx Change-Id: Ic5420091ffee9eb20c6aaac61a127e93289aa9fe Reviewed-on: https://gerrit.libreoffice.org/77200 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2019-08-05 | pretty up logging of exceptions | Noel Grandin
Add exceptionToString() and getCaughtExceptionAsString() methods in tools. Use the new methods in DbgUnhandledException(). Add special-case code for most of the exceptions that contain extra fields, so all of the relevant data ends up in the log. Change-Id: I376f6549b4d7bd480202f8bff17a454657c75ece Reviewed-on: https://gerrit.libreoffice.org/67857 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2019-08-05 | tdf126701: MSForms: Fix import of date field at the end of the paragraph. | Tamás Zolnai
We need to create date field before the paragraph is finished (line break is inserted). A date field can not span between paragraphs. Extend other related unit tests too. In other use cases, the field content changes to an invalid data. Reviewed-on: https://gerrit.libreoffice.org/76971 Tested-by: Jenkins Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit b36ef83ea59eeaca239e58b95aa0b1acdcb99efc) Change-Id: Id274649e0aaaf6e3c31e042afd126cefc368c858 Reviewed-on: https://gerrit.libreoffice.org/76977 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com>
2019-08-04 | tdf#126173 RTF import: fix lost SHAPE fields | Miklos Vajna
Commit 5a5d55a8a0f82406a8001015a723596f21d3562c (fdo#82860 RTF import: fix handling of SHAPE fields, 2014-10-15) already tried to handle this, but aCode is the shape command + its parameters (SHAPE \* MERGEFORMAT) for the bugdoc, while what we want is just the shape command. The field variable already contains a tokenized version, which was used previously only to decide if a field is unhandled or not. Reuse that for the shape comparison, so bugdoc's shape with parameters also appears. Change-Id: I7e044b94bcfab490c956b33c11dd6c69443939f5 Reviewed-on: https://gerrit.libreoffice.org/75243 Tested-by: Jenkins Reviewed-by: Miklos Vajna <vmiklos@collabora.com> (cherry picked from commit 9a15a75dfa7ab8c5d51c411e0e39d68d22b7587a) Reviewed-on: https://gerrit.libreoffice.org/75288 Reviewed-by: Xisco Faulí <xiscofauli@libreoffice.org> (cherry picked from commit 0e6fdee15df8928c33308b353a7b80de150aca6b) Reviewed-on: https://gerrit.libreoffice.org/75295 Reviewed-by: Michael Stahl <Michael.Stahl@cib.de> (cherry picked from commit 45cf5d55221b92e395948cb2e36d6ae6f056b1a3) Reviewed-on: https://gerrit.libreoffice.org/76919 Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com> Reviewed-by: Andras Timar <andras.timar@collabora.com>
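The difference between comparing the whole field code and comparing only the command can be shown in a few lines. This is a Python sketch, not the RTF tokenizer's code: `field_command` is a hypothetical helper, and splitting on whitespace is an assumption about how the tokenized form is obtained.

```python
def field_command(instruction):
    """Return the first whitespace-separated token of a field instruction."""
    tokens = instruction.split()
    return tokens[0] if tokens else ""


code = r"SHAPE \* MERGEFORMAT"          # command plus parameters, as in the bugdoc
whole_matches = code == "SHAPE"          # comparing the whole code fails
command_matches = field_command(code) == "SHAPE"
```

Comparing the full code against "SHAPE" fails as soon as parameters are present, which is why the shape in the bugdoc was lost; comparing the command token alone succeeds.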
2019-07-23 | MSForms: DOCX filter: fix crash when the date field is inside a shape | Tamás Zolnai
Change-Id: Ida6ff48e6e743e41dd793e31c11065f870e8959b Reviewed-on: https://gerrit.libreoffice.org/76117 Tested-by: Jenkins Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit d163b651dc3dd017cdb3327d87a7cf88003238e9)
2019-07-17 | MSForms: Rework text-based date form field's representation | Tamás Zolnai
* Better to represent it similar to text form field with two marking characters selecting a text range
* So the text between the two marks can be anything (not only a well formatted date) and also have any character formatting.
* With this we handle the case when the user needs a placeholder text in the date field or when the user needs time values (hour, minute, sec) next to the date.
Reviewed-on: https://gerrit.libreoffice.org/75459 Tested-by: Jenkins Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 68e1be4ccbb90ee9a788962219a88312c4ffbea2) Change-Id: Id60a50a2028058f8a6a080e265c0730d88b98543
2019-07-17MSForms: DOCX filter: The new text-based date field is allowed in the header.Tamás Zolnai
Change-Id: I71d61c702ccd0470c4c3df09531704783c1b3e01 Reviewed-on: https://gerrit.libreoffice.org/75457 Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> Tested-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit a50d82eca96d04b4cea1ea2c7b3610bf9ed951f0)
2019-07-17MSForms: DOCX filter: handle date formats with quotation marks.Tamás Zolnai
Reviewed-on: https://gerrit.libreoffice.org/75454 Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> Tested-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 9c2feb75a6104d4376cccb157244dd7f6e88968a) Change-Id: I61cc6d47200acdd55f147b4f1829330dec8562a0
2019-07-17MSForms: DOCX filter: import manually set date field as plain textTamás Zolnai
In MSO the user can add any text in the date field without having it in the specified date format. We import this kind of date as plain text. Change-Id: Ied4bf03a3ac4c9b6f1cfc78d91e6a52ad3d6e179 Reviewed-on: https://gerrit.libreoffice.org/75452 Tested-by: Jenkins Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 52205f85582aaaee04fcfffd1c1729454f512400)
2019-07-17MSForms: DOCX import of text-based date fieldTamás Zolnai
* Before the date content control was imported as LO specific date form control, but now I changed it to be imported into the compatible text-based date field. * Also removed the things stored in the grabbag, which are useless now. * Disabled some unit tests, I'll update them for the new field in other patches. Reviewed-on: https://gerrit.libreoffice.org/75447 Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> Tested-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit df4fe4504f6d966d1d92433862dc1baf2ba008d4) Change-Id: Ide8f4b27ec6b2dbb182abb4180229736bf9c434f
2019-07-05sw btlr writing mode: handle import from VMLMiklos Vajna
Instead of the character-level rotation added in commit 8738ded7bb1bb6262fe1038e310b5110407f4cfa (fdo#69636 VML import: handle mso-layout-flow-alt shape prop for sw frames, 2013-09-26) which does not work for multiple paragraphs. (cherry picked from commit bffe6a496fb1c69499770d96fefd7a3609712676) Conflicts: writerfilter/source/dmapper/DomainMapper_Impl.cxx Change-Id: Ibe9a85d7f880846edfd1f4594c03b0617d83a965
2019-07-04tdf#124344 sw btlr writing mode, DOCX import: fix vertical alignmentMiklos Vajna
The hack added in commit 3325e0f206ce864730468c3556ce06760042c157 (bnc#865381 DOCX import: handle w:jc=center inside w:textDirection=btLr, 2014-07-02) is no longer needed, actually just reverting it fixes the problem, as then layout does the right thing. No need to center paragraph adjustment to any kind of vertical orientation, now that we have proper layout support. Change-Id: I6aa74f5289a014c148fbd7c7ab03ec885d931daf Reviewed-on: https://gerrit.libreoffice.org/70610 Tested-by: Jenkins Reviewed-by: Miklos Vajna <vmiklos@collabora.com> (cherry picked from commit 0013f21ecd918e0541f165c3526a58f42dd75481)
2019-07-03sw comments on frames: fix DOCX handlingMiklos Vajna
We used to ignore annotation marks which just cover the comment anchor since commit fff019debf14a0bf8cd358591a686191347f1542 (MSWordExportBase: ignore empty annotation marks, 2014-09-17), but this means comments on images are lost. Pass around SwWW8AttrIter, so we can decide if we have a relevant at-char anchored frame in MSWordExportBase::GetAnnotationMarks(), without iterating over all frames in the document, which would be slow for large documents. Regarding the import side, the only problem was that the empty comment range resulted in a loss of annotation marks; fix that by using a marker while inserting. Change-Id: I385677d74423bc05824dac4a12d1a991bb3983c4 Reviewed-on: https://gerrit.libreoffice.org/74996 Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Tested-by: Jenkins (cherry picked from commit 7fa96a3e4bdd384ad411e0bdc4e7c3f2ab920279)
2019-07-03sw btlr writing mode: implement DOCX filterMiklos Vajna
Replace the old trick with character-level rotation with the usage of the new writing direction. This means that finally table cells with btlr text direction and multiple paragraphs show all content, not only the first paragraph, as before (seen as data loss by users). (cherry picked from commit 8fdbda18b593e7014e44a0fd590bbf98d83258b7) Conflicts: writerfilter/source/dmapper/DomainMapperTableManager.cxx Change-Id: I094f36fa6ba0701579e487e8e0212707987b1b2f
2019-06-29tdf#126114 - Form fields are displayed twice (double)Tamás Zolnai
We need to make sure that IsFieldResultAsString() returns true for a drop-down field, so that the placeholder string is ignored. Change-Id: I127800bdff78eb68e000fdbfe433bc88181ac2c3 Reviewed-on: https://gerrit.libreoffice.org/74752 Tested-by: Jenkins Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 8e5982d799e23bee86404f3ccb3aaed524ae9675) Reviewed-on: https://gerrit.libreoffice.org/74802 Tested-by: Tamás Zolnai <tamas.zolnai@collabora.com>
2019-06-19MSForms: Introduce a new IFieldMark for drop-down form fieldTamás Zolnai
* It was weird anyway that a drop-down form field was represented as a CheckboxFieldmark. * It will be useful for later commits to have a separate field type for the drop-down field. * Needed to fix up the API a bit because it was designed to specify the field type after initialization. I solved it in a way that does not break the API behavior. Hopefully it's not very slow. Reviewed-on: https://gerrit.libreoffice.org/68960 Tested-by: Jenkins Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit f66a83c95c21b4311918a64bb85016857b49f4d4) Change-Id: I3103e6b1c36289b27b62ab9ca7dfeebc14901c8a Reviewed-on: https://gerrit.libreoffice.org/69194 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2019-06-19tdf#122658: Empty date form field is not exported correctly to DOCX fileTamás Zolnai
We need to export date format and also text content in case of empty date field. Otherwise the exported date field will be lost during import into LO Writer or MSO Word. Reviewed-on: https://gerrit.libreoffice.org/66194 Tested-by: Jenkins Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 24613d7abf820aff639a276a1819ada8d83e9063) Change-Id: I5cf65bedba010f64ca8f56262057f3cce32b0943 Reviewed-on: https://gerrit.libreoffice.org/66289 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2019-06-18DOCX import: fix unexpected page break on autotext insert at end of docMiklos Vajna
The problem was that the page style was set on the first paragraph, which means a page break on the UI. So if you used a multi-paragraph autotext twice (insert autotext, press enter, insert autotext again) then you ended up with 2 pages instead of just 1. Fix the problem by tracking when we are in autotext import mode, and similar to pasting, don't set the page style in autotext import mode. Change-Id: I4fc551b3c1b999687eb80242e261f186fd1b6f13 Reviewed-on: https://gerrit.libreoffice.org/69214 Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Tested-by: Jenkins (cherry picked from commit adcf656bb56e09fbb638a44b0cccc96f8cfced7f)
2019-05-24tdf#123460 DOCX track changes: moveFrom completelyLászló Németh
Handle moveFrom including the paragraph mark, so that no empty paragraph is left at the original place of the moved text. Note: the desktop version of MSO behaves the same way; interestingly, its online version leaves empty paragraphs. Change-Id: I03dda8997df3efbc82e936bd31a3813323e6b5ab Reviewed-on: https://gerrit.libreoffice.org/71382 Reviewed-by: László Németh <nemeth@numbertext.org> Tested-by: László Németh <nemeth@numbertext.org> (cherry picked from commit d32d9a2b3c5e3963f4a18f6c7bbf50fab2e9b2be) Reviewed-on: https://gerrit.libreoffice.org/72718 Reviewed-by: Michael Stahl <Michael.Stahl@cib.de> Tested-by: Jenkins
2019-05-10tdf#124594 DOCX filter: don't extend margins from effects for rotated shapesMiklos Vajna
Regression from commit a5a836d8c43dc9cebbbf8af39bf0142de603a7c6 (DOCX filter: effect extent should be part of the margin, 2014-12-04), the problem was that extending margins as-is based on the effect extent values only works correctly in case of non-rotated shapes. For example, with a 90-degree clockwise rotation the top effect extent should extend the right margin, etc. Fix the bug by limiting this extension to the non-rotated scenario. Test the behavior at a layout level, so in case later the effect extent feature is implemented, it won't be necessary to adjust the test. (cherry picked from commit 65420c21194a28aeead0238838028b734b663d87) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport13.cxx Change-Id: I97271bbb7c079951980b436cb8d8e5e54eeead55 Reviewed-on: https://gerrit.libreoffice.org/71893 Tested-by: Jenkins Tested-by: Xisco Faulí <xiscofauli@libreoffice.org> Reviewed-by: Michael Stahl <Michael.Stahl@cib.de>
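The restriction described above can be sketched as a guard around the margin extension. This is only an illustration under stated assumptions: `Margins`, `EffectExtent`, `applyEffectExtent` and the integer rotation value are hypothetical names and types, not the ones used in the DOCX filter.

```cpp
// Hypothetical value types for the sketch; units are left unspecified.
struct Margins
{
    long left;
    long top;
    long right;
    long bottom;
};

struct EffectExtent
{
    long left;
    long top;
    long right;
    long bottom;
};

// Extends each margin by the matching effect extent, but only when the
// shape is not rotated. For a rotated shape the sides no longer line up
// (e.g. at 90 degrees clockwise the top extent belongs to the right
// margin), so the margins are returned unchanged.
Margins applyEffectExtent(Margins margins, const EffectExtent& extent, long rotation)
{
    if (rotation != 0)
        return margins;
    margins.left += extent.left;
    margins.top += extent.top;
    margins.right += extent.right;
    margins.bottom += extent.bottom;
    return margins;
}
```

Skipping the extension for rotated shapes is the conservative choice the commit message describes; remapping the sides per rotation angle would be a different, larger change.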
2019-04-15tdf#124670: xml:space attribute may be specified for w:document root elementMike Kaganski
Treat xml:space specially in OOXMLFastContextHandler::startFastElement, to allow this attribute to be handled for any element. Change-Id: I81bd1e0642940ffdfc03d6c65d0ce9f421206c5e Reviewed-on: https://gerrit.libreoffice.org/70723 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/70725 Tested-by: Jenkins Reviewed-by: Michael Stahl <Michael.Stahl@cib.de>
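Handling one attribute up front, before any element-specific logic runs, can be sketched like this. The class below is a hypothetical stand-in and does not reproduce the OOXMLFastContextHandler interface; the attribute map and the `preserveSpace` flag are assumptions made for the example.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a context handler.
struct SketchContextHandler
{
    bool preserveSpace = false;

    void startElement(const std::string& /*elementName*/,
                      const std::map<std::string, std::string>& attributes)
    {
        // xml:space is checked for every element, including the root,
        // instead of only for elements that list it explicitly.
        const auto it = attributes.find("xml:space");
        if (it != attributes.end())
            preserveSpace = (it->second == "preserve");

        // Element-specific handling would follow here.
    }
};
```

Because the check does not depend on the element name, a root element carrying the attribute is treated the same way as a text run carrying it.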
2019-04-13writerfilter: implement RTF derived styles defaultingMichael Stahl
It turns out that the situation fixed in commit 1be0a3fa9ebb22b607c54b47739d4467acfed259 also applies to the definition of the styles themselves. To implement the same style import as Word, the style definitions need to be stored twice: once as read from the file, and another time with attributes defaulted and deduplicated vs. the parent style; the second representation is then sent to the domain mapper. To make this easier, add a bool parameter to cloneAndDeduplicate() to disable the implicit pPr dereferencing that happens when creating the hard formatted paragraph properties (this could potentially be cleaned up further if those paragraph properties would use pPr wrapper themselves). Also implement defaulting of line spacing in getDefaultSPRM(). Reviewed-on: https://gerrit.libreoffice.org/70320 Tested-by: Jenkins Reviewed-by: Miklos Vajna <vmiklos@collabora.com> (cherry picked from commit 3d74ddd190a5087e0a54ef7b14d0a43006745ec3) Change-Id: I4810e917697b3af244e5dbdd7f5a45b4767c93fc Reviewed-on: https://gerrit.libreoffice.org/70508 Tested-by: Jenkins Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com>
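One possible reading of "defaulted and deduplicated vs. the parent style" is sketched below. It is an assumption-laden illustration, not the cloneAndDeduplicate() implementation: properties are modelled as a plain name-to-integer map, and `deduplicateAgainstParent` and the `defaults` table are hypothetical.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, int> Props;

// Builds the second representation of a style from the one read from the
// file: values identical to the parent are dropped (deduplicated), and
// properties that the parent sets but the style omits are reset to their
// default, assuming an omitted property means "default" in the file.
Props deduplicateAgainstParent(const Props& asRead, const Props& parent,
                               const Props& defaults)
{
    Props result;
    for (Props::const_iterator it = asRead.begin(); it != asRead.end(); ++it)
    {
        const Props::const_iterator inParent = parent.find(it->first);
        if (inParent == parent.end() || inParent->second != it->second)
            result.insert(*it);
    }
    for (Props::const_iterator it = parent.begin(); it != parent.end(); ++it)
    {
        if (asRead.count(it->first) != 0)
            continue;
        const Props::const_iterator def = defaults.find(it->first);
        if (def != defaults.end() && def->second != it->second)
            result[it->first] = def->second;
    }
    return result;
}
```

Keeping the as-read map untouched and deriving the second map from it mirrors the "stored twice" idea in the commit message: child styles can still be resolved against what the file actually said.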
2019-03-29tdf#124384 sw DOCX: fix crash during bibliography loadingSerge Krot
Change-Id: Ic0c4b6f7480a4c6c3f53bd04e285cb0cab172531 Reviewed-on: https://gerrit.libreoffice.org/69888 Tested-by: Jenkins Reviewed-by: Thorsten Behrens <Thorsten.Behrens@CIB.de> (cherry picked from commit 8a76b845e0376fd39014d6180c78b863f373633f) Reviewed-on: https://gerrit.libreoffice.org/69933 Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com>
2019-03-29tdf#121456 sw: DOCX: fix loading of empty TOC titleSerge Krot
Change-Id: Ib241edd07e4c6781d80db274f73146bda310d8c0 Reviewed-on: https://gerrit.libreoffice.org/69827 Tested-by: Jenkins Reviewed-by: Thorsten Behrens <Thorsten.Behrens@CIB.de> (cherry picked from commit e47a5543f4b8c9e317d1e43af8c0e5a732e461fd) Reviewed-on: https://gerrit.libreoffice.org/69902