path: root/writerfilter
Age | Commit message | Author
2018-01-17 | tdf#112352 ooxmlimport: ALWAYS treat 1st nextpage w/cols as cont | Justin Luth
fix 5.4 regression from 4605bd46984125a99b0e993b71efa6edb411699f. When there are columns, if a nextpage section doesn't contain any other "page style" details we treat it as a continuous break. If we don't, the column info becomes part of the style itself, and not just a section property. However, the very first section is troublesome - by definition it DOES contain page style details, and so if the document starts with columns, the default style would gain the column attribute. Usually that results in a mess, so let's make sure that we avoid that also in the case where headers/footers are defined. Reviewed-on: https://gerrit.libreoffice.org/44505 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Justin Luth <justin_luth@sil.org> Tested-by: Justin Luth <justin_luth@sil.org> (cherry picked from commit afc96d263959d10e457b54a574f0829d20e99df4) Change-Id: I7e08a9218e4304206579ed064bc92c9604d4470e Reviewed-on: https://gerrit.libreoffice.org/46638 Reviewed-by: Justin Luth <justin_luth@sil.org> Reviewed-by: Jan Holesovsky <kendy@collabora.com> Tested-by: Jan Holesovsky <kendy@collabora.com> (cherry picked from commit 2bdfe1355c4c571e71bd4197d5814c6e15fb8db7)
2017-11-15 | tdf#111964: only trim XML whitespace | Mike Kaganski
OUString::trim() uses rtl_uString_newTrim, which relies upon rtl_ImplIsWhitespace. The latter treats as whitespaces not only characters with values less than or equal to 32, but also Unicode General Punctuation area Space and some Control characters. Thus, using OUString::trim() is incorrect when the goal is to trim XML whitespace, which is defined as one of 0x09, 0x0A, 0x0D, 0x20. A unit test included. Change-Id: I45a132be923a52dcd5a4c35aeecb53d423b49fec Reviewed-on: https://gerrit.libreoffice.org/41444 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/44746 Reviewed-by: Aron Budea <aron.budea@collabora.com> Tested-by: Aron Budea <aron.budea@collabora.com>
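The distinction drawn in this commit message can be illustrated outside LibreOffice. The following is a minimal Python sketch, not the actual patch: `trim_xml_whitespace` is a hypothetical helper name, and Python's argument-less `str.strip()` merely stands in for an over-eager trim that also removes Unicode spaces.

```python
# The four characters XML defines as whitespace: 0x09, 0x0A, 0x0D, 0x20.
XML_WHITESPACE = "\t\n\r "


def trim_xml_whitespace(text: str) -> str:
    """Strip only XML whitespace from both ends of the string (hypothetical helper)."""
    return text.strip(XML_WHITESPACE)


# U+2003 EM SPACE is in the General Punctuation block; it is not XML whitespace.
sample = "\u2003value\u2003"
xml_trimmed = trim_xml_whitespace(sample)   # EM SPACE is kept
over_trimmed = sample.strip()               # EM SPACE is removed
```

The point of the sketch is only that the set of characters to strip has to be stated explicitly when the goal is XML semantics.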
2017-11-08 | tdf#105688: findZOrder: catch exceptions from getPropertyValue | Mike Kaganski
For some reason, sometimes items in GraphicZOrderHelper don't have ZOrder property value, and so throw in getPropertyValue. E.g., SwXFrame::getPropertyValue throws uno::RuntimeException when its GetFrameFormat() returns nullptr. The patch catches these to allow to proceed with fallback z-order. Change-Id: I96140195f45364bccee7c5547d373158e2b49154 Reviewed-on: https://gerrit.libreoffice.org/37392 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit f66b76a4d20719e4c13bd755c49f8140a0e72816) Reviewed-on: https://gerrit.libreoffice.org/44463 Reviewed-by: Aron Budea <aron.budea@collabora.com> Tested-by: Aron Budea <aron.budea@collabora.com>
2017-09-19 | Word 2013 and 2016 does not honor the <w:view> setting, let's ignore it too. | Jan Holesovsky
In other words, let's open documents in the non-web view even when saved with <w:view w:val="web"/>. The behavior I see in Word 2013 (and it's documented that this happens in 2016 too) is that the setting is not a document setting any more, but a user setting. I.e., regardless of what is written in the file, the .docx document opens in Print Layout if Word was in Print Layout until now, and in Web Layout if it was in that mode. We handle the non-web layout much better than the web layout, so let's just default to the normal layout on load. Change-Id: Ieba7ddc280b9b79501a6b89ff21b03a86356583c Reviewed-on: https://gerrit.libreoffice.org/42414 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Jan Holesovsky <kendy@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/42412 Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Tested-by: Miklos Vajna <vmiklos@collabora.co.uk>
2017-09-12 | tdf#112304 Revert "Watermark: not visible if page background was set" | Szymon Kłos
This reverts commit 39c08074a286855dd014ce1c30b8f7ef95b10242. Fixed by: I69517efb7d82acd719d6a27a09ba61554dbf1ec9 Change-Id: Icd45b3f55292670ff7338a367eba212453a0687e Reviewed-on: https://gerrit.libreoffice.org/42155 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Szymon Kłos <szymon.klos@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/42165 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2017-09-12 | tdf#112346: take Word no-wrap limit into account also for ww8 | Mike Kaganski
This also makes ww8 floating-table conversion decision heuristics somewhat closer to OOXML code. Change-Id: I29ca2ebabd1758ad98e02aaf560cf2f44daec3a8 Reviewed-on: https://gerrit.libreoffice.org/42196 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Reviewed-on: https://gerrit.libreoffice.org/42216 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-08-25 | Watermark: not visible if page background was set | Szymon Kłos
Watermark was drawn under the page background. It has to be placed on the upper layer to be visible. Change-Id: I132a313eed6fb712aafdca14a38fe559aa4231c8 Reviewed-on: https://gerrit.libreoffice.org/41289 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Szymon Kłos <szymon.klos@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/41557 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2017-08-25 | tdf#109184 auto cell color should be transparent | Szymon Kłos
Don't add color to the property map if it is set to auto. In this case white color was assumed and tables were white instead of transparent. Reviewed-on: https://gerrit.libreoffice.org/41255 Reviewed-by: Szymon Kłos <szymon.klos@collabora.com> Tested-by: Szymon Kłos <szymon.klos@collabora.com> (cherry picked from commit d239bf6d79e93f650a4241fcd2da0cb77c9cb95b) Change-Id: I7f203b8f3831b86ba8de33dc57de227b3029c6d9 Reviewed-on: https://gerrit.libreoffice.org/41451 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
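The idea of leaving the property unset can be sketched as follows. This is an illustration in Python rather than the importer's C++; the function name `apply_cell_fill` and the `"BackColor"` key are assumptions made for the example, not names taken from the patch.

```python
from typing import Dict, Optional


def apply_cell_fill(props: Dict[str, int], fill: Optional[str]) -> None:
    """Record a cell background only when an explicit RGB hex value is given.

    `props` stands in for a property map; "BackColor" is a hypothetical key.
    """
    if fill is None or fill.lower() == "auto":
        # Leave the property unset so the cell stays transparent
        # instead of being painted with an assumed white.
        return
    props["BackColor"] = int(fill, 16)


auto_cell: Dict[str, int] = {}
apply_cell_fill(auto_cell, "auto")      # nothing is added

red_cell: Dict[str, int] = {}
apply_cell_fill(red_cell, "FF0000")     # explicit color is stored
```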
2017-08-25 | tdf#50097: DOCX: export form controls as MSO ActiveX controls | Tamás Zolnai
* Use the same structure for export what MSO uses ** Position and size information are exported as VML shape properties ** Different handling of inline and floating controls (pict or object) ** Do some changes on VML shape export to match how MSO exports these controls ** Write out activeX.xml and activeX.bin to store control properties ** Use persistStorage storage type defined in activeX.xml * Drop grabbaging of activex.XML and activeX.bin * Cleanup control related test code Signed-off-by: Tamás Zolnai <tamas.zolnai@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/41256 Tested-by: Jenkins <ci@libreoffice.org> (cherry picked from commit c0cc02e2934aeb12dda44818955e5964496c186a) Change-Id: I38bb2b2ffd2676c5459b61ec2549c31348bab41c This test intended to be an export test Change-Id: Ib233bd603185efdb85ed30f3d00c28512d57a0ac Reviewed-on: https://gerrit.libreoffice.org/41355 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit a7e8c5304b740cb4e03e25b7217ce6071c29c09b) Fix two issues in ActiveX DOCX import / export code * Inline anchored VML shape had wrong vertical position ** In MSO inline shapes are positioned to the top of the baseline * During export all shape ids were the same (shape_0) ** VML shapes used to be exported only as fallback, I guess that's why it did not cause any issue before. ** Override the shapeid generator with a new one, which actually generates unique shapeids. Reviewed-on: https://gerrit.libreoffice.org/41319 Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> Tested-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 2d1fe7fb67ec1ff1b96912c0945d17d54aecb12e) Change-Id: I752f39d092d0b61d91824141655dae662dbeafbc DOCX: Fix an other test case of ActiveX control export When LO control is anchored to the end of the run, it is exported into a new run. 
Reviewed-on: https://gerrit.libreoffice.org/41472 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit b129421764ae78a1422812169fce8eb4914a6b22) Change-Id: I9269fd1b34924780aad61c452d1e2094dc8e4aad Reviewed-on: https://gerrit.libreoffice.org/41484 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2017-08-25 | tdf#91384: DOCX: import ActiveX controls | Tamás Zolnai
Reviewed-on: https://gerrit.libreoffice.org/40930 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 4a764319cbad4e2589cc105145ac27defbf49ff6) Change-Id: Iebf2ff65fcec3231acfc962fb2f1abc2ed2dc67a Avoid warning in OleHandler Related to ActiveX controls. Change-Id: Ief7ee67ca8e4f086a1d5e0400d0eaf3ebc8cdaaf Reviewed-on: https://gerrit.libreoffice.org/40934 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit 368b583b992f2e9cad46c2362c9529a07c36d7a9) Reviewed-on: https://gerrit.libreoffice.org/41483 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2017-08-10 | tdf#111550: A workaround for out-of-order (in-paragraph) tbl on OOXML | Mike Kaganski
Word allows <w:tbl> to be a direct child of <w:p>, which is illegal according to ECMA-376-1:2016. This allows importing the data in such tables (previously, this text was simply dropped, causing data loss) - bug-to-bug compatibility with Word. Change-Id: I19c17ab19915ea46685727c635476fe5df593212 Reviewed-on: https://gerrit.libreoffice.org/40909 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit 67a61e54531801645d51ad89aac30064b8c4b4e8) Reviewed-on: https://gerrit.libreoffice.org/40949 Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-08-10 | Revert "A temporary workaround for out-of-order (in-paragraph) tbl on OOXML" | Mike Kaganski
A better fix follows. This reverts commit 0eb0c7308ad57f4a20b5691d450b5185e52475f6. Change-Id: If36f73c580d96445086d8ab3d87fff6a76cd8b6a Reviewed-on: https://gerrit.libreoffice.org/40948 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-08-04 | tdf#108944 writerfilter: fix missing footnote separator | Justin Luth
Fix regression from e79ef12b7a904f17d4147fa409d055c12b70f952 tdf#107033 DOCX import: fix unexpected missing footnote separator. Initially related to tdf#68787. If HandleMarginsHeaderFooter was called twice, then it automatically would have disabled the separator. Clearing the HasFtn/HasFtnSep flags also shouldn't be run when in the footnote sections. Reviewed-on: https://gerrit.libreoffice.org/40551 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Justin Luth <justin_luth@sil.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> (cherry picked from commit 6f57c09aadd40009173f8ae3654004dd0cad9fb8) Change-Id: I00cbd1cbc8dc86edf426f852c59c3f943e373b13 Reviewed-on: https://gerrit.libreoffice.org/40590 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com> (cherry picked from commit 234df2fb5901588ccf20cb35cb4c5922aeb89817)
2017-07-28 | tdf#109524: use 100% table width when there's no explicit width available | Mike Kaganski
According to ECMA-376-1:2016 17.4.63, 17.18.87, etc, all table widths are considered preferred, and actual table layout should be determined using an algorithm described in 17.18.87. When w:tblLayout element is omitted, or there is no explicit width information given, it is assumed that AutoFit Table Layout should be used, i.e. using cells content to determine final widths of table grid. In the description of the AutoFit Table Layout algorithm, it is stated that the table width grows to hold data, but no more than page width. As a first approach, this commit just sets table width to 100% when there's no width data available. TODO is to implement the AutoFit Table Layout algorithm properly. Change-Id: I000c548eb152c70d2c6e053f4d2b1d16e8976c27 Reviewed-on: https://gerrit.libreoffice.org/40500 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit cae5dd9363b68dbabbeb2069f4aee7d057f6b5a8) Reviewed-on: https://gerrit.libreoffice.org/40508 Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-07-20 | tdf#108849: allow out-of-order sectPr | Mike Kaganski
According to ISO/IEC 29500-1:2016(E) (17.6.17), the final <w:sectPr> must be the last child element of the body element. Also, this is enforced in the schema for the CT_Body complex type (Annex A. (normative) Schemas – W3C XML Schema, A.1 WordprocessingML, page 3866), where sectPr is a part of <xsd:sequence>, and thus *must* stay at a specific place in the sequence, namely being the last element, and be at most one instance. However, real-life documents (generated by some third-party software) have sectPr before other body contents. Unfortunately, MS Word seems to allow this standards-violating content, and thus encourages creation of non-standard documents by third-party generators. This patch doesn't assume that the current final (body-level) sectPr is the last body element, and does not mark the current paragraph as the last section's paragraph. Thus, the current section (possibly started after a previous paragraph-level sectPr) is continued after the final sectPr is closed. Change-Id: I8e88288bc6659d77d17986514b3b4fe16a5b45d9 Reviewed-on: https://gerrit.libreoffice.org/40161 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit 4b4cd502806cfc9c9cc9754b8aae18a2c2632cdc) Reviewed-on: https://gerrit.libreoffice.org/40216 Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
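To see what "out-of-order" means in practice, a document body can be checked for the position of its body-level sectPr. The sketch below uses Python's standard ElementTree; `final_sectpr_is_last` is a hypothetical helper written for this illustration and has nothing to do with how writerfilter itself parses the stream.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def final_sectpr_is_last(document_xml: str) -> bool:
    """Return True if the body has no direct sectPr child, or exactly one that is last."""
    root = ET.fromstring(document_xml)
    body = root.find(f"{{{W}}}body")
    if body is None:
        return True
    children = list(body)
    positions = [i for i, child in enumerate(children)
                 if child.tag == f"{{{W}}}sectPr"]
    return not positions or positions == [len(children) - 1]


conforming = (f'<w:document xmlns:w="{W}"><w:body>'
              '<w:p/><w:sectPr/></w:body></w:document>')
out_of_order = (f'<w:document xmlns:w="{W}"><w:body>'
                '<w:sectPr/><w:p/></w:body></w:document>')
```

Only direct children of the body are inspected, so paragraph-level sectPr elements (inside w:pPr) do not affect the result.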
2017-07-18 | A temporary workaround for out-of-order (in-paragraph) tbl on OOXML [cp-5.3-20] | Mike Kaganski
This allows importing the data in such tables (previously, this text was simply dropped, causing data loss). Layout problems are not fixed yet. Change-Id: Id7422adfe0998d1e2adcd4bf0b0e0a1dd7ed37bf Reviewed-on: https://gerrit.libreoffice.org/40105 Reviewed-by: Aron Budea <aron.budea@collabora.com> Tested-by: Aron Budea <aron.budea@collabora.com>
2017-07-14 | tdf#109053: DOCX: Multipage table is not imported properly | Tamás Zolnai
Another use case where converting to a "floating table" is not a good idea. In this case we can check whether anything fits in the text area next to the table. If not, then we can avoid floating table conversion. Reviewed-on: https://gerrit.libreoffice.org/39811 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> (cherry picked from commit fc55711f01af172eb3a034454405fa941454c781) Change-Id: I798a2f4c7a9dfe6aecbe4a73e3162b49ea5f0adc Reviewed-on: https://gerrit.libreoffice.org/39930 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2017-07-12 | Fix tdf#106029 - Add setting XML_doNotExpandShiftReturn when exporting to docx | nikki
Change-Id: Ie8ffb0f2d5444c6ead13bdc894715c5a2e6d0baa Reviewed-on: https://gerrit.libreoffice.org/36485 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> (cherry picked from commit 9ad9c5183f348384b62ec88459a3a5922e423d83) Reviewed-on: https://gerrit.libreoffice.org/39749 (cherry picked from commit a59cf3ecab2f327801c2b580d20df9e8b643cc6c)
2017-07-12 | tdf#109063 DOCX import: consider wrap space for multi-page floattables | Miklos Vajna
Follow-up to commit 78d1f1c2835b9fae0f91ed771fc1d594c7817502 (fdo#68607 bnc#816593 DomainMapperTableHandler: don't always start a frame, 2013-09-03), turns out in case there is little space between the table and the edge of the body area, then there is no wrapping performed in Word, so we should not convert to floating table, either. The limit seems to be 266 twips (mm100 unit is used in the code), and this seems to be constant: it does not change if both the table and the page width is changed, nor does it change when the empty paragraph to be wrapped has a different paragraph mark size. For the majority of the documents this means no change as usually there is either no space available for wrapping or there is a lot more available. (cherry picked from commit 25445d24cfa87522ee4c47e4aa7e6e816cdc9a36) Conflicts: writerfilter/source/dmapper/PropertyMap.cxx Change-Id: Ibbf7409065ba958854514f23b360be56677c8fe3 Reviewed-on: https://gerrit.libreoffice.org/39828 Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> Tested-by: Tamás Zolnai <tamas.zolnai@collabora.com>
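The 266-twip limit and the twips-to-mm100 conversion mentioned above can be worked through numerically. The sketch below is a rough Python illustration under stated assumptions: the function names are invented, and the strict "greater than" comparison and the way free space is measured (body width minus table width) are guesses for the example, not details confirmed by the commit message.

```python
NO_WRAP_LIMIT_TWIPS = 266  # limit quoted in the commit message


def twips_to_mm100(twips: float) -> float:
    """1 twip = 1/1440 inch and 1 mm100 = 1/2540 inch, so multiply by 2540/1440 = 127/72."""
    return twips * 127 / 72


def has_room_to_wrap(body_width_mm100: float, table_width_mm100: float) -> bool:
    """Assumed check: is the space beside the table wider than the no-wrap limit?"""
    free_space = body_width_mm100 - table_width_mm100
    return free_space > twips_to_mm100(NO_WRAP_LIMIT_TWIPS)


limit_mm100 = twips_to_mm100(NO_WRAP_LIMIT_TWIPS)  # a little over 469 mm100, i.e. about 4.7 mm
```

With these assumptions, a table that leaves only 2 mm beside it would not be treated as wrappable, while one that leaves 50 mm would.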
2017-07-11 | tdf#108545 show an icon (DOCX inside DOCX) | Szymon Kłos
If DrawAspect is equal to "Icon", show an icon, not a document preview. The document is opened in a separate window, not in-place. Change-Id: I3a8d81e7340b29d247f8ac440c06b0420bb65644 Reviewed-on: https://gerrit.libreoffice.org/39440 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Szymon Kłos <szymon.klos@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/39716 Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Tested-by: Miklos Vajna <vmiklos@collabora.co.uk>
2017-07-11 | tdf#108544 edit in window (XLSX inside DOCX) | Szymon Kłos
Change-Id: If1dd46643dc2ae9cc74ba94038609ae3445a416c Reviewed-on: https://gerrit.libreoffice.org/39706 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Szymon Kłos <szymon.klos@collabora.com> (cherry picked from commit 505ce3a2ba3adeef46daecbf9b14c42cea211408) Reviewed-on: https://gerrit.libreoffice.org/39715 Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Tested-by: Miklos Vajna <vmiklos@collabora.co.uk>
2017-07-07 | tdf#108995: take xml:space attribute into account | Mike Kaganski
See paragraph 2.10 of XML 1.0 specification and 17.3.3.31 of ECMA-376-1:2016 Change-Id: I7f19d3b9cf2ccce88a5fa03022beeb99facc04fe Reviewed-on: https://gerrit.libreoffice.org/39682 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit 7c1a51516aaf2767e43b393259a1ad21570df5fb) Reviewed-on: https://gerrit.libreoffice.org/39688 Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
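As a small illustration of what honouring xml:space involves, the Python sketch below reads a w:t element and keeps its text untouched when xml:space="preserve" is present. The helper name `run_text` is hypothetical, and trimming XML whitespace when the attribute is absent is an assumption made for the example (it mirrors the neighbouring "only trim XML whitespace" commit, not a quoted rule).

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# ElementTree expands the predeclared xml: prefix to this namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
XML_WHITESPACE = "\t\n\r "


def run_text(t_element: ET.Element) -> str:
    """Return the text of a w:t element, honouring xml:space="preserve"."""
    text = t_element.text or ""
    if t_element.get(XML_SPACE) == "preserve":
        return text
    # Assumed default handling: drop leading/trailing XML whitespace only.
    return text.strip(XML_WHITESPACE)


preserved = ET.fromstring(f'<w:t xmlns:w="{W}" xml:space="preserve"> a </w:t>')
default = ET.fromstring(f'<w:t xmlns:w="{W}"> a </w:t>')
```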
2017-07-07 | tdf#108714: Also support paragraph-level (line) breaks | Mike Kaganski
Change-Id: Ida55015363cac3ae29b82a60a9b9a5f1b39086a2 Reviewed-on: https://gerrit.libreoffice.org/39675 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit f95f0ce163743706a3670c6e33593023c22af2ff) Reviewed-on: https://gerrit.libreoffice.org/39677 Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-06-28 | tdf#108714 follow-up: handle deferred break in character group | Mike Kaganski
If an out-of-order break happens immediately after a table, then in following paragraph group (before character group start) the table level is > 0, and break is ignored. Since out-of-order break only happens at top level, the following character group necessarily designates a new paragraph group, so it's OK to handle that at the character group level, where table level is already updated. Change-Id: Ic1b1bb89e12407b050c2e880ad971794311845a5 Reviewed-on: https://gerrit.libreoffice.org/39347 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit 553204015f954d20db65e6adcda68b823a8ef235) Reviewed-on: https://gerrit.libreoffice.org/39352 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2017-06-27 | tdf#108806: convert CRLF into space in OOXML text | Mike Kaganski
Change-Id: I8e2e108a705ecdb55c096a589d83d51c48b0b83c Reviewed-on: https://gerrit.libreoffice.org/39286 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Reviewed-on: https://gerrit.libreoffice.org/39322 Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-06-27 | tdf#108714: allow <w:br> as direct child of <w:body> | Mike Kaganski
LibreOffice doesn't accept <w:br> element as a child of <w:body>. ECMA-376-1:2016 17.3.3.1 describes br as element of a run content, and points to CT_Br in §A.1. CT_Br may appear only as part of EG_RunInnerContent. In turn, EG_RunInnerContent may appear only inside CT_R. So, using <w:br> outside of <w:r> produces ill-formed OOXML. Open XML SDK 2.5 Productivity Tool for Microsoft Office confirms that, showing OpenXmlUnknownElement error. However, Word accepts it as direct child of <w:body>. It behaves as if the <w:br> were used as first element in first run of the following <w:p> (thus creating page break after next paragraph). Another Word bug that provokes third-parties to create ill-formed documents, and requires LibreOffice to be bug-to-bug compatible. This commit makes the following changes: 1. Registers a dedicated complex type CT_Br_OutOfOrder to handle those unusual breaks, with corresponding handler function. 2. In the handler function, saves the gathered property set to parser state to use later in next paragraph group handler. This reproduces Word behaviour. Change-Id: I5df6927e2de9266b58f87807319ad1c4977e45a7 Reviewed-on: https://gerrit.libreoffice.org/39168 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit a4a1467bc47b81ad68ecad0d5e2e163670582919) Reviewed-on: https://gerrit.libreoffice.org/39303 Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
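The "save now, apply in the next paragraph group" approach described in the two steps above can be sketched as a tiny state machine. This is a Python illustration only; the tuple-based input format and the function name `attach_out_of_order_breaks` are invented for the example and do not reflect the real parser state or handler names.

```python
from typing import Iterable, List, Optional, Tuple


def attach_out_of_order_breaks(
    body_items: Iterable[Tuple[str, str]],
) -> List[Tuple[str, Optional[str]]]:
    """Attach each body-level break to the paragraph that follows it.

    `body_items` is a simplified stream of ("br", break_type) and ("p", text)
    entries standing in for direct children of the body.
    """
    deferred: Optional[str] = None          # break saved in the "parser state"
    paragraphs: List[Tuple[str, Optional[str]]] = []
    for kind, value in body_items:
        if kind == "br":
            deferred = value                # remember it, emit nothing yet
        elif kind == "p":
            paragraphs.append((value, deferred))
            deferred = None                 # consumed by this paragraph
    return paragraphs


result = attach_out_of_order_breaks(
    [("p", "first"), ("br", "page"), ("p", "second"), ("p", "third")]
)
```

In this toy model a break that is never followed by a paragraph is simply dropped; the commit message does not say what the real importer does in that case.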
2017-06-23 | Related: tdf#108269 DOCM filter: preserve VBA stream | Miklos Vajna
This is a combination of 3 commits (initial support, then two refactor commits to not duplicate code.) 1st commit: This means 2 new streams when roundtripping DOCM files that actually have macros: word/vbaProject.bin and word/vbaData.xml (+ the relation pointing to the second from the first). Reviewed-on: https://gerrit.libreoffice.org/38360 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> (cherry picked from commit 8a59b30bb1af55f7afd8b98e4b60234f98d84c76) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport9.cxx Change-Id: Iba24eea4c5bca8f743a53027c71ed2aae48f1934 2nd commit: Related: tdf#108269 DOCM filter: reuse oox code for VBA preservation With this, the project stream import is shared between DOCM and XLSM. Change-Id: I8fbffefc5acf28adea4875fa6bc4148a99b5ebef Reviewed-on: https://gerrit.libreoffice.org/38495 Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Tested-by: Jenkins <ci@libreoffice.org> (cherry picked from commit e4adb8d9e77bab353dda26375e11a6b7a456368f) 3rd commit: Related: tdf#108269 DOCM filter: reuse oox code for VBA data preservation Which means the DOCM-specific code to roundtrip VBA things (project, data) can be removed. The oox part has to be extended a bit, as at least for this DOCM bugdoc there is an XML relation of the binary data, while existing shared code assumed the full VBA project is just a single OLE blob. Reviewed-on: https://gerrit.libreoffice.org/38504 Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Tested-by: Jenkins <ci@libreoffice.org> (cherry picked from commit 0129c2cd9dd95355412b194c595f4b986403ba1e) Conflicts: writerfilter/inc/ooxml/OOXMLDocument.hxx writerfilter/source/ooxml/OOXMLDocumentImpl.hxx Change-Id: I4085e4dba24475e6fd555e5f34fe7ad0f305c57d Reviewed-on: https://gerrit.libreoffice.org/38558 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
2017-06-23 | tdf#108682 DOCX import: fix <w:spacing w:line=...> for negative values | Miklos Vajna
I didn't find UI in Word to create <w:spacing w:line="-260" w:lineRule="auto"/>; the equivalent markup when you set line spacing to exactly 13pt for new documents is: <w:spacing w:line="260" w:lineRule="exact"/> The OOXML spec and Microsoft's implementer notes ([MS-OI29500]) are also pretty silent about what a negative value means. However, if this markup is converted to WW8 by Word, then the WW8 LSPD structure is like this (as presented by doc-dumper): <lspd type="LSPD" offset="5086"> <dyaLine value="0xfefc"/> <fMultLinespace value="0x1"/> </lspd> For the 0xfefc value the [MS-DOC] spec clearly states that it means the type of the spacing is "exactly", with the value of 0x10000-0xfefc, i.e. the same 260 twips. Change-Id: I84b485d02dea49c610b6df2e06ccce03e1d29d21 Reviewed-on: https://gerrit.libreoffice.org/39091 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> (cherry picked from commit f575f70b8303ba187f6989920281ff02e7a431c9) Reviewed-on: https://gerrit.libreoffice.org/39162 Reviewed-by: Andras Timar <andras.timar@collabora.com> Tested-by: Andras Timar <andras.timar@collabora.com>
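The arithmetic in this message is easy to verify. The Python sketch below reinterprets the 16-bit dyaLine value as a signed number and applies the "negative means exact" reading to the DOCX attribute; both function names are hypothetical, and the handling of non-negative values (passed through unchanged) is an assumption for the example.

```python
from typing import Tuple


def signed_16(raw: int) -> int:
    """Reinterpret an unsigned 16-bit value as a signed one."""
    return raw - 0x10000 if raw & 0x8000 else raw


def interpret_line_spacing(line: int, line_rule: str) -> Tuple[int, str]:
    """Treat a negative w:line as an exact spacing of the absolute value (in twips)."""
    if line < 0:
        return -line, "exact"
    return line, line_rule


dya_line = signed_16(0xFEFC)                         # the WW8 value from the dump
docx_case = interpret_line_spacing(-260, "auto")     # the markup from the bug document
```

Both routes arrive at the same result: 260 twips, i.e. 13 pt, with an exact line rule.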
2017-06-22 | Watermark: auto size in the RTF | Szymon Kłos
When Watermark size is set to Auto in the MSO, the saved value is equal 1pt. Before this patch in this case Watermark was invisible due to small size. Change-Id: Ia2028a6547cf98dd31031305bcc5375625b83fe0 Reviewed-on: https://gerrit.libreoffice.org/38883 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
2017-06-15 | Watermark: RTF font import and export | Szymon Kłos
* font size * font family * rotation * TextPath geometry - working transparency & color * revert TextBox export removed by mistake Change-Id: I3f6df86809ae57dc40c275652a96b19d2a3d7eba Reviewed-on: https://gerrit.libreoffice.org/38494 Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Tested-by: Jenkins <ci@libreoffice.org> (cherry picked from commit dd0df1c8a213ab6f0959145396bc273bf885af39) Signed-off-by: Andras Timar <andras.timar@collabora.com>
2017-06-10 | Watermark: RTF import / export | Szymon Kłos
* "wzName" should contain shape name * MS Word watermark has text inside the "gtextUNICODE" (do not create additional shptxt) Change-Id: I7929ec83a9219d6087d36ccbf6d7e735acf63722 Reviewed-on: https://gerrit.libreoffice.org/38219 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
2017-06-10 | Avoid UBSan warning about negative double -> sal_uInt32 conversion | Stephan Bergmann
Since ea890b1d4bcd6dd59db9f52dce1609c020804e24 "tdf#108408: support unit specifications for ST_HpsMeasure", the OOXMLUniversalMeasureValue ctor is converting textual data to mnValue via intermediary double instead of sal_Int32, so textual data representing negative values now triggers UBSan warnings (e.g., "writerfilter/source/ooxml/OOXMLPropertySet.cxx:630:43: runtime error: -70 is outside the range of representable values of type 'unsigned int'" during CppunitTest_chart2_export; it appears that, while HpsMeasure may be documented to only cover positive values, TwipsMeasure may be negative). But OOXMLUniversalMeasureValue::mnValue is apparently only used in OOXMLUniversalMeasureValue::getInt, to return an int value, so just change its type. Change-Id: I44eabb78f09100c05cc9d1e79a739648f34ea743 (cherry picked from commit 600ec501bafc691d37078a0ed5b4ca8bf32340f1) Reviewed-on: https://gerrit.libreoffice.org/38632 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-06-09 | tdf#108408: support unit specifications for ST_HpsMeasure | Mike Kaganski
w:ST_HpsMeasure is defined in ECMA-376 5th ed. Part 1, 17.18.42 as This simple type specifies that its contents contain either: * A positive whole number, whose contents consist of a measurement in half-points (equivalent to 1/144th of an inch), or * A positive decimal number immediately followed by a unit identifier. ... This simple type is a union of the following types: * The ST_PositiveUniversalMeasure simple type (§22.9.2.12). * The ST_UnsignedDecimalNumber simple type (§22.9.2.16). This patch generalizes OOXMLUniversalMeasureValue to handle standard- defined units, and introduces two typedefed specifications: OOXMLTwipsMeasureValue (which is used where UniversalMeasure was previously used), and new OOXMLHpsMeasureValue. Unit test included. Reviewed-on: https://gerrit.libreoffice.org/38562 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit ea890b1d4bcd6dd59db9f52dce1609c020804e24) Change-Id: Iccc6d46f717cb618381baf89dfd3e4bbb844b4af Reviewed-on: https://gerrit.libreoffice.org/38591 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
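The union described in the quoted definition (a bare whole number in half-points, or a positive decimal followed by a unit) can be sketched as a small parser. This Python example is an illustration, not the OOXMLUniversalMeasureValue code: the function name is hypothetical, the unit table (mm, cm, in, pt, pc, pi, with both pc and pi taken as 12 pt) reflects my understanding of the universal-measure units, and rounding to the nearest half-point is an assumption.

```python
import re

# Points per unit; assumed unit set for universal measures.
POINTS_PER_UNIT = {
    "mm": 72 / 25.4,
    "cm": 72 / 2.54,
    "in": 72.0,
    "pt": 1.0,
    "pc": 12.0,
    "pi": 12.0,
}

_WHOLE = re.compile(r"[0-9]+")
_WITH_UNIT = re.compile(r"([0-9]+(?:\.[0-9]+)?)(mm|cm|in|pt|pc|pi)")


def parse_hps_measure(value: str) -> int:
    """Return the measurement in half-points (1/144 inch)."""
    if _WHOLE.fullmatch(value):
        return int(value)                 # already in half-points
    match = _WITH_UNIT.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid half-point measure: {value!r}")
    points = float(match.group(1)) * POINTS_PER_UNIT[match.group(2)]
    return round(points * 2)              # assumed rounding to nearest half-point


plain = parse_hps_measure("24")       # 24 half-points = 12 pt
with_unit = parse_hps_measure("12pt") # the same size written with a unit
```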
2017-06-09 | tdf#104450: Use Calibri; let LO to fallback to Carlito | Mike Kaganski
Using Calibri will allow to keep originally intended font on round-trip. If Calibri is absent on a system, LO will fallback to Carlito for rendering, but keep original font intact. Reviewed-on: https://gerrit.libreoffice.org/38456 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit dd1ba90f6069b41e3f2c301809afefc6f63da710) Change-Id: I8f29bed29bc7f48912b2637053ff128ea904c7a1 Reviewed-on: https://gerrit.libreoffice.org/38590 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-06-09 | tdf#108350: Use Carlito for DOCX import by default | Mike Kaganski
In OOXML (i.e. Word since 2007), the default document font is Calibri 11 pt. If a document doesn't contain font information, we should assume our metric-compatible equivalent Carlito to provide best layout match. A unit test included. An existing unit test (testN766487) was corrected to match the font size that Word uses (11; was 12 which doesn't match Word's size). Reviewed-on: https://gerrit.libreoffice.org/38421 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Jenkins <ci@libreoffice.org> (cherry picked from commit 5471a5585cba925bb0dcb2dc41e03ad563998166) Change-Id: I3040f235696282dc7a124cd83fb34a6d95a29a17 Reviewed-on: https://gerrit.libreoffice.org/38589 Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-05-30 | tdf#106953 RTF import: fix missing paragraph left margin | Miklos Vajna
See commit 3915bf2dc877d5f1140798e24933db0f21386a4a (tdf#95376 DOCX import: fix incorrectly indented tab stops, 2016-01-26) for the various sources that can determine the paragraph indentation. In this case the problem was that too aggressive RTF style deduplication removed a direct indent, which then meant a fallback to the ind-from-num value, not to the ind-from-parastyle one. (cherry picked from commit f528f9499bd91b700c549575e88fa102cfffede9) Change-Id: I3b47b2bbeaaedf405baef24505d23dc49bd01865 Reviewed-on: https://gerrit.libreoffice.org/37670 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com> (cherry picked from commit 0022ae02cfea1c5d69d9f4fedeeeb7a30cc4184b)
2017-05-17 | tdf#107889 DOCX import: consider page breaks for multi-page floattables | Miklos Vajna
This is the DOCX equivalent of commit 6aba29576df7a2a40e54040d4dd09d94d6594741 (tdf#107773 DOC import: consider page breaks for multi-page floattables, 2017-05-11): a specific case where it's clearly superior to import a multi-page floating table as a multi-page one, rather than a floating one. Reviewed-on: https://gerrit.libreoffice.org/37683 Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Tested-by: Jenkins <ci@libreoffice.org> (cherry picked from commit 659c0227a50d298780d72902314e03df8824bc06) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport9.cxx writerfilter/source/dmapper/PropertyMap.cxx writerfilter/source/dmapper/PropertyMap.hxx Change-Id: I71a92d2b10e52e505665831caacad2948d22b4e1
2017-05-17 | writerfilter: default break type identified as _nextPage | Justin Luth
Change-Id: I9247c75819425a97d19c95c48fbaf7a4f8d92c62 Reviewed-on: https://gerrit.libreoffice.org/35379 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Justin Luth <justin_luth@sil.org> (cherry picked from commit 541b377a94fb1247dbf4c39b5bcf55deb8e5ef60)
2017-05-17tdf#103931 writerfilter breaktype: same for implicit and explicitJustin Luth
MSWord normally does NOT specify "nextPage" for the sectionBreak, since that is the default type. That is imported as BreakType == -1. However, Writer ALWAYS exports the section type name, which of course is imported explicitly. **There is an import hack that treats the very first -1 section as continuous IF there are columns**. Since Writer explicitly defines the section type, these documents import differently. When Writer round-trips these types of files, they get totally messed up in Writer, although they look fine in Word. So, treat both implicit and explicit nextPage identically for bTreatAsContinuous during import. Another unit test demonstrated that headers/footers are lost when a section is treated as continuous, so that situation is now prevented as well. This fix allows several import-only unit tests to round-trip. Reviewed-on: https://gerrit.libreoffice.org/35013 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Justin Luth <justin_luth@sil.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> (cherry picked from commit 4605bd46984125a99b0e993b71efa6edb411699f) Conflicts: sw/qa/extras/ooxmlexport/ooxmlexport9.cxx Change-Id: I37fa861d82e8da564d28d8e9089fe0f2777650fb
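A minimal sketch of the decision this commit message describes, under stated assumptions: the `SectionInfo` struct, its field names, and the `BreakType` enum are hypothetical, and only the -1 value for an unspecified break type is taken from the message. It reflects the rule as of this commit and is not the actual dmapper code.

```cpp
#include <cassert>

// Hypothetical enum; -1 mirrors the "BreakType == -1" import of a
// section break whose type Word did not write out.
enum class BreakType
{
    Unspecified = -1,
    Continuous,
    NextPage
};

// Hypothetical summary of the properties the rule looks at.
struct SectionInfo
{
    BreakType breakType;
    bool isFirstSection;
    bool hasColumns;
    bool hasHeaderOrFooter;
};

// Implicit (-1) and explicit nextPage are handled identically. The
// first section with columns is treated as continuous, unless that
// would lose its headers/footers.
bool treatAsContinuous(const SectionInfo& section)
{
    const bool isNextPage = section.breakType == BreakType::Unspecified
                            || section.breakType == BreakType::NextPage;
    return section.isFirstSection && section.hasColumns && isNextPage
           && !section.hasHeaderOrFooter;
}
```

With this shape, a file written by Word (implicit type) and the same file re-exported by Writer (explicit nextPage) take the same branch on import.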
2017-05-10tdf#104407 writerfilter: fix crash with null xRangePropertiesMichael Stahl
The m_xStartingRange is null at this point for whatever reason, and the block immediately above this one already checks xRangeProperties, so let's just do the same here. (Also IsNewDoc(), where the logic between PageDescName and PageNumberOffset presumably shouldn't differ?). (started to crash with abaf6bde4ee91c628bd55a7ec2e876a5d0ecff6e as previously that code was unreachable in RTF import) Change-Id: I20539c3a753ecea357e556ea556c3c26983ce1d1 (cherry picked from commit e4da2e5dfa9e462e0d9c23a1a60caf4b3ef2dc56) Reviewed-on: https://gerrit.libreoffice.org/37305 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com> (cherry picked from commit 8521f4c8fb08aa37912f73a73ba1a34c2ccc97ed)
2017-05-10AutoText: add only real AutoText entriesSzymon Kłos
* add only autoTxT gallery type
* new test with other types of entries
Change-Id: Ibf7751c73dcf3b6ebd69eec5f4931dbeaaf098c8 Reviewed-on: https://gerrit.libreoffice.org/37425 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Szymon Kłos <szymon.klos@collabora.com> Tested-by: Szymon Kłos <szymon.klos@collabora.com> (cherry picked from commit a470d16208a78ae6893d199b3b6bc77a8559b06a) Reviewed-on: https://gerrit.libreoffice.org/37460
2017-05-07tdf#107033 DOCX import: fix unexpected missing footnote separatorMiklos Vajna
Regression from commit 330b860205c7ba69dd6603f65324d0f89ad9cd5f (fdo#68787 DOCX import: handle when w:separator is missing for footnotes, 2013-09-04), the problem was footnote settings were modified also in case there were no footnotes at all in the document. Make the bug scenario and the original one working at the same time by touching footnote settings only in case there is at least one footnote in the current section. (cherry picked from commit e79ef12b7a904f17d4147fa409d055c12b70f952) Change-Id: I163d11769cbd97957662607fbedfba404181e002 Reviewed-on: https://gerrit.libreoffice.org/37228 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Michael Stahl <mstahl@redhat.com> (cherry picked from commit cc6a55d687581db1a174b2a7d01f8a62887b5e24)
2017-05-04AutoText: read names of entriesSzymon Kłos
+ extended model to parse <docPartPr> and <name> marks
+ names are inserted to the document before content of each entry
+ SwDOCXReader interprets first paragraph of each section as a name
Change-Id: Ib7de152ba1c6bea4f4665f98d321019c3f68863e Reviewed-on: https://gerrit.libreoffice.org/37124 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
2017-05-04AutoText: Reading multiple entriesSzymon Kłos
+ each entry is placed in a separate section
+ extended model and dmapper to react on docPart mark
Change-Id: I7e5213a09ae7352d1d09369bd0a209b6d4e18e82 Reviewed-on: https://gerrit.libreoffice.org/37107 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Szymon Kłos <szymon.klos@collabora.com>
2017-05-04AutoText: importing docx contentSzymon Kłos
- passing "ReadGlossaries" flag to the WriterFilter
- if set - WriterFilter reads glossary document instead of the main content
- updated model.xml to read docParts and docPart nodes
- SwDOCXReader adds document content as an AutoText entry
Change-Id: I9a0cc91c793d6accc8461e1c3aca791c5997d497 Reviewed-on: https://gerrit.libreoffice.org/36753 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Szymon Kłos <szymon.klos@collabora.com> Tested-by: Szymon Kłos <szymon.klos@collabora.com>
2017-05-04tdf#107104 DOCX drawingML import: fix invisible arrow shapeMiklos Vajna
This is the drawingML equivalent of commit 3d9ebded1358395ed81db7a63629b046aec2aeac (Misc improvements for docx VML import, 2010-10-06), which made sure that shapes are never invisible just because they have zero height or width. For this particular bugdoc the Word-produced WW8 equivalent width is 20 twips, but let's be consistent with the VML import and just round up to 1 mm100. Also fix two existing tests that wanted to test something else, but implicitly asserted that some shapes indeed have zero width/height. (cherry picked from commit e6e5a68f52f4e06b73f0ece3a3886f3bfc30f56d) Change-Id: I9600424520d0a3deecc711b44622eccc041a59da Reviewed-on: https://gerrit.libreoffice.org/36953 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Michael Stahl <mstahl@redhat.com> (cherry picked from commit 7d3baea4a726d6c0cf6cb0d6a8b2c83cef4f580d)
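The "never invisible" rule from this commit message amounts to a lower bound on each extent. A rough sketch follows; the helper name `ensureVisibleExtent` is hypothetical, and the unit is assumed to be mm100 (1/100 mm) as mentioned in the message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: a shape whose width or height is zero would not
// be rendered at all, so raise the extent to at least 1 mm100.
std::int32_t ensureVisibleExtent(std::int32_t extentMm100)
{
    return std::max<std::int32_t>(extentMm100, 1);
}
```

Applied to both width and height, a horizontal arrow with zero height keeps its width and gets a height of 1 mm100 instead of 0.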
2017-04-24tdf#107116 RTF import: fix missing upper and lower borders around textMiklos Vajna
See commit 1be0a3fa9ebb22b607c54b47739d4467acfed259 (n#825305: writerfilter RTF import: override style properties like Word, 2014-06-17) for the context. Here the problem was that various details of the top border were removed during the style deduplication, but not the top border sprm itself. That was interpreted (correctly) by dmapper as "no border", rather than "inherit from style". (cherry picked from commit e9f0d8d02885eca619552b19eab30c1eade9e7ef) Change-Id: I3dec8df789fc7b75fccfff91ce66f457fecd2f6e Reviewed-on: https://gerrit.libreoffice.org/36692 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Michael Stahl <mstahl@redhat.com> (cherry picked from commit c8c90854506cc7f1c3d7084ab97c156aead003e2)
2017-04-18tdf#106970 DOCX import: don't collapse para auto space for different numsMiklos Vajna
Commit 1bf7f6a1a50ee9f24a3687240fe6ae390b905a6b (tdf#106690 DOCX import: fix automatic spacing before/after numbered para block, 2017-04-04) made sure that autospacing is only collapsed in case the adjacent text nodes both have a numbering rule. It turns out there is an additional condition: even if both text nodes have a numbering rule, do the collapsing only in case they have the same numbering rule. (cherry picked from commit e1c83d0514e6123faa50ad0a7aa6a9031b271c9a) Change-Id: Idb7a2b24d7eaa9094cc36f86b8a483045a33d028 Reviewed-on: https://gerrit.libreoffice.org/36510 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Michael Stahl <mstahl@redhat.com> (cherry picked from commit e57873156d3c04ecc34bb5f38b186ebe29567f0c)
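The combined condition from tdf#106690 and tdf#106970 can be sketched as below. This is a simplified model, not the dmapper code: the `Para` struct, the use of 0 to mean "no numbering rule", and the function name are all assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical paragraph summary.
struct Para
{
    int numRuleId;       // 0 means "no numbering rule" (assumption)
    bool autoSpaceBefore;
    bool autoSpaceAfter;
};

// Drop the automatic spacing between two adjacent paragraphs only when
// both are numbered, both use the SAME numbering rule, and the spacing
// on both sides of the boundary is automatic.
bool collapseAutoSpacing(const Para& prev, const Para& next)
{
    const bool bothNumbered = prev.numRuleId != 0 && next.numRuleId != 0;
    const bool sameRule = prev.numRuleId == next.numRuleId;
    return bothNumbered && sameRule
           && prev.autoSpaceAfter && next.autoSpaceBefore;
}
```

For a sequence of an unnumbered paragraph, two list items of one list, and an item of a different list, only the boundary between the two items of the same list is collapsed.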
2017-04-12tdf#106690 DOCX import: fix automatic spacing before/after numbered para blockMiklos Vajna
The context is text nodes with automatic before/after spacing and numbering rules set, like:
A
* B
* C
* D
E
The correct behavior seems to be (though I haven't found this explicitly written in the OOXML spec) to drop spacing between B and C and C and D, but not before B and not after D. Originally no spacing was dropped, then commit c486e875de7c8e845594f5043a37ee8800865782 (tdf#95031 DOCX import: auto spacing inside numbering means no spacing, 2016-10-18) removed spacing around all B/C/D. Fix the problem by checking the numbering rules and automatic after spacing of the previous paragraph, so spacing before B and after D is not removed. Change-Id: Icbdb36e31057ab0e8ac033888cf5cc7c52dad5d0 Reviewed-on: https://gerrit.libreoffice.org/36062 Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> Tested-by: Jenkins <ci@libreoffice.org> (cherry picked from commit 1bf7f6a1a50ee9f24a3687240fe6ae390b905a6b) Reviewed-on: https://gerrit.libreoffice.org/36142 Reviewed-by: Michael Stahl <mstahl@redhat.com> (cherry picked from commit 776839b8bfc6eed905ce97c6fe32af8deb8d1451)
2017-04-12tdf#106692 writerfilter: RTF import: fix \'0d in \leveltextMichael Stahl
It's not a newline but yet another one of those bizarre RTF-encodings. (regression from 10e733908038407791f9c14af2a86417cc4a653c) (cherry picked from commit 69b7204164945cfed385d58e64592ce1b17937d7) Reviewed-on: https://gerrit.libreoffice.org/36284 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk> (cherry picked from commit fd93d09a5b6226a8297b5dd995301d514ec7b8ca) Change-Id: I568050b031b95ac0b6ebfa1a0c39107e62f68bed