path: root/sax
Age | Commit message | Author
2021-03-03 | Remove workaround now that it's fixed in AFL++ and oss-fuzz is updated | Stephan Bergmann
remove workaround for problem fixed by: https://github.com/AFLplusplus/AFLplusplus/commit/333509bb0a56be9bd2e236f0e2f37d4af2dd7d59> +# "better unicode support" for now: oss-fuzz updated: https://github.com/google/oss-fuzz/pull/5273 Change-Id: Id3f1790ef452ed7732032801fc4ec028e57443eb Reviewed-on: https://gerrit.libreoffice.org/c/core/+/111806 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2021-02-24 | ofz#30767 Build-Failure | Caolán McNamara
afl++ build crashes for some obscure reason with attached bt. Tweaking the code like so gets it to squeak by and continue the build. clang-12: /usr/local/include/llvm/IR/Constants.h:661: llvm::StringRef llvm::ConstantDataSequential::getAsString() const: Assertion `isString() && "Not a string"' failed. Stack dump: 0. Program arguments: /usr/local/bin/clang-12 -cc1 -triple x86_64-unknown-linux-gnu -emit-obj --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -main-file-name converter.cxx -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu x86-64 -tune-cpu generic -fno-split-dwarf-inlining -debug-info-kind=limited -dwarf-version=4 -debugger-tuning=gdb -ffunction-sections -fdata-sections -D BOOST_ERROR_CODE_HEADER_ONLY -D BOOST_SYSTEM_NO_DEPRECATED -D CPPU_ENV=gcc3 -D DISABLE_DYNLOADING -D LINUX -D NDEBUG -D OSL_DEBUG_LEVEL=0 -D UNIX -D UNX -D X86_64 -D _PTHREADS -D _REENTRANT -D SAX_DLLIMPLEMENTATION -D FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -D EXCEPTIONS_ON -D LIBO_INTERNAL_ONLY -D __AFL_HAVE_MANUAL_CONTROL=1 -D __AFL_COMPILER=1 -D FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=1 -D "__AFL_FUZZ_INIT()=int __afl_sharedmem_fuzzing = 1;extern unsigned int *__afl_fuzz_len;extern unsigned char *__afl_fuzz_ptr;unsigned char __afl_fuzz_alt[1048576];unsigned char *__afl_fuzz_alt_ptr = __afl_fuzz_alt;" -D "__AFL_COVERAGE()=int __afl_selective_coverage = 1;extern \"C\" void __afl_coverage_discard();extern \"C\" void __afl_coverage_skip();extern \"C\" void __afl_coverage_on();extern \"C\" void __afl_coverage_off();" -D "__AFL_COVERAGE_START_OFF()=int __afl_selective_coverage_start_off = 1;" -D __AFL_COVERAGE_ON()=__afl_coverage_on() -D __AFL_COVERAGE_OFF()=__afl_coverage_off() -D __AFL_COVERAGE_DISCARD()=__afl_coverage_discard() -D __AFL_COVERAGE_SKIP()=__afl_coverage_skip() -D "__AFL_FUZZ_TESTCASE_BUF=(__afl_fuzz_ptr ? 
__afl_fuzz_ptr : __afl_fuzz_alt_ptr)" -D "__AFL_FUZZ_TESTCASE_LEN=(__afl_fuzz_ptr ? *__afl_fuzz_len : (*__afl_fuzz_len = read(0, __afl_fuzz_alt_ptr, 1048576)) == 0xffffffff ? 0 : *__afl_fuzz_len)" -D "__AFL_LOOP(_A)=({ static volatile char *_B __attribute__((used)); _B = (char*)\"##SIG_AFL_PERSISTENT##\"; __attribute__((visibility(\"default\"))) int _L(unsigned int) __asm__(\"__afl_persistent_loop\"); _L(_A); })" -D "__AFL_INIT()=do { static volatile char *_A __attribute__((used)); _A = (char*)\"##SIG_AFL_DEFER_FORKSRV##\"; __attribute__((visibility(\"default\"))) void _I(void) __asm__(\"__afl_manual_init\"); _I(); } while (0)" -O1 -Wno-unused-command-line-argument -Wall -Wno-missing-braces -Wnon-virtual-dtor -Wendif-labels -Wextra -Wundef -Wunreachable-code -Wunused-macros -Wembedded-directive -Wdeprecated-copy-dtor -Wimplicit-fallthrough -Wunused-exception-parameter -Wrange-loop-analysis -Wshadow -Woverloaded-virtual -Wno-unused-command-line-argument -std=c++17 -fdeprecated-macro -ferror-limit 19 -fvisibility hidden -fvisibility-inlines-hidden -fsanitize=address -fsanitize-blacklist=/src/libreoffice/bin/sanitize-excludelist.txt -fsanitize-system-blacklist=/usr/local/lib/clang/12.0.0/share/asan_blacklist.txt -fsanitize-address-use-after-scope -fno-assume-sane-operator-new -funroll-loops -pthread -stack-protector 2 -fgnuc-version=4.2.1 -fno-inline -fcxx-exceptions -fexceptions -fcolor-diagnostics -load /src/aflplusplus/afl-llvm-dict2file.so -load /src/aflplusplus/cmplog-routines-pass.so -load /src/aflplusplus/cmplog-instructions-pass.so -load /src/aflplusplus/split-switches-pass.so -load /src/aflplusplus/SanitizerCoveragePCGUARD.so -faddrsig -x c++ converter-773998.cpp 1. <eof> parser at end of file 2. Per-module optimization passes 3. Running pass 'afl++ dict2file instrumentation pass' on module 'converter-773998.cpp'. 
#0 0x0000000001719ae3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/local/bin/clang-12+0x1719ae3) #1 0x0000000001717a4e llvm::sys::RunSignalHandlers() (/usr/local/bin/clang-12+0x1717a4e) #2 0x0000000001719f8f SignalHandler(int) (/usr/local/bin/clang-12+0x1719f8f) #3 0x00007f3e317b2980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980) #4 0x00007f3e306abfb7 raise (/lib/x86_64-linux-gnu/libc.so.6+0x3efb7) #5 0x00007f3e306ad921 abort (/lib/x86_64-linux-gnu/libc.so.6+0x40921) #6 0x00007f3e3069d48a (/lib/x86_64-linux-gnu/libc.so.6+0x3048a) #7 0x00007f3e3069d502 (/lib/x86_64-linux-gnu/libc.so.6+0x30502) #8 0x00007f3e30464810 (anonymous namespace)::AFLdict2filePass::runOnModule(llvm::Module&) /src/aflplusplus/instrumentation/afl-llvm-dict2file.so.cc:150:5 #9 0x00000000011d139f llvm::legacy::PassManagerImpl::run(llvm::Module&) (/usr/local/bin/clang-12+0x11d139f) #10 0x00000000018ef775 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/usr/local/bin/clang-12+0x18ef775) #11 0x00000000023c074f clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/usr/local/bin/clang-12+0x23c074f) #12 0x0000000002cbe554 clang::ParseAST(clang::Sema&, bool, bool) (/usr/local/bin/clang-12+0x2cbe554) #13 0x0000000001e3ccc7 clang::FrontendAction::Execute() (/usr/local/bin/clang-12+0x1e3ccc7) #14 0x0000000001dc7311 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/local/bin/clang-12+0x1dc7311) #15 0x0000000001ed2dfc clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/local/bin/clang-12+0x1ed2dfc) #16 0x000000000092166e cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/local/bin/clang-12+0x92166e) #17 0x000000000091ff77 
ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) (/usr/local/bin/clang-12+0x91ff77) #18 0x000000000091fdbb main (/usr/local/bin/clang-12+0x91fdbb) #19 0x00007f3e3068ebf7 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf7) #20 0x000000000091cd49 _start (/usr/local/bin/clang-12+0x91cd49) Change-Id: I4eab488ff09f9213489212e56ed636596be6ae89 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/111477 Tested-by: Caolán McNamara <caolanm@redhat.com> Reviewed-by: Caolán McNamara <caolanm@redhat.com>
2021-02-22 | sax: document SAX_DISABLE_THREADS | Miklos Vajna
Change-Id: I39c05bb3dac09b67b93693dd8f2a297f6eb28f52 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/111344 Reviewed-by: Miklos Vajna <vmiklos@collabora.com> Tested-by: Jenkins
2021-02-21 | loplugin:refcounting in package..sax | Noel
Change-Id: I83618f54a4117cd81d8626307716129a761e14c5 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/111274 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2021-02-18 | loplugin:referencecasting in sax | Noel
Change-Id: Ie7371b2c6ed340ce8417af03aa4f7b60890392ec Reviewed-on: https://gerrit.libreoffice.org/c/core/+/111081 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2021-02-17 | tdf#39593: reduce copy/paste in Converter::convertDuration | Bayram Çiçek
Change-Id: I520e10ef96c677be9f80bba510fe9c89295d416c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/111008 Reviewed-by: Michael Stahl <michael.stahl@allotropia.de> Tested-by: Jenkins
2021-02-14 | Move unit conversion code to o3tl, and unify on that in more places | Mike Kaganski
This also makes it easy to add more units, both of length and for other unit categories. The conversion for the "Line" unit (312 twip) is questionable. Corresponding entries in aImplFactor in vcl/source/control/field.cxx were inconsistent (45/11 in; 10/13 pc; 156/10 pt). They were added without explanation in commit c85db626029fd8a5e0dfcb312937279df32339a0. I haven't found a spec of the unit (https://en.wikipedia.org/wiki/Line_(unit) is not specific). I used the definition based on "by pt", "by mm/100", "by char" (they all were consistent); "by pc" seems inverted; "by twip" was half as much. This accepted conversion makes the unit test for tdf#79236 pass. Change-Id: Iae5a21d915fa8e934a1f47f8ba9f6df03b79a9fd Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110839 Tested-by: Mike Kaganski <mike.kaganski@collabora.com> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
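The idea of a single conversion table, including the 312 twip definition of "Line" chosen above, can be sketched as follows. This is only an illustration: `Length`, `emuPer` and `convert` are hypothetical names and not the o3tl API, and EMU (914400 per inch) is picked here as a common base only because it keeps every factor an integer.

```cpp
#include <cstdint>

// Hypothetical unit list for illustration; not the o3tl API.
enum class Length { emu, mm100, twip, pt, in, line };

// Size of each unit in EMU (914400 per inch), so every factor is an integer.
constexpr std::int64_t emuPer(Length u)
{
    switch (u)
    {
        case Length::emu:   return 1;
        case Length::mm100: return 360;       // 914400 / 2540
        case Length::twip:  return 635;       // 914400 / 1440
        case Length::pt:    return 12700;     // 914400 / 72
        case Length::in:    return 914400;
        case Length::line:  return 312 * 635; // 312 twip, as in the commit message
    }
    return 1; // not reached for valid enumerators
}

// Convert n from one unit to another, rounding half away from zero.
constexpr std::int64_t convert(std::int64_t n, Length from, Length to)
{
    const std::int64_t num = n * emuPer(from);
    const std::int64_t den = emuPer(to);
    return (num >= 0 ? num + den / 2 : num - den / 2) / den;
}

// 10 lines are 156 pt, matching the "156/10 pt" entry quoted above.
static_assert(convert(1, Length::line, Length::twip) == 312);
static_assert(convert(10, Length::line, Length::pt) == 156);
```

Adding a unit in such a scheme means adding one enumerator and one table entry; no per-pair factors are needed.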
2021-02-10 | pass FastAttributeList around by rtl::Reference | Noel
Change-Id: I958a22f60975c74dfaeb8469b4c0cd3759d40130 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110653 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2021-02-08 | oox: prefix VML shapetype ids with _x0000_t | Michael Stahl
Word 2013 refuses to even load a file that has a <v:shapetype id="shapetype_75"> on some form control shape, reporting a misleading error in a location far later when the top-level w:tbl that contains the shape ends. Using id="_x0000_t75" appears to work, so let's do that then. Couldn't find any documentation on why this is so. Change-Id: Ie22bb04244e24b00a1880544872ae8e281422405 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110493 Tested-by: Jenkins Reviewed-by: Michael Stahl <michael.stahl@allotropia.de>
2021-01-29 | loplugin:stringviewparam extend to new.. | Noel
O[U]StringBuffer methods Change-Id: I0ffbc33d54ae7c98b5652434f3370ee4f819f6f4 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110090 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2021-01-28 | simplify code, use more subView() | Noel
Change-Id: I569c7f34acbdf8451cd5c9acf1abd334637072d1 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110051 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2021-01-27 | Improve loplugin:stringliteralvar | Stephan Bergmann
...to also consider O[U]String ctors taking pointer and length Change-Id: Iea5041634bfbf5054a1317701e30b56f72e940fb Reviewed-on: https://gerrit.libreoffice.org/c/core/+/110025 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2021-01-20 | Fix typo | Andrea Gelmini
Change-Id: I9edd52387417f8bb40646800beda7a3dca0b9abf Reviewed-on: https://gerrit.libreoffice.org/c/core/+/109657 Tested-by: Jenkins Reviewed-by: Julien Nabet <serval2412@yahoo.fr>
2021-01-19 | Simplify getFirstLineBreak | dante
Change-Id: I0fcacd3f3deb5867ed91a7037b74fa364ebc4c80 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/109302 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2021-01-19 | Use customized xml entities on xmlexport. | dante
This will mainly be used for Unicode characters on MathML export. Change-Id: I59b96d44facbd01fa517317a0ae54d64d29b0a19 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108562 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2020-12-30 | Clang-format saxwriter | dante
Change-Id: I4793d81e2ba3405b9ed07a2c5547572ed7e0bee6 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108425 Tested-by: Noel Grandin <noel.grandin@collabora.co.uk> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-29 | loplugin:stringviewparam: operator + | Stephan Bergmann
Change-Id: I044dd21b63d7eb03224675584fa143009c6b6008 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108418 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2020-12-28 | ofz#28733 Direct-leak | Caolán McNamara
free xmlEntityPtr the way desret_xmlEntityPtr does in libxml's testapi.c Change-Id: Ia809413c3d4e7b13e799e6c1a57e8abe61bf218d Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108415 Tested-by: Jenkins Reviewed-by: Caolán McNamara <caolanm@redhat.com>
2020-12-27 | Preparations for customized xml entities on export | dante
Change-Id: I8ad4af7e27ae5f8908f4c932242cb96abbf3de90 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108354 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-26 | Improve handling of custom XML entities | dante
Since 7.1 hasn't been released yet, there is still time to change it before having to worry about backwards compatibility. This way:
- It is more efficient than passing two arguments.
- The definition is simpler, since both are declared at the same point, so it is easier not to lose sync between the lists.
- The code is shorter.
- Thanks to an idea proposed by Stephan Bergmann on another commit.
Change-Id: I16305a304c98eb8d4e11507c7938002da546778b Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108028 Tested-by: Noel Grandin <noel.grandin@collabora.co.uk> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-22 | use string_view in ProcessAttribute | Noel
Change-Id: I81feb01bf6823d1d8fb5a7da08490959484ef533 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108095 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-22 | reduce duplication in sax converter | Noel
Change-Id: I05bfb50e81a84b5f3bb7749e85058f967cb4b4ea Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108094 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-20 | No longer need to worry about ambiguous operator== in loplugin:stringviewparam | Stephan Bergmann
...after 46c5de832868d2812448b2caace3eeaa9237b9f6 "make *String(string_view) constructors explicit" Change-Id: I6e884c762a2fc91f5dd6fbb197a596fd60f17cae Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108043 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2020-12-19 | make *String(string_view) constructors explicit | Noel Grandin
to make it more obvious when we are constructing heap OUStrings code and potentially inadvertently throwing away performance. And fix a handful of places so revealed. Change-Id: I0cf390f78026f8a670aaab53424cd31510633051 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107923 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
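The effect of an explicit constructor can be sketched with a stand-in type. `HeapString` and `store` are hypothetical names used only for illustration, not the OUString API; the sketch assumes nothing beyond what the message says, namely that constructing the string from a view must be spelled out at the call site.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical heap-allocating string type, standing in for OUString.
class HeapString
{
public:
    // explicit: a view never silently turns into a heap allocation
    explicit HeapString(std::u16string_view v) : m_data(v) {}
    std::size_t length() const { return m_data.size(); }

private:
    std::u16string m_data;
};

// A function that wants an owning string.
inline std::size_t store(const HeapString& s) { return s.length(); }

// Direct construction is allowed, implicit conversion is not.
static_assert(std::is_constructible_v<HeapString, std::u16string_view>);
static_assert(!std::is_convertible_v<std::u16string_view, HeapString>);
```

With this, `store(view)` no longer compiles and the caller has to write `store(HeapString(view))`, which makes the allocation visible in the source.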
2020-12-19 | Proposed solution for memory error in 106804 | dante
https://gerrit.libreoffice.org/c/core/+/106804 This needs to be merged into 7.1; it corrects a memory leak introduced in that same version. Change-Id: Id3c3f86f88c32e631f0c414fbd7942aba2a91239 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107930 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-17 | use more string_view in sax::Converter | Noel
Change-Id: If8a9bba41e6b08583f64388d7b5581e616ec9066 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107873 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-17 | Sort custom entity names on fast parser | dante
When there are many entities and many references to replace, searching the whole array becomes a burden. To avoid that, the lookup now uses an ordered search and stops as soon as the OUString ordering implies that the name cannot appear further on. The entity list is sorted by quicksort before the parse. Change-Id: I9c91338ad67ddea1c273e329542549a904a0e563 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107774 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
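A minimal sketch of the sort-then-stop-early lookup described above, under stated assumptions: `Entity`, `EntityList`, `sortEntities` and `findEntity` are hypothetical names, `std::u16string` stands in for OUString, and `std::sort` is used in place of the quick sort mentioned in the message.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical entity table: (name, replacement) pairs.
using Entity = std::pair<std::u16string, std::u16string>;
using EntityList = std::vector<Entity>;

inline bool byName(const Entity& a, const Entity& b) { return a.first < b.first; }

// Sort once before parsing so lookups can rely on the order.
inline void sortEntities(EntityList& entities)
{
    std::sort(entities.begin(), entities.end(), byName);
}

// Linear scan over the sorted list that stops as soon as the current
// name is greater than the one searched for. Returns an empty view
// when the name is not in the list.
inline std::u16string_view findEntity(const EntityList& sorted, std::u16string_view name)
{
    assert(std::is_sorted(sorted.begin(), sorted.end(), byName));
    for (const auto& [key, value] : sorted)
    {
        const int cmp = std::u16string_view(key).compare(name);
        if (cmp == 0)
            return value;
        if (cmp > 0)
            break; // every later key is greater still
    }
    return {};
}
```

On a sorted list a binary search such as `std::lower_bound` would also work; the linear scan with an early exit is shown because that is what the message describes.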
2020-12-15 | use views to parse rather than allocating OUString | Noel
Change-Id: If0a848c64ce8077d1681661873629c83307cf8b2 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107736 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-11 | Adapt the remaining OUString functions to std string_view | Stephan Bergmann
...for LIBO_INTERNAL_ONLY. These had been missed by 1b43cceaea2084a0489db68cd0113508f34b6643 "Make many OUString functions take std::u16string_view parameters" because they did not match the multi-overload pattern that was addressed there, but they nevertheless benefit from being changed just as well (witness e.g. the various resulting changes from copy() to subView()). This showed a conversion from OStringChar to std::string_view to be missing (while the corresponding conversion from OUStringChar to std::u16string_view was already present). The improvement to loplugin:stringadd became necessary to fix > [CPT] compilerplugins/clang/test/stringadd.cxx > error: 'error' diagnostics expected but not seen: > File ~/lo/core/compilerplugins/clang/test/stringadd.cxx Line 43 (directive at ~/lo/core/compilerplugins/clang/test/stringadd.cxx:42): simplify by merging with the preceding assignment [loplugin:stringadd] > File ~/lo/core/compilerplugins/clang/test/stringadd.cxx Line 61 (directive at ~/lo/core/compilerplugins/clang/test/stringadd.cxx:60): simplify by merging with the preceding assignment [loplugin:stringadd] > 2 errors generated. Change-Id: Ie40de0616a66e60e289c1af0ca60aed6f9ecc279 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107602 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2020-12-11 | FastParser.cxx changes | dante
If the custom entity list is empty, entities given by Unicode value have to keep working. Successfully loaded:
<?xml version="1.0" encoding="UTF-8"?>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <semantics>
    <mi>&#x3C3;</mi>
    <mi>&#x221E;</mi>
    <mi>&sigma;</mi>
    <mi>&infin;</mi>
  </semantics>
</math>
Change-Id: I46cc5b04bd91d1aaadf3f99cb2079325bb0d08cf Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107498 Tested-by: Jenkins Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2020-12-09 | Beginning of support for &entityname for mathml. | dante
Change-Id: I03ce79ed74088db3c1f6c1f87d7a75160ff19a30 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107038 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-12-07 | Adding support for &#dddd; and &#xhhhh; on fastparser. | dante
Change-Id: Iacbbe8a77532fe5034ceae286f50a74310f7d2ed Reviewed-on: https://gerrit.libreoffice.org/c/core/+/107036 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
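The two reference forms named in the title, decimal `&#dddd;` and hexadecimal `&#xhhhh;`, can be decoded along these lines. This is a sketch only: `parseCharRef` is a hypothetical helper and does not reproduce the FastParser code.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Parse "&#dddd;" or "&#xhhhh;" and return the code point, or nullopt
// when the text is not a well-formed numeric character reference.
// Hypothetical helper for illustration; not the FastParser code.
constexpr std::optional<std::uint32_t> parseCharRef(std::string_view ref)
{
    if (ref.size() < 4 || ref.substr(0, 2) != "&#" || ref.back() != ';')
        return std::nullopt;
    ref = ref.substr(2, ref.size() - 3); // strip "&#" and ";"
    std::uint32_t base = 10;
    if (ref.front() == 'x')
    {
        base = 16;
        ref.remove_prefix(1);
    }
    if (ref.empty())
        return std::nullopt;
    std::uint32_t value = 0;
    for (char c : ref)
    {
        std::uint32_t digit = 0;
        if (c >= '0' && c <= '9')
            digit = static_cast<std::uint32_t>(c - '0');
        else if (base == 16 && c >= 'a' && c <= 'f')
            digit = static_cast<std::uint32_t>(c - 'a') + 10;
        else if (base == 16 && c >= 'A' && c <= 'F')
            digit = static_cast<std::uint32_t>(c - 'A') + 10;
        else
            return std::nullopt;
        value = value * base + digit;
        if (value > 0x10FFFF) // beyond the Unicode range
            return std::nullopt;
    }
    return value;
}
```

Named references such as `&sigma;` are rejected by this helper; they need a lookup table instead.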
2020-12-01 | cid#1470375 Unrecoverable parse warning | Caolán McNamara
and cid#1470366 Unrecoverable parse warning cid#1470365 Unrecoverable parse warning cid#1470361 Unrecoverable parse warning cid#1470360 Unrecoverable parse warning cid#1470367 Unrecoverable parse warning Change-Id: Ib0b5167de65d1a16438ba8f8c564b0b89d52e6d5 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106982 Tested-by: Jenkins Reviewed-by: Caolán McNamara <caolanm@redhat.com>
2020-12-01 | tdf#42949 Fix new IWYU warnings in directories s* | Gabor Kelemen
Except recently checked sc, sd, svx, sw Found with bin/find-unneeded-includes Only removal proposals are dealt with here. Change-Id: Ice1b86628e4f22a39f307b9c5fa567b6ab9d5acb Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106917 Tested-by: Jenkins Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-11-30 | loplugin:stringviewparam include comparisons with string literals | Noel
Change-Id: I8ba1214500dddaf413c506a4b82f43d63cda804b Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106559 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-11-29 | Preparing for mathml support of custom entity references. | dante
This should be enough for the starmath mathml project. It can be reused by other modules for their own custom needs. It keeps changes to generic modules to a minimum. My current abilities don't allow me to go much beyond this approach. Change-Id: If7f157f8a71d6c3bff50fdbcd80bed23c92f40bb Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106804 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-11-25 | use string_view for the parsing in sax utils | Noel
Change-Id: Ifd7430501318684f9999c90dd36c1ca965373947 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106499 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-11-24 | loplugin:stringviewparam extend to comparison operators | Noel
which means that some call sites have to change to use unicode string literals i.e. u"foo" instead of "foo" Change-Id: Ie51c3adf56d343dd1d1710777f9d2a43ee66221c Reviewed-on: https://gerrit.libreoffice.org/c/core/+/106125 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
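Why the call sites need `u"foo"` can be seen in a small sketch. `isXmlnsPrefix` is a hypothetical function written in the style the plugin asks for; it only assumes that the parameter has become a `std::u16string_view`.

```cpp
#include <string_view>

// Hypothetical function with a view parameter: callers need not
// construct a heap-allocated string first.
constexpr bool isXmlnsPrefix(std::u16string_view name)
{
    // A UTF-16 view compares against a u"..." literal; a plain "xmlns"
    // literal has a different character type and would not compile here.
    return name == u"xmlns";
}
```

The same applies to the arguments: `isXmlnsPrefix(u"xmlns")` works, `isXmlnsPrefix("xmlns")` does not.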
2020-11-16 | tdf#138144 Form wizard fails to save | Noel
fallout from commit 3de38e95561ab7ca114d9f3307702ba89c4e3e9a Date: Tue Nov 10 19:20:06 2020 +0200 use fastparser in forms Change-Id: I4691786525132ef0cf98b6b177a2c022c4d7d032 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105932 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-11-11 | loplugin:stringview | Noel
Add new methods "subView" to O(U)String to return substring views of the underlying data. Add a clang plugin to warn when replacing existing calls to copy() would be better to use subView(). Change-Id: I03a5732431ce60808946f2ce2c923b22845689ca Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105420 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
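The difference between copying a substring and viewing it can be sketched with standard types. `copyPart`, `viewPart` and `startsWithHash` are hypothetical names; `std::u16string::substr` stands in for `copy()` and `std::u16string_view::substr` for `subView()`, so this mirrors the idea rather than the O(U)String implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Stand-in for copy(): returns a new string object with its own storage.
inline std::u16string copyPart(const std::u16string& s, std::size_t pos, std::size_t len)
{
    return s.substr(pos, len);
}

// Stand-in for subView(): returns a view into the caller's buffer, no copy.
// The view is only valid while s is alive and unmodified.
inline std::u16string_view viewPart(const std::u16string& s, std::size_t pos, std::size_t len)
{
    assert(pos <= s.size());
    return std::u16string_view(s).substr(pos, len);
}

// Comparing a prefix does not need a temporary string.
inline bool startsWithHash(const std::u16string& s)
{
    return viewPart(s, 0, 1) == u"#";
}
```

A plugin like the one described would flag `copyPart(s, 0, 1) == u"#"` style code, since the temporary string is created only to be compared and thrown away.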
2020-11-11 | use fastparser in forms | Noel Grandin
Change-Id: I7d09d64857e24267b4b4baddb563e28ceea92f2e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105560 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-11-08 | ofz#26944 | Noel Grandin
This is actually a regression from commit 8c5ffecf1dbd3f93128910433da11d5315661680 Author: Noel <noelgrandin@gmail.com> Date: Fri Oct 23 15:12:22 2020 +0200 make SvXMLImport capable of mixing fast- and slow- contexts adhoc where I did not get the namespace handling right, but it only became obvious with commit 3940cf7d716f3e469f47d3c831a799e58edf2eb8 Date: Mon Nov 2 12:26:26 2020 +0200 drop the SvXMLExport::EndElement method.. Specifically, we have weird logic to treat some bad namespaces as good (who knows why), but that logic only exists for the slowparser path. With the dropping of EndElement(), we ended up with calls to SvXMLImport::startUnknownElement() calling SvXMLImportContext::StartElement(), but without a corresponding call to SvXMLImportContext::endFastElement. To make this work right, I copied the namespace aliasing code to FastParser. Change-Id: I00ecbf046feeaac6f2a789f801175dba40836f84 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/105441 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-10-23 | long->tools::Long in pyuno..sd | Noel
Change-Id: I67c1218d225f49ea9ce789433283ab85275e39a5 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/104627 Tested-by: Noel Grandin <noel.grandin@collabora.co.uk> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-09-29 | tdf#136551 DOTX import: restore support for large XML attribute values | Miklos Vajna
Regression from commit 82d08580e368afbc9d73da3613845a36a89b0a8c (switch saxparser from expat to libxml2, 2014-11-14), expat used to allow huge XML attribute values, while libxml2 defaults to rejecting values larger than 10MB. This looks like a sane limit, but the bugdoc has some fallback VML markup where the actual graphic content of the shape is base64-encoded in an XML attribute value. libxml2 has an XML_PARSE_HUGE flag to lift this limit, so use that. If this was not a problem with expat, then it should be no problem with libxml2, either. [ No testcase, adding a 10MB test document to the repo is not preferred. ] Change-Id: Ifcd0ce52d3cb95bef36c58aa073bb59bc07490d6 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/103567 Tested-by: Jenkins Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
2020-09-16 | Turn OUStringLiteral into a consteval'ed, static-refcound rtl_uString | Stephan Bergmann
...from which an OUString can cheaply be instantiated. This is the OUString equivalent of 4b9e440c51be3e40326bc90c33ae69885bfb51e4 "Turn OStringLiteral into a consteval'ed, static-refcound rtl_String". Most remarks about that commit apply here too (this commit is just substantially bigger and a bit more complicated because there were so much more uses of OUStringLiteral than of OStringLiteral): The one downside is that OUStringLiteral now needs to be a template abstracting over the string length. But any uses for which that is a problem (e.g., as the element type of a container that would no longer be homogeneous, or in the signature of a function that shall not be turned into a template for one reason or another) can be replaced with std::u16string_view, without loss of efficiency compared to the original OUStringLiteral, and without loss of expressivity. The new OUStringLiteral ctor code would probably not be very efficient if it were ever executed at runtime, but it is intended to be only executed at compile time. Where available, C++20 "consteval" is used to statically ensure that. The intended use of the new OUStringLiteral is in all cases where an object that shall itself not be an OUString (e.g., because it shall be a global static variable for which the OUString ctor/dtor would be detrimental at library load/unload) must be converted to an OUString instance in at least one place. Other string literal abstractions could use std::u16string_view (or just plain char16_t const[N]), but interestingly OUStringLiteral might be more efficient than constexpr std::u16string_view even for such cases, as it should not need any relocations at library load time. For now, no existing uses of OUStringLiteral have been changed to some other abstraction (unless technically necessary as discussed above), and no additional places that would benefit from OUStringLiteral have been changed to use it. 
Global constexpr OUStringLiteral variables defined in an included file would be somewhat suboptimal, as each translation unit that uses them would create its own, unshared instance. The envisioned solution is to turn them into static data members of some class (and there may be a loplugin coming to find and fix affected places). Another approach that has been taken here in a few cases where such variables were only used in one .cxx anyway is to move their definitions from the .hxx into that one .cxx (in turn causing some files to become empty and get removed completely)---which also silenced some GCC -Werror=unused-variable if a variable from a .hxx was not used in some .cxx including it. To keep individual commits reasonably manageable, some consumers of OUStringLiteral in rtl/ustrbuf.hxx and rtl/ustring.hxx are left in a somewhat odd state for now, where they don't take advantage of OUStringLiteral's equivalence to rtl_uString, but just keep extracting its contents and copy it elsewhere. In follow-up commits, those consumers should be changed appropriately, making them treat OUStringLiteral like an rtl_uString or dropping the OUStringLiteral overload in favor of an existing (and cheap to use now) OUString overload, etc. In a similar vein, comparison operators between OUString and std::u16string_view have been added to the existing plethora of comparison operator overloads. It would be nice to eventually consolidate them, esp. with the overloads taking OUStringLiteral and/or char16_t const[N] string literals, but that appears tricky to get right without introducing new ambiguities. Also, a handful of places across the code base use comparisons between OUString and OUStringNumber, which are now ambiguous (converting the OUStringNumber to either OUString or std::u16string_view). For simplicity, those few places have manually been fixed for now by adding explicit conversion to std::u16string_view. 
Also some compilerplugins code needed to be adapted, and some of the compilerplugins/test cases have become irrelevant (and have been removed), as the tested code would no longer compile in the first place. sal/qa/rtl/strings/test_oustring_concat.cxx documents a workaround for GCC bug <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96878> "Failed class template argument deduction in unevaluated, parenthesized context". That place, as well as uses of OUStringLiteral in extensions/source/abpilot/fieldmappingimpl.cxx and i18npool/source/localedata/localedata.cxx, which have been replaced with OUString::Concat (and which is arguably a better choice, anyway), also caused failures with at least Clang 5.0.2 (but would not have caused failures with at least recent Clang 12 trunk, so appear to be bugs in Clang that have meanwhile been fixed). Change-Id: I34174462a28f2000cfeb2d219ffd533a767920b8 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/102222 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
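The "template abstracting over the string length" with a compile-time constructor can be sketched roughly as below. Everything here is an assumption made for illustration: `Literal`, its members and the `staticRefCount` value are hypothetical and do not reproduce the actual OUStringLiteral or rtl_uString definitions. Plain `constexpr` is used so the sketch also builds as C++17; the message notes that C++20 `consteval` is used where available.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Sketch of a literal type whose layout is a refcount and a length
// followed by the characters. All names are hypothetical.
template <std::size_t N> class Literal
{
public:
    // Copies the characters at compile time; N is deduced from the literal.
    constexpr Literal(char16_t const (&s)[N])
    {
        for (std::size_t i = 0; i != N; ++i)
            buffer[i] = s[i];
    }

    constexpr std::u16string_view view() const { return { buffer, N - 1 }; }

    // Assumed marker value meaning "static data, never freed".
    static constexpr std::int32_t staticRefCount = 0x40000000;

    std::int32_t refCount = staticRefCount;
    std::int32_t length = static_cast<std::int32_t>(N - 1); // excludes the NUL
    char16_t buffer[N] = {};
};

// Each literal length yields a distinct type, here Literal<6>.
constexpr Literal greeting(u"hello");
static_assert(greeting.length == 5);
```

Because `Literal<6>` and `Literal<3>` are different types, a container of such literals is no longer homogeneous, which is the downside the message mentions and the reason it suggests `std::u16string_view` for those uses.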
2020-09-03 | Fix crashtest fdo77855.odt | Noel Grandin
regression from commit 9814c1f2edf56ecc0f31001db9234ef335488879 use fastparser in SvXMLPropertySetContext subclasses and add some asserts to find the problems earlier. Change-Id: Ief64f813f2ef7ec005f682713dadb1be47cbcd15 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/101998 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-08-31 | Simplify code with std::string_view | Stephan Bergmann
Change-Id: Ic296bf6abe621e702fa47378ac4c3cdf9f73ba27 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/101732 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2020-08-31 | Fix typo in code | Andrea Gelmini
It passed "make check" on Linux Change-Id: I86123dbc2052653aaf1d5c3a6fafb554c0b9a7fb Reviewed-on: https://gerrit.libreoffice.org/c/core/+/101600 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2020-08-28 | Change OUStringLiteral from char[] to char16_t[] | Stephan Bergmann
This is a prerequisite for making conversion from OUStringLiteral to OUString more efficient at least for C++20 (by replacing its internals with a constexpr- generated sal_uString-compatible layout with a SAL_STRING_STATIC_FLAG refCount, conditionally for C++20 for now). For a configure-wise bare-bones build on Linux, size reported by `du -bs instdir` grew by 118792 bytes from 1155636636 to 1155755428. In most places just a u"..." string literal prefix had to be added. In some places char const a[] = "..."; variables have been changed to char16_t, and a few places required even further changes to code (which prompted the addition of include/o3tl/string_view.hxx helper function o3tl::equalsIgnoreAsciiCase and the additional OUString::createFromAscii overload). For all uses of macros expanding to string literals, the relevant uses have been rewritten as u"" MACRO instead of changing the macro definitions. It should be possible to change at least some of those macro definitions (and drop the u"" from their call sites) in follow-up commits. Change-Id: Iec4ef1a057d412d22443312d40c6a8a290dc6144 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/101483 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
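The `u"" MACRO` rewrite relies on adjacent string literal concatenation: when one literal has an encoding prefix and the other has none, the result takes the prefix. A minimal sketch, with `SERVICE_NAME` as a hypothetical macro:

```cpp
#include <string_view>

// Hypothetical macro expanding to a narrow string literal.
#define SERVICE_NAME "com.example.Service"

// Adjacent literals are concatenated, and the u prefix of the empty literal
// applies to the result, so the macro definition itself stays unchanged.
constexpr std::u16string_view serviceName = u"" SERVICE_NAME;
static_assert(serviceName.size() == 19);
```

Changing the macro definition to `u"com.example.Service"` would make the `u""` at the use site unnecessary, which is the follow-up the message suggests.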
2020-08-13 | loplugin:stringstatic also look for local statics | Noel Grandin
Add some API to O*StringLiteral, to make it easier to use in some places that were using O*String Change-Id: I1fb93bd47ac2065c9220d509aad3f4320326d99e Reviewed-on: https://gerrit.libreoffice.org/c/core/+/100270 Tested-by: Jenkins Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>