path: root/external/clucene/patches
Age | Commit message | Author
2023-12-13 | reprobuild: don't write timestamps to clucene index files | Thorsten Behrens
Our embedded clucene by default writes a random current-time millisecond value into version fields, in an attempt to randomise. Clearly this is not needed for our static help, and it also prevents builds from being reproducible. Change-Id: I011388b5bc72b5d86bc1900f5439036ede60c020 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/158845 Tested-by: Jenkins Reviewed-by: Thorsten Behrens <thorsten.behrens@allotropia.de>
2023-12-11 | external/clucene: operation between different enumeration types | Stephan Bergmann
> workdir/UnpackedTarball/clucene/src/core/CLucene/index/FieldsReader.cpp:233:58: error: invalid bitwise operation between different enumeration types ('lucene::document::Field::Store' and 'lucene::document::Field::Index')
> 233 | f = _CLNEW LazyField(this, fi->name, Field::STORE_YES | getIndexType(fi, tokenize) | getTermVectorType(fi), length, pointer);
> | ~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
as reported now with --with-latest-c++ (i.e., in C++26 mode) by Clang 18 trunk since <https://github.com/llvm/llvm-project/commit/1cbd52f791d3f088246526c0801634edb65cee31> "[Clang] Implement P2864R2 Remove Deprecated Arithmetic Conversion on Enumerations (#73105)"
Change-Id: I2d48298bc64e05271ee5c33255d7d57fed6221cf Reviewed-on: https://gerrit.libreoffice.org/c/core/+/160549 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <stephan.bergmann@allotropia.de>
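The diagnostic above concerns combining enumerators of two unrelated enumeration types with `|`. A minimal sketch of one way to make such an expression well-formed, assuming hypothetical `Store` and `Index` enums with illustrative values (stand-ins for the CLucene types; the actual patch may take a different route):

```cpp
// Hypothetical stand-ins for lucene::document::Field::Store and Field::Index;
// the enumerator values are illustrative only.
enum Store { STORE_YES = 1, STORE_NO = 2 };
enum Index { INDEX_NO = 16, INDEX_TOKENIZED = 32 };

// "s | i" mixes two different enumeration types, which P2864R2 makes
// ill-formed in C++26. Converting each operand to int first avoids that.
inline int combine(Store s, Index i) {
    return static_cast<int>(s) | static_cast<int>(i);
}
```

With these illustrative values, `combine(STORE_YES, INDEX_TOKENIZED)` yields the same bit pattern the implicit conversion used to produce.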
2023-09-15 | tdf#157254: CLucene: fix pure virtual call in destructor | Mike Kaganski
When HelpIndexer::indexDocuments creates lucene::index::IndexWriter with a long path like C:\lo\src\build\workdir\longPathTest_123456789012345678901234567890123456789012345678901234567890\instdir\program\..\program\..\user\extensions\bundled\registry\com.sun.star.comp.deployment.help.PackageRegistryBackend\lu149121qyy8a.tmp\da\help.idxl then CLucene's FSDirectory::FSIndexOutput::FSIndexOutput may fail and throw. That would unwind and call the FSIndexOutput destructor, then proceed to the inherited BufferedIndexOutput destructor, which calls close(), which calls flush(), which finally calls flushBuffer. That function was pure virtual in BufferedIndexOutput, so the call made from the BufferedIndexOutput destructor was a pure virtual function call, crashing the process. Patch CLucene to have a default implementation of the function, usable in its destructor. Change-Id: I6f84c8cf2bd24b9bb92a71da485089ebf832530a Reviewed-on: https://gerrit.libreoffice.org/c/core/+/156944 Tested-by: Jenkins Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
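The mechanism can be sketched in isolation. The class names below are hypothetical simplifications, not the CLucene classes, and the sketch only covers the case where the derived constructor throws: the base destructor still runs, and a virtual call made from it dispatches to the base class's own version, which therefore must have a body.

```cpp
#include <stdexcept>

int baseFlushes = 0;  // counts calls that reach the base-class default

// Hypothetical simplification of a buffered output base class.
struct BufferedOutput {
    virtual ~BufferedOutput() { close(); }
    void close() { flushBuffer(); }
    // A default body instead of "= 0": during ~BufferedOutput the dynamic
    // type is BufferedOutput, so this is the version that gets called.
    virtual void flushBuffer() { ++baseFlushes; }
};

// Hypothetical derived class whose constructor may fail.
struct FileOutput : BufferedOutput {
    explicit FileOutput(bool fail) {
        if (fail) throw std::runtime_error("open failed");
    }
    void flushBuffer() override {}
};

// Returns false if opening failed; the base destructor has run by then.
bool tryOpen(bool fail) {
    try {
        FileOutput out(fail);
    } catch (const std::runtime_error&) {
        return false;
    }
    return true;
}
```

Had `flushBuffer` been declared `= 0`, the same call path would compile but abort at run time with a pure virtual call.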
2023-07-17 | external/clucene: More uses of obsolete std::binary_function | Stephan Bergmann
...as seen at least when building against VS 2022 Preview 17.7.0 Preview 3.0 and --with-latest-c++,
> workdir\UnpackedTarball\clucene\src\core\CLucene/util/_Arrays.h(128): error C2039: 'binary_function': is not a member of 'std'
> C:\PROGRA~1\MICROS~3\2022\Preview\VC\Tools\MSVC\1437~1.328\Include\vector(26): note: see declaration of 'std'
> workdir\UnpackedTarball\clucene\src\core\CLucene/util/_Arrays.h(153): note: see reference to class template instantiation 'lucene::util::CLListEquals<_kt,_comparator,class1,class2>' being compiled
etc.
Change-Id: Icea14fe0c0ad85501367ac6c81a3b8aada595383 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/154551 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2023-01-27 | Remove support for AIX | Stephan Bergmann
As discussed in the mailing list thread starting at <https://lists.freedesktop.org/archives/libreoffice/2023-January/089808.html> "Plan to remove dead C++ UNO bridge implementations (bridges/source/cpp_uno/*)", the bridge implementation at bridges/source/cpp_uno/gcc3_aix_powerpc is apparently dead and should thus be removed. However, that was the only bridge implementation for AIX, which implies that support for the AIX platform as a whole is dead and should thus be removed. Change-Id: I96de3f7f97d4fd770ff78256f0ea435383688be9 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/146057 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2021-11-09 | external/clucene: Remove unnecessary uses of obsolete std::binary_function | Stephan Bergmann
...which has been removed from C++17. libc++ and libstdc++ still silently support it, but MSVC needs an explicit -D_HAS_AUTO_PTR_ETC, and this change is a prerequisite to drop that global define again from solenv/gbuild/platform/com_MSC_defs.mk (it had been added there with 61c88ae6945c241f5f2aeb844eeca0776b487132 "gbuild: always compile as C++17 with MSVC 2017", but code including external/clucene, like helpcompiler/source/LuceneHelper.cxx, appears to be the only code relying on that global define) Change-Id: I512d56f833c516dba3874cb0b4ef5190a88d3faf Reviewed-on: https://gerrit.libreoffice.org/c/core/+/124900 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
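As an illustration of what dropping `std::binary_function` involves, here is a minimal sketch using a hypothetical comparator (`CompareCStr` is not a CLucene name). The base class only supplied three member typedefs, so a functor that needs them can declare them itself, and one that does not can simply drop the base.

```cpp
#include <cstring>

// Before (no longer valid once std::binary_function is gone):
//   struct CompareCStr : std::binary_function<const char*, const char*, bool>
// After: no base class; the typedefs are spelled out only in case some
// caller still relies on them.
struct CompareCStr {
    typedef const char* first_argument_type;
    typedef const char* second_argument_type;
    typedef bool result_type;

    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};
```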
2021-08-19 | external/clucene: Avoid std::string(nullptr) construction | Stephan Bergmann
The relevant constructor is defined as deleted since incorporating <http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2166r1.html> "A Proposal to Prohibit std::basic_string and std::basic_string_view construction from nullptr" into the upcoming C++23, and has caused undefined behavior in prior versions (see the referenced document for details). That caused
> workdir/UnpackedTarball/clucene/src/core/CLucene/index/SegmentInfos.cpp:361:13: error: conversion function from 'long' to 'std::string' (aka 'basic_string<char, char_traits<char>, allocator<char>>') invokes a deleted function
> return NULL;
> ^~~~
> ~/llvm/inst/lib/clang/14.0.0/include/stddef.h:84:18: note: expanded from macro 'NULL'
> # define NULL __null
> ^~~~~~
> ~/llvm/inst/bin/../include/c++/v1/string:849:5: note: 'basic_string' has been explicitly marked deleted here
> basic_string(nullptr_t) = delete;
> ^
at least when building --with-latest-c++ against recent libc++ 14 trunk (on macOS).
(There might be a chance that the CLucene code naively relied on SegmentInfo::getDelFileName actually returning a std::string for which c_str() would return null at least at some of the call sites, which I did not inspect in detail. However, this would unlikely have worked in the past anyway, as it is undefined behavior and at least contemporary libstdc++ throws a std::logic_error when constructing a std::string from null, and at least a full `make check` with this fix applied built fine for me.)
Change-Id: I2b8cf96b089848d666ec37aa7ee0deacc4798d35 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/120745 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
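A minimal sketch of the pattern, assuming a hypothetical free function `delFileName` (the real SegmentInfo::getDelFileName has a different signature and logic): a function returning `std::string` by value signals "no file" with an empty string rather than `return NULL;`.

```cpp
#include <string>

// Hypothetical illustration only; parameters and naming scheme are assumed.
std::string delFileName(bool hasDeletions, const std::string& segment) {
    if (!hasDeletions) {
        // "return NULL;" would try to construct a std::string from a null
        // pointer: undefined behavior before C++23, a deleted constructor since.
        return std::string();
    }
    return segment + ".del";
}
```

Callers then test for "no file" with `empty()` rather than comparing `c_str()` against null.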
2021-01-06 | external/clucene: Fix MSVC /Zc:strictStrings | Stephan Bergmann
...which is apparently enabled at least in MSVC 2019 16.8.3 when building with --with-latest-c++ (i.e., /std:c++latest):
> C:/lo/core/workdir/UnpackedTarball/clucene/src/contribs-lib/CLucene/analysis/PorterStemmer.cpp(124): error C2664: 'bool lucene::analysis::PorterStemmer::ends(TCHAR *)': cannot convert argument 1 from 'const wchar_t [5]' to 'TCHAR *'
> C:/lo/core/workdir/UnpackedTarball/clucene/src/contribs-lib/CLucene/analysis/PorterStemmer.cpp(124): note: Conversion from string literal loses const qualifier (see /Zc:strictStrings)
> C:/lo/core/workdir/UnpackedTarball/clucene/src/contribs-lib/CLucene/analysis/PorterStemmer.cpp(97): note: see declaration of 'lucene::analysis::PorterStemmer::ends'
etc. (and which is not silenced by gb_Library_set_warnings_disabled in external/clucene/Library_clucene.mk, unlike the corresponding Clang/GCC -Wwrite-strings)
Change-Id: Id3c8eefa4658bf942de6c8ae9b219212eba79995 Reviewed-on: https://gerrit.libreoffice.org/c/core/+/108840 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
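The error arises because a wide string literal has type `const wchar_t[N]` and cannot bind to a non-const `TCHAR *` parameter. A minimal sketch of the const-correct shape, using a hypothetical free function `endsWith` rather than the PorterStemmer::ends member named in the diagnostic:

```cpp
#include <cwchar>

// Taking the suffix as "const wchar_t*" lets callers pass string literals
// without dropping the const qualifier.
bool endsWith(const wchar_t* word, const wchar_t* suffix) {
    const std::size_t wordLen = std::wcslen(word);
    const std::size_t suffixLen = std::wcslen(suffix);
    return suffixLen <= wordLen
        && std::wcscmp(word + (wordLen - suffixLen), suffix) == 0;
}
```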
2020-06-17 | external/clucene: Adapt to C++20 CWG2237 | Stephan Bergmann
...<http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#2237> "Can a template-id name a constructor?", as implemented by GCC 11 trunk since <https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4b38d56dbac6742b038551a36ec80200313123a1> "c++: C++20 DR 2237, disallow simple-template-id in cdtor." Change-Id: I507fc5bde20fdf09b4e31a3db8a7554a473f1a9f Reviewed-on: https://gerrit.libreoffice.org/c/core/+/96549 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
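For readers unfamiliar with CWG2237, a minimal sketch with a hypothetical class template `Holder` (not a CLucene name): inside a class template, constructors and destructors are declared with the plain class name, not with a template-id such as `Holder<T>`.

```cpp
template <typename T>
struct Holder {
    T value;

    // Before, rejected under C++20 CWG2237:
    //   Holder<T>(T v) : value(v) {}
    //   ~Holder<T>() {}
    // After: use the injected-class-name without template arguments.
    explicit Holder(T v) : value(v) {}
    ~Holder() {}
};
```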
2020-04-23 | external/clucene: Avoid heap-buffer-overflow | Stephan Bergmann
...as seen during a --with-lang=ALL build with ASan on Linux:
> [XHC] nlpsolver ja
> =================================================================
> ==51396==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x62100000ed00 at pc 0x7fe425640f53 bp 0x7ffd6a0cc900 sp 0x7ffd6a0cc8f8
> READ of size 4 at 0x62100000ed00 thread T0
> #0 in lucene::analysis::cjk::CJKTokenizer::next(lucene::analysis::Token*) at workdir/UnpackedTarball/clucene/src/contribs-lib/CLucene/analysis/cjk/CJKAnalyzer.cpp:70:19
> #1 in lucene::index::DocumentsWriter::ThreadState::FieldData::invertField(lucene::document::Field*, lucene::analysis::Analyzer*, int) at workdir/UnpackedTarball/clucene/src/core/CLucene/index/DocumentsWriterThreadState.cpp:901:32
> #2 in lucene::index::DocumentsWriter::ThreadState::FieldData::processField(lucene::analysis::Analyzer*) at workdir/UnpackedTarball/clucene/src/core/CLucene/index/DocumentsWriterThreadState.cpp:798:9
> #3 in lucene::index::DocumentsWriter::ThreadState::processDocument(lucene::analysis::Analyzer*) at workdir/UnpackedTarball/clucene/src/core/CLucene/index/DocumentsWriterThreadState.cpp:557:24
> #4 in lucene::index::DocumentsWriter::updateDocument(lucene::document::Document*, lucene::analysis::Analyzer*, lucene::index::Term*) at workdir/UnpackedTarball/clucene/src/core/CLucene/index/DocumentsWriter.cpp:946:16
> #5 in lucene::index::DocumentsWriter::addDocument(lucene::document::Document*, lucene::analysis::Analyzer*) at workdir/UnpackedTarball/clucene/src/core/CLucene/index/DocumentsWriter.cpp:930:10
> #6 in lucene::index::IndexWriter::addDocument(lucene::document::Document*, lucene::analysis::Analyzer*) at workdir/UnpackedTarball/clucene/src/core/CLucene/index/IndexWriter.cpp:681:28
> #7 in HelpIndexer::indexDocuments() at helpcompiler/source/HelpIndexer.cxx:66:20
> #8 in main at helpcompiler/source/HelpIndexer_main.cxx:79:22
> 0x62100000ed00 is located 0 bytes to the right of 4096-byte region [0x62100000dd00,0x62100000ed00)
> allocated by thread T0 here:
> #0 in realloc at /data/sbergman/github.com/llvm/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3
> #1 in lucene::util::StreamBuffer<wchar_t>::setSize(int) at workdir/UnpackedTarball/clucene/src/core/CLucene/util/_streambuffer.h:114:17
> #2 in lucene::util::StreamBuffer<wchar_t>::makeSpace(int) at workdir/UnpackedTarball/clucene/src/core/CLucene/util/_streambuffer.h:150:5
> #3 in lucene::util::BufferedStreamImpl<wchar_t>::setMinBufSize(int) at workdir/UnpackedTarball/clucene/src/core/CLucene/util/_bufferedstream.h:69:16
> #4 in lucene::util::SimpleInputStreamReader::Internal::JStreamsBuffer::JStreamsBuffer(lucene::util::CLStream<signed char>*, int) at workdir/UnpackedTarball/clucene/src/core/CLucene/util/Reader.cpp:375:6
Note that this is not a proper fix, which would need to properly detect surrogate pairs split across buffer boundaries. But for one the comment says "however, gunichartables doesn't seem to classify any of the surrogates as alpha, so they are skipped anyway", and for another the behavior until now was to replace the high surrogate with something that was likely garbage and leave the low surrogate at the start of the next buffer (if any) alone, so leaving both surrogates alone is likely at least no worse behavior. Change-Id: Ib6f6f1bc20ef8efe0418bf2e715783c8555068de Reviewed-on: https://gerrit.libreoffice.org/c/core/+/92792 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
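The underlying hazard is reading the unit after a high surrogate when the high surrogate is the last unit in the buffer. A minimal sketch of a bounds-checked test, using hypothetical helper names and `char16_t` units (CLucene's CJKTokenizer is structured differently and this is not the patch itself):

```cpp
#include <cstddef>

// Hypothetical helpers for UTF-16 surrogate ranges.
inline bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isLowSurrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// True only if a complete surrogate pair starts at buf[pos]. The bounds
// check is evaluated first, so buf[pos + 1] is never read past the end.
inline bool completePairAt(const char16_t* buf, std::size_t len, std::size_t pos) {
    return pos + 1 < len
        && isHighSurrogate(buf[pos])
        && isLowSurrogate(buf[pos + 1]);
}
```

A high surrogate in the final slot is reported as "no complete pair", which matches the commit's choice to leave both halves alone rather than combine them.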
2019-12-03 | external/clucene: Adapt to C++20 deleted ostream << for non-plain char types | Stephan Bergmann
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1423r3.html> "char8_t backward compatibility remediation", as implemented now by <https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0c5b35933e5b150df0ab487efb2f11ef5685f713> "libstdc++: P1423R3 char8_t remediation (2/4)" for -std=c++2a, deletes operator << overloads that would print a pointer rather than a (presumably expected) string. So this infoStream output appears to have always been broken (the strings use TCHAR, which appears to unconditionally be a typedef for wchar_t, see workdir/UnpackedTarball/clucene/src/shared/CLucene/clucene-config.h), and appears to be just of informative nature, so just simplify it to not try to print any problematic parts. Change-Id: Ie9f8edb03aff461a15718a0c025af57004aba0a9 Reviewed-on: https://gerrit.libreoffice.org/84320 Tested-by: Jenkins Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
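A minimal sketch of the simplification described, with a hypothetical function name and message text (the actual infoStream output in CLucene differs): the wide-string argument is no longer streamed into a narrow `std::ostream`, and only the parts that print correctly are kept.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for an informational log message.
std::string describeFlush(const wchar_t* segmentName, int docCount) {
    std::ostringstream out;
    // out << segmentName;  // printed a pointer value before C++20;
    //                      // the overload is deleted since P1423R3
    (void)segmentName;      // deliberately not printed
    out << "flushed segment with " << docCount << " docs";
    return out.str();
}
```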
2016-01-21 | Fix for Jenkins Gerrit Mac builds | Stephan Bergmann
...which choke on #pragma GCC diagnostic ignored "-Wpragmas" Change-Id: I40100b43078320b79cb9e3d4e3fb369db0bed9fe
2016-01-19 | Silence -Werror,-Wunknown-pragmas | Stephan Bergmann
Change-Id: If726008f6755db59b01784ad6b479bbfe2d23e96
2016-01-19 | external/clucene: Silence -Werror=misleading-indentation (GCC 6) | Stephan Bergmann
Change-Id: I9a067605f7c477f4e057338577a437cda7f2aa3d
2015-12-02 | external/clucene: Use warning-suppression pragmas for clang-cl, too | Stephan Bergmann
Change-Id: I23da54974f39da5fccb619d6fa68eff38e70f5a5
2015-11-18 | No more need to include config_global.h | Stephan Bergmann
...after 3b59dbbffdb73e48f9e2398bb1eecc24e3d95e13 "remove HAVE_GCC_PRAGMA_DIAGNOSTIC_SCOPE check and macro" Change-Id: I0e9f3c15d48affe104dd6b5df9828ef5e62dfa88
2015-07-28 | Fix clucene on MSVC 14.0 | David Ostrovsky
Change-Id: I225d9c5eb1d9c9851b3f64f7c654cfede6297933 Reviewed-on: https://gerrit.libreoffice.org/17339 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Michael Stahl <mstahl@redhat.com>
2014-10-02 | remove HAVE_GCC_PRAGMA_DIAGNOSTIC_SCOPE check and macro | Michael Stahl
This is supported in GCC 4.6.0 already: https://gcc.gnu.org/onlinedocs/gcc-4.6.0/gcc/Diagnostic-Pragmas.html Change-Id: I2f67e588eea3a323a2e9c81e39e56ab2e715a817
2014-05-21 | external/clucene: Avoid InitOrderFiasco | Stephan Bergmann
...as reported by AddressSanitizer, where src/core/CLucene/index/IndexWriter.cpp initializes IndexWriter::MAX_TERM_LENGTH with the value of DocumentsWriter::MAX_TERM_LENGTH before the latter is initialized in src/core/CLucene/index/DocumentsWriter.cpp. But turns out that IndexWriter::MAX_TERM_LENGTH is completely unused. Change-Id: Ica01186584ec05a989a13dc58823f4751e8724e2
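The commit resolves the problem by dealing with the unused constant. For cases where such a cross-file dependency is actually needed, a commonly used alternative is construct-on-first-use, sketched below with hypothetical class names and an illustrative value; this is a different technique from the one the commit applies.

```cpp
// Hypothetical stand-in for the class that owns the limit.
struct DocumentsWriterLimits {
    // A function-local static is initialized on first call, so the value is
    // ready regardless of the order in which translation units are set up.
    static int maxTermLength() {
        static const int value = 16383;  // illustrative value
        return value;
    }
};

// Hypothetical stand-in for the class that mirrors the limit.
struct IndexWriterLimits {
    static int maxTermLength() {
        return DocumentsWriterLimits::maxTermLength();
    }
};
```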
2014-05-08 | CLucene: Helgrind reported "pthread_mutex_destroy of a locked mutex" | Stephan Bergmann
> pthread_mutex_destroy (/usr/src/debug/valgrind-3.9.0/helgrind/hg_intercepts.c:478)
> lucene::util::mutex_thread::~mutex_thread() (workdir/UnpackedTarball/clucene/src/shared/CLucene/config/threads.cpp:179)
> lucene::store::FSDirectory::FSIndexInput::close() (workdir/UnpackedTarball/clucene/src/core/CLucene/store/FSDirectory.cpp:225)
> lucene::index::SegmentInfos::read(lucene::store::Directory*, char const*) (workdir/UnpackedTarball/clucene/src/core/CLucene/index/SegmentInfos.cpp:770)
> lucene::index::IndexFileDeleter::IndexFileDeleter(lucene::store::Directory*, lucene::index::IndexDeletionPolicy*, lucene::index::SegmentInfos*, std::ostream*, lucene::index::DocumentsWriter*) (workdir/UnpackedTarball/clucene/src/core/CLucene/index/IndexFileDeleter.cpp:149)
> lucene::index::IndexWriter::init(lucene::store::Directory*, lucene::analysis::Analyzer*, bool, bool, lucene::index::IndexDeletionPolicy*, bool) (workdir/UnpackedTarball/clucene/src/core/CLucene/index/IndexWriter.cpp:262)
> lucene::index::IndexWriter::IndexWriter(char const*, lucene::analysis::Analyzer*, bool) (workdir/UnpackedTarball/clucene/src/core/CLucene/index/IndexWriter.cpp:158)
> HelpIndexer::indexDocuments() (helpcompiler/source/HelpIndexer.cxx:55)
Change-Id: I19cb9bd49b339d206a624c1f1d3dacdd909f4e25
2014-04-11 | CLucene: Some trivial GCC -fsanitize=undefined fixes | Stephan Bergmann
Change-Id: I40132f735eabbead0a1f16d44dbd8878b03902ce
2013-11-14 | external/clucene: -Werror,-Wunused-parameter | Stephan Bergmann
Change-Id: Iedb2d7c62f6498691bffd0beb529e479d62d004e
2013-11-14 | clucene: stop using #pragma GCC system_header | Michael Stahl
... it breaks dependency generation. Change-Id: I992e47ecea697617820358f711b7a6408fdabbe3
2013-10-17 | fdo#70393: move clucene to a subdir of external | Khaled Hosny
Change-Id: Ia9b7b18526119e29e21eb315d84d099861e15ea0 Reviewed-on: https://gerrit.libreoffice.org/6285 Reviewed-by: Björn Michaelsen <bjoern.michaelsen@canonical.com> Tested-by: Björn Michaelsen <bjoern.michaelsen@canonical.com>