author | Kevin Suo <suokunlong@126.com> | 2022-10-11 10:04:16 +0800 |
---|---|---|
committer | Thorsten Behrens <thorsten.behrens@allotropia.de> | 2022-10-13 21:38:12 +0200 |
commit | 69e9925ded584113e52f84ef0ed7c224079fa061 (patch) | |
tree | 1af5c8e9bedbceedd7b4ddefb67c37a79462aaa0 /i18npool | |
parent | 3442a46995f967dbd4c99797f7f13794912f0f58 (diff) |
sdext.pdfimport: resolves tdf#104597: RTL script text runs are reversed
For the simple Arabic string: ٱلسَّلَامُ عَلَيْك, the xpdfimport binary generates the
following (drawchar) sequences:
كَ
يْ
لَ
عَ
مُ
ا
لَ
سَّ
ل
ٱ
(i.e., in reversed order, one character at a time).
Before this patch, after pdfimport the text shows up as لَسَّلٱ كَيْلَعَ مُا, which is reversed.
These characters were supposed to be combined into text frames in
DrawXmlOptimizer::optimizeTextElements(Element& rParent) (sdext/source/pdfimport/\
tree/drawtreevisiting.cxx:677), but the combining did not actually succeed there.
The single characters were then passed to sdext/source/pdfimport/tree/drawtreevisiting\
.cxx:105, one by one, in the hope that the strings could be mirrored. The mirroring
failed, because a single character always equals itself after mirroring.
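
A minimal sketch of why the order of operations matters, assuming plain
std::u32string chunks as a stand-in for the importer's text elements. The
helper names (reverseRun, mirrorEachChunk, combineThenMirror) are hypothetical
and are not part of sdext:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: reverse one string code point by code point.
std::u32string reverseRun(std::u32string s)
{
    std::reverse(s.begin(), s.end());
    return s;
}

// Mirror every chunk on its own, then concatenate. With one-character
// chunks each reversal is a no-op, so the visual order is kept as-is.
std::u32string mirrorEachChunk(const std::vector<std::u32string>& chunks)
{
    std::u32string out;
    for (const auto& chunk : chunks)
        out += reverseRun(chunk);
    return out;
}

// Concatenate the chunks into one run first, then mirror the whole run.
std::u32string combineThenMirror(const std::vector<std::u32string>& chunks)
{
    std::u32string run;
    for (const auto& chunk : chunks)
        run += chunk;
    return reverseRun(run);
}
```

For the chunks "c", "b", "a" (emitted in reversed order), mirrorEachChunk
yields "cba" while combineThenMirror yields "abc".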
The DrawXmlOptimizer::optimizeTextElements failed to combine the characters into
one text frame, because the condition:
(rCurGC.Transformation == rNextGC.Transformation || notTransformed(rNextGC))
would never be true, because consecutive characters differ at least in their horizontal
position. Further analysis indicates that we do not need to check the transformation
here at all: this is an optimizer for a TextElement, and a transformed character would
already be in a different draw element (and thus would never be combined with this
TextElement).
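
A simplified sketch of the effect of that check, under stated assumptions:
Transform, GlyphState, canMergeOld and canMergeNew are hypothetical stand-ins,
not the real sdext types, and the real condition compares more attributes than
the font id used here.

```cpp
#include <array>

// Hypothetical stand-in for a 2D affine matrix: a, b, c, d, tx, ty.
struct Transform
{
    std::array<double, 6> m;
    bool operator==(const Transform& other) const { return m == other.m; }
};

// Hypothetical stand-in for the per-character graphics context.
struct GlyphState
{
    int fontId;
    Transform transformation;
};

// True only for the identity matrix (assumed meaning of notTransformed).
bool notTransformed(const GlyphState& gc)
{
    return gc.transformation == Transform{{1.0, 0.0, 0.0, 1.0, 0.0, 0.0}};
}

// Old-style check: the matrices must match, or the next one must be identity.
bool canMergeOld(const GlyphState& cur, const GlyphState& next)
{
    return cur.fontId == next.fontId
        && (cur.transformation == next.transformation || notTransformed(next));
}

// Check with the transformation comparison dropped.
bool canMergeNew(const GlyphState& cur, const GlyphState& next)
{
    return cur.fontId == next.fontId;
}
```

Two characters in the same font whose matrices differ only in the horizontal
offset are rejected by canMergeOld and accepted by canMergeNew.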
With DrawXmlOptimizer::optimizeTextElements fixed so that it now successfully
combines the characters, another issue shows up in the old PDFIProcessor::mirrorString
function. It seems to mirror the characters within each word, but if a string contains
two words, the words themselves are not swapped (e.g. for the string "abc def"
it produces "cba fed" rather than "fed cba"). That is not suitable for RTL text,
which requires all the characters to be reversed. Fix this by using
comphelper::string::reverseString.
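
The difference between the two behaviours can be sketched as follows. This is
an illustration only: mirrorPerWord is a hypothetical reconstruction of the
behaviour described above, not the old mirrorString code, and reverseWhole
uses std::reverse on a std::u32string as a stand-in for
comphelper::string::reverseString, which operates on OUString.

```cpp
#include <algorithm>
#include <string>

// Hypothetical: reverse each space-separated word in place, keeping the
// order of the words themselves.
std::u32string mirrorPerWord(std::u32string s)
{
    auto wordStart = s.begin();
    for (auto it = s.begin();; ++it)
    {
        if (it == s.end() || *it == U' ')
        {
            std::reverse(wordStart, it);
            if (it == s.end())
                break;
            wordStart = it + 1;
        }
    }
    return s;
}

// Stand-in for a full-string reversal: every character changes position.
std::u32string reverseWhole(std::u32string s)
{
    std::reverse(s.begin(), s.end());
    return s;
}
```

For "abc def", mirrorPerWord gives "cba fed" and reverseWhole gives "fed cba".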
Change-Id: Ifa210836f1d6666dd56205ff0d243adfb4114794
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/141231
Tested-by: Jenkins
Reviewed-by: Thorsten Behrens <thorsten.behrens@allotropia.de>
Diffstat (limited to 'i18npool')
0 files changed, 0 insertions, 0 deletions