author | Hossein <hossein@libreoffice.org> | 2022-01-04 21:12:14 +0100 |
---|---|---|
committer | Adolfo Jayme Barrientos <fitojb@ubuntu.com> | 2022-01-09 13:22:00 +0100 |
commit | 139ad1049ca65b279fe5e0b085bf2af039b62e19 (patch) | |
tree | 7b798fcb54cf6903fedf2d0e85ed70bf295ef4e1 /sw/source | |
parent | 4f76a8329a8cc31fac1e4a9ee6f4658a6802b854 (diff) |
tdf#146084 Don't warn for languages without hyphenation
Upon opening a Writer document containing text in a language that does
not use hyphenation, an alert is shown with the text:
'Missing hyphenation data Please install the hyphenation package for
locale "ab_CD".'
in which 'ab_CD' is the locale.
This patch removes the warning for the following languages, which do
not use hyphenation:
* Arabic script languages (except Uighur)
+ Persian (Farsi)
+ Kashmiri
+ Kurdish (Central Kurdish and Southern Kurdish with Arabic script)
+ Punjabi
+ Sindhi
+ Malay
+ Somali
+ Swahili
+ Urdu
"Words are not hyphenated in Arabic language text, however hyphenation
is possible for Uighur text written in the Arabic script"
https://www.w3.org/International/i18n-tests/results/word-break-shaping
The list from the MS documents is lengthy, but some of the languages
were not available in LibreOffice, so they are omitted:
https://docs.microsoft.com/en-us/typography/script-development/arabic
Some languages, such as Hausa and Kanuri from Nigeria, use both Latin
and Arabic script, but only the Latin script was listed in the
LibreOffice languages, so they were also omitted.
* CJK languages
+ Japanese
+ Korean
+ Chinese
+ Yue Chinese
"CJK languages differ from European languages in that there are no
hyphenation rules"
https://tug.org/TUGboat/tb25-0/cho.pdf
* Vietnamese
"In Vietnamese all words consist of single syllables, so they are
often very short; hyphenation is not allowed at all."
https://tug.org/TUGboat/tb29-1/tb91thanh-vntex.pdf
Hyphenation has declined in Vietnamese orthography since 1975:
https://www.quora.com/When-did-hyphenation-decline-in-Vietnamese-orthography
The fix for Japanese (tdf#143422) was previously done in
53d5555f13371252874ec962dee4643168d26780, and that functionality is
preserved with the current patch.
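The language-based check described above can be pictured roughly as
follows. This is only a sketch under stated assumptions:
`usesHyphenationSketch` is a hypothetical name, it takes a language
subtag string rather than LibreOffice's LanguageType, it covers only a
few of the languages listed in this message, and it is not the actual
MsLangId::usesHyphenation implementation.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical helper for illustration only (not LibreOffice code).
// rLang is a language subtag such as "fa" or "ja"; the real patch works
// on LanguageType values instead.
bool usesHyphenationSketch(const std::string& rLang)
{
    assert(!rLang.empty()); // an empty tag is a caller error in this sketch

    // Partial, illustrative subset of the languages named in the commit
    // message; the real list is longer.
    static const std::unordered_set<std::string> aNoHyphenation = {
        "fa", "ur",              // Arabic-script languages (Persian, Urdu)
        "ja", "ko", "zh", "yue", // CJK languages
        "vi"                     // Vietnamese
    };

    // Anything not listed (including Uighur, "ug") is assumed to hyphenate.
    return aNoHyphenation.count(rLang) == 0;
}
```

A caller would skip the hyphenator lookup, and therefore the "Missing
hyphenation data" alert, whenever such a predicate returns false.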
An alternate approach would be adding all the Unicode scripts,
specifying the script for each language, and deciding based (mostly)
on the script and not (only) on the language.
More information about the hyphenation usage of many scripts can be
found in:
https://r12a.github.io/scripts/
This is the list of Unicode scripts:
https://unicode.org/standard/supported.html
https://en.wikipedia.org/wiki/Script_(Unicode)#List_of_scripts_in_Unicode
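The script-based alternative mentioned above might look roughly like
the sketch below. It is not part of this patch. The function names are
hypothetical, scripts are given as ISO 15924 codes, and the script set
and the two language-level exceptions are illustrative assumptions
taken from the lists in this message.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical: default decision per script (ISO 15924 code).
bool scriptUsesHyphenation(const std::string& rScript)
{
    // Illustrative subset of scripts treated as not hyphenating.
    static const std::unordered_set<std::string> aNoHyphenation = {
        "Arab", "Hani", "Hans", "Hant", "Jpan", "Kore"
    };
    return aNoHyphenation.count(rScript) == 0;
}

// Hypothetical: decide mostly by script, with per-language exceptions.
bool usesHyphenationByScript(const std::string& rLang,
                             const std::string& rScript)
{
    assert(!rLang.empty() && !rScript.empty());

    // Uighur hyphenates even though it is written in the Arabic script.
    if (rLang == "ug" && rScript == "Arab")
        return true;

    // Vietnamese does not hyphenate even though it uses the Latin script.
    if (rLang == "vi")
        return false;

    return scriptUsesHyphenation(rScript);
}
```

The exceptions are checked before the script default, which is why the
decision depends "mostly" on the script rather than on it alone.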
Change-Id: I7d2b4ee55a0893d1f0d1f9cd3b7cc037a49589b6
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/126435
Tested-by: Jenkins
Reviewed-by: Eike Rathke <erack@redhat.com>
(cherry picked from commit 151c56ed547490a99d912524c0e56b5d6d4a1939)
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/128082
Reviewed-by: Adolfo Jayme Barrientos <fitojb@ubuntu.com>
Diffstat (limited to 'sw/source')
-rw-r--r-- | sw/source/core/text/inftxt.cxx | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/sw/source/core/text/inftxt.cxx b/sw/source/core/text/inftxt.cxx
index 14eb8b13c11d..a5e588ebad0b 100644
--- a/sw/source/core/text/inftxt.cxx
+++ b/sw/source/core/text/inftxt.cxx
@@ -66,6 +66,7 @@
 #include <vcl/gdimtf.hxx>
 #include <vcl/virdev.hxx>
 #include <vcl/gradient.hxx>
+#include <i18nlangtag/mslangid.hxx>
 
 using namespace ::com::sun::star;
 using namespace ::com::sun::star::linguistic2;
@@ -1446,7 +1447,8 @@ bool SwTextFormatInfo::IsHyphenate() const
 
     LanguageType eTmp = GetFont()->GetLanguage();
     // TODO: check for more ideographic langs w/o hyphenation as a concept
-    if ( LANGUAGE_DONTKNOW == eTmp || LANGUAGE_NONE == eTmp || LANGUAGE_JAPANESE == eTmp )
+    if ( LANGUAGE_DONTKNOW == eTmp || LANGUAGE_NONE == eTmp
+        || !MsLangId::usesHyphenation(eTmp) )
         return false;
 
     uno::Reference< XHyphenator > xHyph = ::GetHyphenator();