Age | Commit message | Author |
|
It's been a source of numerous problems since the beginning.
Poor separation of C++ code causing the compiler to emit some generic
code as CPU-specific, compiler optimizations moving CPU-specific
code out of #ifdef to unguarded static initialization, etc.
And it doesn't even seem to improve performance much: on my
Ryzen 2500U, sumArray() over one full column (1M cells) takes
about 1.6 ms with AVX, 1.9 ms with SSE2 and 4.6 ms with generic code.
So SSE2 code is perhaps worth it, especially given that SSE2 is our
baseline requirement on x86_64 everywhere and x86 on Windows,
but AVX+ is nowhere near worth the trouble.
So this commit removes all AVX+ code from Calc, and makes SSE2
a hardcoded option wherever it's guaranteed. If we raise the baseline
to AVX, the SSE2 code may be replaced by the code removed by this
commit. Generic code remains for other platforms; if other platforms
add CPU-specific code, they should preferably follow the same rules.
This does not necessarily mean that CPU-specific code cannot
be used at all. Some externals use it, for example. It just
needs to be working, maintained, and worth the trouble.
Change-Id: I5ab919930df9d0223db68a94bf84947984d313ac
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/129733
Tested-by: Jenkins
Reviewed-by: Eike Rathke <erack@redhat.com>
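A minimal sketch of the "hardcoded SSE2 where guaranteed, generic
elsewhere" layout described above. It is not the Calc implementation:
sumArraySketch() and SKETCH_SSE2 are hypothetical names, and a plain
sum stands in for whatever summation the real sumArray() performs.
The choice is made entirely at compile time, so no runtime dispatch
is involved.

```cpp
#include <cstddef>

// Assumption: SSE2 counts as "guaranteed" when the compiler itself targets it
// (x86_64, or 32-bit MSVC builds with /arch:SSE2 or higher).
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#define SKETCH_SSE2 1 // hypothetical macro, not a LibreOffice one
#include <emmintrin.h>
#endif

// Hypothetical stand-in for sumArray(): sums n doubles starting at p.
double sumArraySketch(const double* p, std::size_t n)
{
    double sum = 0.0;
    std::size_t i = 0;
#ifdef SKETCH_SSE2
    // SSE2 path: accumulate two doubles per iteration.
    __m128d acc = _mm_setzero_pd();
    for (; i + 2 <= n; i += 2)
        acc = _mm_add_pd(acc, _mm_loadu_pd(p + i));
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    sum = lanes[0] + lanes[1];
#endif
    // Generic path; also handles the leftover element after the SSE2 loop.
    for (; i < n; ++i)
        sum += p[i];
    return sum;
}
```

Because the SSE2 branch is selected by the preprocessor, no translation
unit needs to be built with flags beyond the platform baseline.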
|
|
Otherwise a copy compiled with CPU-specific instructions
might be chosen as the copy to keep and would then be
used by generic code. See history for the Calc Kahan code.
Change-Id: Ifc1bbd8d9720d9effe05b8ff8ee5e804363939df
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/124257
Tested-by: Jenkins
Reviewed-by: Luboš Luňák <l.lunak@collabora.com>
|
|
Change-Id: I34781d98f614c1d5df97460fc2e7b59be3bb6512
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/121090
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Jenkins
|
|
This part focuses on allowing it to replace arrayfunctor.
By default it will try AVX512F (1.17%).
If that is not available, it will use AVX (94.77%).
Use of AVX2 (82.28%) has been avoided even though the code could have been more compact.
Source of hardware statistics: https://store.steampowered.com/hwsurvey
Change-Id: Iae737a565379e82c5f84f3fdee6321ac74f59d40
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/115675
Tested-by: Jenkins
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
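The fallback order described in this message (AVX512F first, then AVX,
with AVX2 deliberately not used as a tier) can be sketched as a small
selection function. All names here are hypothetical, and the feature
flags are passed in as plain booleans rather than read from the CPU, so
this shows only the ordering, not the original dispatch code.

```cpp
// Hypothetical code-path identifiers; AVX2 is intentionally absent,
// mirroring the message above.
enum class SketchPath
{
    AVX512F,
    AVX,
    Generic
};

// Picks the widest supported path. The two flags are assumed to come
// from some runtime CPU detection that is not shown here.
SketchPath chooseSumPath(bool hasAVX512F, bool hasAVX)
{
    if (hasAVX512F)
        return SketchPath::AVX512F;
    if (hasAVX)
        return SketchPath::AVX;
    return SketchPath::Generic;
}
```

A caller would evaluate chooseSumPath() once and then branch to the
matching summation routine.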
|
|
Change-Id: Id28874549342349fb2727c3cb8e92da1dcdb727c
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/87513
Reviewed-by: Muhammet Kara <muhammet.kara@collabora.com>
Tested-by: Muhammet Kara <muhammet.kara@collabora.com>
|
|
Adds CPU intrinsics detection in configure pass for compile time
detection and "cpuid" runtime detection of which CPU instruction
sets are available on the user device.
Change-Id: I0ee4d0b22a7c51f72796d43e7383a31d03b437ad
Reviewed-on: https://gerrit.libreoffice.org/75175
Tested-by: Jenkins
Reviewed-by: Tomaž Vajngerl <quikee@gmail.com>
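For illustration, runtime detection through "cpuid" might look roughly
like the sketch below. It is not the tools::cpuid implementation:
SketchLeaf1, queryLeaf1(), edxHasSSE2() and sketchHasSSE2() are
hypothetical names. It relies on the documented layout of CPUID leaf 1,
where bit 26 of EDX reports SSE2, and returns false on non-x86 targets.

```cpp
// Pick the compiler-specific way to execute CPUID on x86 targets.
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define SKETCH_X86_MSVC 1
#elif (defined(__GNUC__) || defined(__clang__)) \
    && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define SKETCH_X86_GNU 1
#endif

// Registers returned by CPUID leaf 1; valid is false when unavailable.
struct SketchLeaf1
{
    unsigned eax, ebx, ecx, edx;
    bool valid;
};

SketchLeaf1 queryLeaf1()
{
    SketchLeaf1 r{ 0, 0, 0, 0, false };
#if defined(SKETCH_X86_MSVC)
    int regs[4];
    __cpuid(regs, 1);
    r.eax = static_cast<unsigned>(regs[0]);
    r.ebx = static_cast<unsigned>(regs[1]);
    r.ecx = static_cast<unsigned>(regs[2]);
    r.edx = static_cast<unsigned>(regs[3]);
    r.valid = true;
#elif defined(SKETCH_X86_GNU)
    // __get_cpuid returns 0 when the requested leaf is not supported.
    r.valid = __get_cpuid(1, &r.eax, &r.ebx, &r.ecx, &r.edx) != 0;
#endif
    return r;
}

// SSE2 is reported in bit 26 of EDX for CPUID leaf 1.
bool edxHasSSE2(unsigned edx) { return (edx & (1u << 26)) != 0; }

// Runtime check: false on non-x86 builds or if CPUID leaf 1 is unavailable.
bool sketchHasSSE2()
{
    const SketchLeaf1 r = queryLeaf1();
    return r.valid && edxHasSSE2(r.edx);
}
```

The compile-time half of the detection (whether the compiler can emit
the instructions at all) is a separate question that this sketch does
not cover.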
|
|
added hasHyperThreading() function to tools::cpuid
to detect hyperthreading.
Change-Id: I13fab4b6c649e681c329b7e3f4c9f36bda879d84
|
|
To run on ancient CPUs, we compile for Windows with -arch:SSE since
8bd6bf93b7711a7ac7c5cbd7c3bb980481570ebd (August 2014). Thus
_M_IX86_FP gets defined as 1. This meant that LO_SSE2_AVAILABLE did
not get defined, and that we hardcoded tools::cpuid::hasSSE2() as
always returning false. That was hardly the intent.
Change-Id: I7ee34510a774dab865c8990b74b91a5284218a96
|
|
Change-Id: Iee60389ccc9e348db6ed00e48e32b1e86f17b530
|
|
Prereq. to enable runtime SSE2 detection is that the compiler
supports it in the first place. MSVS and GCC use different
compiler flags for this so use __LO_SSE2_AVAILABLE__ to make this
build platform independent.
emmintrin.h is unavailable on ARM Android so include this and
compile the SSE2 specific code only when we are sure we can build
SSE2 code (__LO_SSE2_AVAILABLE__ is defined).
Change-Id: I212c4e0b99a314d087b9def822a81325b25f3469
|
|
For corner-case CPUs out there that support SSE but not SSE2 it
makes more sense to use the "fallback" code path instead of
writing an SSE-only version. For this reason detecting SSE is not
relevant anymore, so it is removed.
Change-Id: I3f1425af2cb5cdf9fba699e2996014598a15b5c1
|
|
Change-Id: I29330061e2986ec2ae899c2f3a63d0eadd9cc194
|