History log of /freebsd-10-stable/libexec/rtld-elf/amd64/Makefile.inc
Revision Date Author Comments
# 290014 26-Oct-2015 vangyzen

Disable SSE in libthr

Clang emits SSE instructions on amd64 in the common path of
pthread_mutex_unlock. If the thread does not otherwise use SSE,
this usage incurs a context-switch of the FPU/SSE state, which
reduces the performance of multiple real-world applications by a
non-trivial amount (3-5% in one application).

Instead of this change, I experimented with eagerly switching the
FPU state at context-switch time. This did not help. Most of the
cost seems to be in the read/write of memory--as kib@ stated--and
not in the #NM handling. I tested on machines with and without
XSAVEOPT.

One counter-argument to this change is that most applications already
use SIMD, and the number of applications and amount of SIMD usage
are only increasing. This is absolutely true. I agree that--in
general and in principle--this change is in the wrong direction.
However, there are applications that do not use enough SSE to offset
the extra context-switch cost. SSE does not provide a clear benefit
in the current libthr code with the current compiler, but it does
provide a clear loss in some cases. Therefore, disabling SSE in
libthr is a non-loss for most, and a gain for some.

I refrained from disabling SSE in libc--as was suggested--because
I can't make the above argument for libc. It provides a wide variety
of code; each case should be analyzed separately.

https://lists.freebsd.org/pipermail/freebsd-current/2015-March/055193.html

Suggestions from: dim, jmg, rpaulo
Sponsored by: Dell Inc.


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 217026 05-Jan-2011 dim

Sort -mno-(mmx|3dnow|sse|sse2|sse3) options consistently throughout the
tree.

Submitted by: arundel


# 216977 04-Jan-2011 dim

On amd64 and i386, tell the compiler to refrain from generating SSE,
3DNow, MMX and floating point instructions in rtld-elf.

Otherwise, _rtld_bind() (and whatever it calls) could possibly clobber
function arguments that are passed in SSE/3DNow/MMX/FP registers,
usually floating point values. This can happen, for example, when clang
generates SSE code for memset() or memcpy() calls.

One symptom of this is sshd dying early on amd64 with "PRNG not seeded",
which is ultimately caused by libcrypto.so.6 calling RAND_add() with a
double parameter. That parameter is passed via %xmm0, which gets wiped
out by an SSE memset() in _rtld_bind().

Reviewed by: kib, kan
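
For illustration, a change of this kind in an architecture-specific Makefile.inc might look like the fragment below. This is a sketch, not the committed diff: the flags are real GCC/Clang x86 options, but the exact list and order used in the tree at this revision are an assumption.

```make
# Sketch only: keep the compiler from emitting MMX, 3DNow, SSE and
# hardware floating point instructions in the dynamic linker, so that
# _rtld_bind() cannot clobber arguments the caller placed in registers
# such as %xmm0.
# Assumption: the flag list and order in the real file may differ.
CFLAGS+=	-mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float
```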


# 216975 04-Jan-2011 dim

Remove '-elf' from build flags for libexec/rtld-elf for amd64 and i386.
ELF has been the default format for almost 12 years now.


# 211725 23-Aug-2010 imp

MFtbemd:

Prefer MACHINE_CPUARCH to MACHINE_ARCH in most contexts where you want
to test whether all the CPUs of a given family conform.
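
A minimal sketch of the distinction, assuming a mips build where MACHINE_ARCH may be a variant such as mipsel or mipseb while MACHINE_CPUARCH is mips for the whole family. The EXAMPLE_MIPS define is hypothetical and exists only to give the conditional a body.

```make
# Matches only one variant of the family.
.if ${MACHINE_ARCH} == "mipsel"
CFLAGS+=	-DEXAMPLE_MIPS	# hypothetical define
.endif

# Matches every member of the family, which is usually what is wanted.
.if ${MACHINE_CPUARCH} == "mips"
CFLAGS+=	-DEXAMPLE_MIPS	# hypothetical define
.endif
```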


# 45501 08-Apr-1999 jdp

Eliminate all machine-dependent code from the main source body and
the Makefile, and move it down into the architecture-specific
subdirectories.

Eliminate an asm() statement for the i386.

Make the dynamic linker work if it is built as an executable instead
of as a shared library. See i386/Makefile.inc to find out how to
do it. Note, this change is not enabled and it might never be
enabled. But it might be useful in the future. Building the
dynamic linker as an executable should make it start up faster,
because it won't have any relocations. But in practice I suspect
the difference is negligible.