History log of /freebsd-10-stable/lib/libthr/arch/amd64/Makefile.inc
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 290014 26-Oct-2015 vangyzen

Disable SSE in libthr

Clang emits SSE instructions on amd64 in the common path of
pthread_mutex_unlock. If the thread does not otherwise use SSE,
this usage incurs a context-switch of the FPU/SSE state, which
reduces the performance of multiple real-world applications by a
non-trivial amount (3-5% in one application).

Instead of this change, I experimented with eagerly switching the
FPU state at context-switch time. This did not help. Most of the
cost seems to be in the read/write of memory--as kib@ stated--and
not in the #NM handling. I tested on machines with and without
XSAVEOPT.

One counter-argument to this change is that most applications already
use SIMD, and the number of applications and amount of SIMD usage
are only increasing. This is absolutely true. I agree that--in
general and in principle--this change is in the wrong direction.
However, there are applications that do not use enough SSE to offset
the extra context-switch cost. SSE does not provide a clear benefit
in the current libthr code with the current compiler, but it does
provide a clear loss in some cases. Therefore, disabling SSE in
libthr is a non-loss for most, and a gain for some.

I refrained from disabling SSE in libc--as was suggested--because
I can't make the above argument for libc. It provides a wide variety
of code; each case should be analyzed separately.

https://lists.freebsd.org/pipermail/freebsd-current/2015-March/055193.html

Suggestions from: dim, jmg, rpaulo
Sponsored by: Dell Inc.


# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 212516 12-Sep-2010 imp

Merge from tbemd, with a small amount of rework:
For all libthr contexts, use ${MACHINE_CPUARCH}
for all libc contexts, use ${MACHINE_ARCH} if it exists, otherwise use
${MACHINE_CPUARCH}
Move some common code up a layer (the .PATH statement was the same in
all the arch submakefiles).

# Hope she hasn't busted powerpc64 with this...


# 177853 02-Apr-2008 davidxu

Replace function _umtx_op with _umtx_op_err, the later function directly
returns errno, because errno can be mucked by user's signal handler and
most of pthread api heavily depends on errno to be correct, this change
should improve stability of the thread library.


# 176226 13-Feb-2008 obrien

style.Makefile(5)


# 144518 01-Apr-2005 davidxu

Import my recent 1:1 threading working. some features improved includes:
1. fast simple type mutex.
2. __thread tls works.
3. asynchronous cancellation works ( using signal ).
4. thread synchronization is fully based on umtx, mainly, condition
variable and other synchronization objects were rewritten by using
umtx directly. those objects can be shared between processes via
shared memory, it has to change ABI which does not happen yet.
5. default stack size is increased to 1M on 32 bits platform, 2M for
64 bits platform.
As the result, some mysql super-smack benchmarks show performance is
improved massivly.

Okayed by: jeff, mtm, rwatson, scottl


# 134050 19-Aug-2004 davidxu

Add AMD64 support code.