Discussion:
[PATCH] PR libstdc++/67843 set shared_ptr lock policy at build-time
Jonathan Wakely
2018-11-27 23:25:55 UTC
Permalink
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.

The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.

To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.

In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.

PR libstdc++/67843
* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
* doc/xml/manual/configure.xml: Document new configure option.
* include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
instead of shared_ptr.
(recursive_directory_iterator): Likewise.
(__shared_ptr<_Dir>): Add explicit instantiation declaration.
(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
* include/bits/shared_ptr_base.h (__allocate_shared, __make_shared):
Add default template argument for _Lock_policy template parameter.
* include/ext/concurrence.h (__default_lock_policy): Check macro
_GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
target supports the builtins for compare-and-swap.
* src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
instantiation definition.
(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
(directory_iterator, recursive_directory_iterator): Use __make_shared
instead of make_shared.

Tested x86_64-linux and aarch64-linux, committed to trunk.

I would appreciate testing on ARM, especially Christophe's
-march=armv5t set up.
Christophe Lyon
2018-11-28 09:54:07 UTC
Permalink
Post by Jonathan Wakely
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
PR libstdc++/67843
* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
* doc/xml/manual/configure.xml: Document new configure option.
* include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
instead of shared_ptr.
(recursive_directory_iterator): Likewise.
(__shared_ptr<_Dir>): Add explicit instantiation declaration.
(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
Add default template argument for _Lock_policy template parameter.
* include/ext/concurrence.h (__default_lock_policy): Check macro
_GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
target supports the builtins for compare-and-swap.
* src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
instantiation definition.
(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
(directory_iterator, recursive_directory_iterator): Use __make_shared
instead of make_shared.
Tested x86_64-linux and aarch64-linux, committed to trunk.
I would appreciate testing on ARM, especially Christophe's
-march=armv5t set up.
Hi Jonathan,

Your patch passed my tests, thanks!
directory_iterator.cc tests used to be killed by a timeout until last
night where now they do PASS in my configurations involving
-march=armv5t.

That being said, I'm not sure I fully understand the fix. In my tests
involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9,
and I pass -march=armv5t as an override with the --target-board runtest option.

In these cases libstdc++ configure detects support for "atomic", so I
suspect the tests pass only because I use QEMU with --cpu cortex-a9.
I think if I used a different QEMU cpu (eg arm926), the tests would
try to use atomics, and fail?

The reason I'm still using cortex-a9 at QEMU level is that many tests
override -mcpu/-march, and I used that as a compromise.

However, I do have configurations using --with-cpu=arm10tdmi or
--with-cpu=default and QEMU --cpu arm926.
In these 2 cases, the tests do PASS, but they used to before your
patch. libstdc++ configure does detect "mutex" in these cases.

To be hopefully clearer, the subset of relevant configurations I use
is as follows:
GCC target / with-cpu / RUNTEST target-board / QEMU cpu
1- arm-none-linux-gnueabi[hf] / cortex-a9 / -march=armv5t / cortex-a9
2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926
3- arm-none-eabi / default / "" / arm926

(1) uses atomics
(2) and (3) use mutex

All of them now say "PASS", but maybe I should give a try switch to
arm926 for QEMU in (1) ?

Thanks,

Christophe
Jonathan Wakely
2018-11-28 11:35:46 UTC
Permalink
Post by Christophe Lyon
Post by Jonathan Wakely
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
PR libstdc++/67843
* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
* doc/xml/manual/configure.xml: Document new configure option.
* include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
instead of shared_ptr.
(recursive_directory_iterator): Likewise.
(__shared_ptr<_Dir>): Add explicit instantiation declaration.
(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
Add default template argument for _Lock_policy template parameter.
* include/ext/concurrence.h (__default_lock_policy): Check macro
_GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
target supports the builtins for compare-and-swap.
* src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
instantiation definition.
(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
(directory_iterator, recursive_directory_iterator): Use __make_shared
instead of make_shared.
Tested x86_64-linux and aarch64-linux, committed to trunk.
I would appreciate testing on ARM, especially Christophe's
-march=armv5t set up.
Hi Jonathan,
Your patch passed my tests, thanks!
Thanks for checking it.
Post by Christophe Lyon
directory_iterator.cc tests used to be killed by a timeout until last
night where now they do PASS in my configurations involving
-march=armv5t.
That being said, I'm not sure I fully understand the fix. In my tests
involving -march=armv5t, I still configure GCC --with-cpu=cortex-a9,
and I pass -march=armv5t as an override with the --target-board runtest option.
Yup, and that difference between the arch used to build libstdc++ and
the arch used to build the tests was what caused the FAILures. And
that's what my patch addresses, by ensuring that the arch used to
build libstdc++ is what dictates the type of synchronization for
shared_ptr.
Post by Christophe Lyon
In these cases libstdc++ configure detects support for "atomic", so I
suspect the tests pass only because I use QEMU with --cpu cortex-a9.
I think if I used a different QEMU cpu (eg arm926), the tests would
try to use atomics, and fail?
I don't know if a libstdc++ configured for cortex-a9 would run on an
arm296 QEMU. It might depend on cortex-a9 instructions (unrelated to
the use of atomics in my patch). See the end of this mail.
Post by Christophe Lyon
The reason I'm still using cortex-a9 at QEMU level is that many tests
override -mcpu/-march, and I used that as a compromise.
That means you're not fully testing armv5t, because you're still
linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9
instruction set. Usually that doesn't matter, because most libstdc++
code doesn't care about which specific instructions it gets compiled
to. That
Post by Christophe Lyon
However, I do have configurations using --with-cpu=arm10tdmi or
--with-cpu=default and QEMU --cpu arm926.
In these 2 cases, the tests do PASS, but they used to before your
patch. libstdc++ configure does detect "mutex" in these cases.
What matters is that the tests and the libstdc++ libraries agree on the
lock policy. It doesn't matter if they both use "atomic" or they both
use "mutex", only that they agree. Previously the lock policy was
selected by the preprocessor in every translation unit. That made it
very easy to get incompatible sets of objects. Now the selection is
made once per build of GCC, and is not affected by -march options when
compiling user code.
Post by Christophe Lyon
To be hopefully clearer, the subset of relevant configurations I use
GCC target / with-cpu / RUNTEST target-board / QEMU cpu
1- arm-none-linux-gnueabi[hf] / cortex-a9 / -march=armv5t / cortex-a9
2- arm-none-linux-gnueabihf / arm10tdmi / "" / arm926
3- arm-none-eabi / default / "" / arm926
(1) uses atomics
(2) and (3) use mutex
All of them now say "PASS", but maybe I should give a try switch to
arm926 for QEMU in (1) ?
That would be a useful test.

If I understand correctly, that will still work because the atomic ops
will use the kernel helper in libgcc. Previously it would have failed
the same way as (1) was already failing, because --with-cpu=cortex-a9
when libstdc++ was built chose atomics, and the -march=armv5t target
board chose mutexes, and so the testsuite files were incompatible with
the library.

Since my patch, only the --with-cpu option matters, and the testsuite
files will still use atomics. I think the arm926 cpu running under
QEMU will implement those atomics via libgcc's kernel helpers.

However, that test might fail for other reasons, if the libstdc++ libs
configured for cortex-a9 use other insns that are not supported by
arm296 then you'd get a SIGILL when running the tests. The kernel
helpers only implement the atomic operations, not any cortex-a9
instructions that are not supported by arm296.
Jonathan Wakely
2018-11-28 12:39:49 UTC
Permalink
Post by Jonathan Wakely
Post by Christophe Lyon
The reason I'm still using cortex-a9 at QEMU level is that many tests
override -mcpu/-march, and I used that as a compromise.
That means you're not fully testing armv5t, because you're still
linking to a libstdc++.so (and libstd++fs.a) that use the cortex-a9
instruction set. Usually that doesn't matter, because most libstdc++
code doesn't care about which specific instructions it gets compiled
to. That
Oops, sorry, unfinished thought. I was going to say:

That is not true for the shared_ptr code, which depends on whether
atomic instructions are supported. There are also some parts of
<random> that depend on specific instructions being supported (but
some of them do runtime checks for the CPU).
Richard Biener
2018-11-28 10:46:29 UTC
Permalink
Post by Jonathan Wakely
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
Would it be possible to have both in libstdc++ and with differnet mangling of
different kind ensure compatibility between different user CUs? Or is
that too awkward for the user to get right?

Richard.
Post by Jonathan Wakely
PR libstdc++/67843
* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY): Add new macro
that defines _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use GLIBCXX_ENABLE_LOCK_POLICY.
* doc/xml/manual/configure.xml: Document new configure option.
* include/bits/fs_dir.h (directory_iterator): Use __shared_ptr
instead of shared_ptr.
(recursive_directory_iterator): Likewise.
(__shared_ptr<_Dir>): Add explicit instantiation declaration.
(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
Add default template argument for _Lock_policy template parameter.
* include/ext/concurrence.h (__default_lock_policy): Check macro
_GLIBCXX_HAVE_ATOMIC_LOCK_POLICY instead of checking if the current
target supports the builtins for compare-and-swap.
* src/filesystem/std-dir.cc (__shared_ptr<_Dir>): Add explicit
instantiation definition.
(__shared_ptr<recursive_directory_iterator::_Dir_stack>): Likewise.
(directory_iterator, recursive_directory_iterator): Use __make_shared
instead of make_shared.
Tested x86_64-linux and aarch64-linux, committed to trunk.
I would appreciate testing on ARM, especially Christophe's
-march=armv5t set up.
Jonathan Wakely
2018-11-28 11:11:24 UTC
Permalink
Post by Richard Biener
Post by Jonathan Wakely
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
Would it be possible to have both in libstdc++ and with differnet mangling of
different kind ensure compatibility between different user CUs? Or is
that too awkward for the user to get right?
It would mean duplicating a lot more code, which is already duplicated
once for the std::string ABI, so we'd have four permuations!

It still wouldn't ensure compatibility between different user CUs,
only between any user CU and any build of libstdc++. Different user
CUs would still disagree on the ABI of the types, and so couldn't pass
them between CUs. I see no advantage to supporting that for the
std::filesystem library (unlike std::string and std::iostream, which
are probably used in the majority of CUs).

I do not want to get to the point where every type in libstdc++ exists
multiple times and you select some combination via command-line flags.
It's already becoming unmanageable with multiple std::string and long
double ABIs.
Jonathan Wakely
2018-11-28 12:30:40 UTC
Permalink
Post by Jonathan Wakely
Post by Richard Biener
Post by Jonathan Wakely
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
Would it be possible to have both in libstdc++ and with differnet mangling of
different kind ensure compatibility between different user CUs? Or is
that too awkward for the user to get right?
It would mean duplicating a lot more code, which is already duplicated
once for the std::string ABI, so we'd have four permuations!
It still wouldn't ensure compatibility between different user CUs,
only between any user CU and any build of libstdc++. Different user
CUs would still disagree on the ABI of the types, and so couldn't pass
them between CUs. I see no advantage to supporting that for the
std::filesystem library (unlike std::string and std::iostream, which
are probably used in the majority of CUs).
I do not want to get to the point where every type in libstdc++ exists
multiple times and you select some combination via command-line flags.
It's already becoming unmanageable with multiple std::string and long
double ABIs.
Also, in practice, I don't see a need. The common cases where this bug
arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
mean that you can always use "atomic". For bare metal ARM toolchains
you probably don't want to be mixing CUs built against different
versions of libstdc++ anyway. You have your toolchain for the board,
and you use that. If it is configured to use "atomic", then that's
what you get. If it's configured to use "mutex", then that's what you
get. You probably don't want to use a toolchain configured for a
different board, but you could use --with-libstdcxx-lock-policy to
ensure a consistent policy across those toolchains if needed.

It's also possible for problems to arise on i386. If you build GCC for
i386 then it will choose "mutex" and be incompatible with a libstdc++
built for i486 or later. In that case, you could use the option
--with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
then you'd need to link to libatomic to provide the ops, as there are
no kernel helpers for i386 in libgcc).

There are a few other things we could do though:

1) As well as the __default_lock_policy value, we could add a
"library's preferred lock policy" value, which could be different from
__default_lock_policy. That policy would get used for the __shared_ptr
objects in the libstdc++ API/ABI, and would have to be consistent for
all CUs in the library and using its headers. That solves the problem
of user CUs being incompatible with the library because they used
-march. It doesn't change the fact that a libstdc++.so using "atomic"
is not compatible with one using "mutex". For this to work the library
always needs to use std::__shared_ptr<X, __library_lock_policy>
instead of std::shared_ptr<X> so that the policy is explicit (which I
think is a good rule anyway).

1a) Then (if we wanted) we could restore the old behaviour where
__default_lock_policy is chosen by the preprocessor and can vary
between user CUs. If users want to compile different CUs with
different lock policies, that's their choice. They risk creating
incompatible shared_ptr objects which cannot be passed between CUs,
but it becomes their responsibility to get it right. The library
always uses its preferred policy, and so won't get it wrong. This
seems easy for users to get wrong, but would arguably be more
backwards compatible (by still allowing users to get silent UB that
only fails at runtime).

1b) Or (if we wanted) we could still fix __default_lock_policy at
configure-time, but add a second version of the new option
--with-libstdcxx-lock-policy, so the default policy for user CUs and
the library's preferred policy can be chosen independently. You could
say you want the library to always use "mutex" but have two builds of
GCC, one where user CUs use "atomic" and one where they use "mutex".
The libstdc++.so used by those two GCC builds would be compatible, but
user CUs that use std::shared_ptr would not be.

Options 1a and 1b are mutually exclusive.

2) We could consider using __attribute__((__abi_tag__("..."))) on
std::shared_ptr when the lock policy chosen by the preprocessor
doesn't match what would have been chosen when libstdc++ was built. So
if a user compiles with a -march option that changes the lock
policy, we change the mangling of std::shared_ptr. This would still
allow incompatible std::shared_ptr objects, but they would cause
linker errors. For PR 42734 and PR 67843 that would have meant the
programs fail to link, instead of having runtime UB. The solution I
committed makes them link and run correctly, which seems better than
just saying "no, you can't do that".

3) We could change libstdc++'s configure to automatically override
--with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
(because we know the kernel helpers are available). That causes an ABI
change for a GCC built for --target=armv5t-*-linux* so I didn't do
that by default. Packagers who want that can use the --with option
explicitly to build a libstdc++ with atomic refcounting that can be
used on any armv5t or later CPU, allowing users to choose a newer
-march for their own code. (Until my patch that didn't work, because


Option 1) on its own seems to have no advantage over what I committed.

Option 1a) is still too easy for users to get wrong and get UB, only
now it's all their fault not my fault. That doesn't really help users.

Option 1b) makes it harder for users to get wrong (they have to mix
CUs built by different toolchains, not just built with different
-march options). I'm not sure there's much benefit to being able to
control the library policy separately from the user policy though.

Option 2) makes the failures happen earlier (when linking, not
runtime) but doesn't actually make anything work that failed
previously.

Option 3) is not my call to make. If the ARM maintainers want to
change the default older arm-linux targets I have no objections.
Jonathan Wakely
2018-11-28 12:49:11 UTC
Permalink
Post by Jonathan Wakely
3) We could change libstdc++'s configure to automatically override
--with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
(because we know the kernel helpers are available). That causes an ABI
change for a GCC built for --target=armv5t-*-linux* so I didn't do
that by default. Packagers who want that can use the --with option
explicitly to build a libstdc++ with atomic refcounting that can be
used on any armv5t or later CPU, allowing users to choose a newer
-march for their own code. (Until my patch that didn't work, because
[...]
Post by Jonathan Wakely
Option 3) is not my call to make. If the ARM maintainers want to
change the default older arm-linux targets I have no objections.
Another way to approach option 3) that we discussed at Cauldron, and
which I want to start a separate discussion about on ***@gcc.gnu.org,
is to change the value of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 et al
when we have kernel helpers to implement those operations.

The shared_ptr code doesn't care if an atomic CAS comes from the CPU
or a kernel helper in libgcc. If the atomic CAS is supported via
kernel helpers, let's use it! But those predefined macros only tell
you about native CPU support (for the current target selected by
-march).


It's worth noting that all this shared_ptr code predates libatomic, so
we couldn't just say "always use atomics, and link to libatomic if
needed". If __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 was not defined, we had
to assume no CAS *at all*. Not native, not provided by libatomic,
nothing. I'd love to simply rely on std::atomic in std::shared_ptr,
but that would be an ABI break for targets currently using a mutex,
and would add new dependencies on libatomic for some targets. (It
might also pessimize some single-threaded programs on targets that do
use atomic ref-counts, because currently some updates are non-atomic
when __gthread_active_p() == false).
Ramana Radhakrishnan
2018-11-30 11:49:09 UTC
Permalink
Firstly thanks a lot for working on this.
Post by Jonathan Wakely
Post by Jonathan Wakely
3) We could change libstdc++'s configure to automatically override
--with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
(because we know the kernel helpers are available). That causes an ABI
change for a GCC built for --target=armv5t-*-linux* so I didn't do
that by default. Packagers who want that can use the --with option
explicitly to build a libstdc++ with atomic refcounting that can be
used on any armv5t or later CPU, allowing users to choose a newer
-march for their own code. (Until my patch that didn't work, because
[...]
Post by Jonathan Wakely
Option 3) is not my call to make. If the ARM maintainers want to
change the default older arm-linux targets I have no objections.
We could change the defaults when(if) we next bump up the libstdc++ ABI
in general :-/ and if we remember to do this. I don't feel comfortable
changing the defaults silently and I don't view this as something we can
decide on by ourselves as the Arm maintainers as this really falls in
the space of folks distributing Linux using by defaults versions of the
architecture that result in the use of mutexes rather than the atomics.

It's also not clear to me if we can use any other magic tricks to make
this co-exist and whether that is worth it.
Post by Jonathan Wakely
Another way to approach option 3) that we discussed at Cauldron, and
is to change the value of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 et al
when we have kernel helpers to implement those operations.
The shared_ptr code doesn't care if an atomic CAS comes from the CPU
or a kernel helper in libgcc. If the atomic CAS is supported via
kernel helpers, let's use it! But those predefined macros only tell
you about native CPU support (for the current target selected by
-march).
It's worth noting that all this shared_ptr code predates libatomic, so
we couldn't just say "always use atomics, and link to libatomic if
needed". If __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 was not defined, we had
to assume no CAS *at all*. Not native, not provided by libatomic,
nothing. I'd love to simply rely on std::atomic in std::shared_ptr,
but that would be an ABI break for targets currently using a mutex,
and would add new dependencies on libatomic for some targets. (It
might also pessimize some single-threaded programs on targets that do
use atomic ref-counts, because currently some updates are non-atomic
when __gthread_active_p() == false).
Yep, though you want it to go to the kernel helpers or indeed libatomic.

regards
Ramana
Richard Biener
2018-11-28 13:46:58 UTC
Permalink
Post by Jonathan Wakely
Post by Jonathan Wakely
Post by Richard Biener
Post by Jonathan Wakely
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
Would it be possible to have both in libstdc++ and with differnet mangling of
different kind ensure compatibility between different user CUs? Or is
that too awkward for the user to get right?
It would mean duplicating a lot more code, which is already duplicated
once for the std::string ABI, so we'd have four permuations!
It still wouldn't ensure compatibility between different user CUs,
only between any user CU and any build of libstdc++. Different user
CUs would still disagree on the ABI of the types, and so couldn't pass
them between CUs. I see no advantage to supporting that for the
std::filesystem library (unlike std::string and std::iostream, which
are probably used in the majority of CUs).
I do not want to get to the point where every type in libstdc++ exists
multiple times and you select some combination via command-line flags.
It's already becoming unmanageable with multiple std::string and long
double ABIs.
Also, in practice, I don't see a need. The common cases where this bug
arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
mean that you can always use "atomic". For bare metal ARM toolchains
you probably don't want to be mixing CUs built against different
versions of libstdc++ anyway. You have your toolchain for the board,
and you use that. If it is configured to use "atomic", then that's
what you get. If it's configured to use "mutex", then that's what you
get. You probably don't want to use a toolchain configured for a
different board, but you could use --with-libstdcxx-lock-policy to
ensure a consistent policy across those toolchains if needed.
It's also possible for problems to arise on i386. If you build GCC for
i386 then it will choose "mutex" and be incompatible with a libstdc++
built for i486 or later. In that case, you could use the option
--with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
then you'd need to link to libatomic to provide the ops, as there are
no kernel helpers for i386 in libgcc).
I think the default should be part of the platform ABI somehow,
like the i386 incompatibility should better not arise. I suppose
libstdc++ configury doesn't read in gcc/config.gcc but does it
have sth similar where it's easy to see defaults for plaforms?
There's configure.host but I'm not sure this is related at all.

Richard.
Post by Jonathan Wakely
1) As well as the __default_lock_policy value, we could add a
"library's preferred lock policy" value, which could be different from
__default_lock_policy. That policy would get used for the __shared_ptr
objects in the libstdc++ API/ABI, and would have to be consistent for
all CUs in the library and using its headers. That solves the problem
of user CUs being incompatible with the library because they used
-march. It doesn't change the fact that a libstdc++.so using "atomic"
is not compatible with one using "mutex". For this to work the library
always needs to use std::__shared_ptr<X, __library_lock_policy>
instead of std::shared_ptr<X> so that the policy is explicit (which I
think is a good rule anyway).
1a) Then (if we wanted) we could restore the old behaviour where
__default_lock_policy is chosen by the preprocessor and can vary
between user CUs. If users want to compile different CUs with
different lock policies, that's their choice. They risk creating
incompatible shared_ptr objects which cannot be passed between CUs,
but it becomes their responsibility to get it right. The library
always uses its preferred policy, and so won't get it wrong. This
seems easy for users to get wrong, but would arguably be more
backwards compatible (by still allowing users to get silent UB that
only fails at runtime).
1b) Or (if we wanted) we could still fix __default_lock_policy at
configure-time, but add a second version of the new option
--with-libstdcxx-lock-policy, so the default policy for user CUs and
the library's preferred policy can be chosen independently. You could
say you want the library to always use "mutex" but have two builds of
GCC, one where user CUs use "atomic" and one where they use "mutex".
The libstdc++.so used by those two GCC builds would be compatible, but
user CUs that use std::shared_ptr would not be.
Options 1a and 1b are mutually exclusive.
2) We could consider using __attribute__((__abi_tag__("..."))) on
std::shared_ptr when the lock policy chosen by the preprocessor
doesn't match what would have been chosen when libstdc++ was built. So
if a user compiles with a -march option that changes the lock
policy, we change the mangling of std::shared_ptr. This would still
allow incompatible std::shared_ptr objects, but they would cause
linker errors. For PR 42734 and PR 67843 that would have meant the
programs fail to link, instead of having runtime UB. The solution I
committed makes them link and run correctly, which seems better than
just saying "no, you can't do that".
3) We could change libstdc++'s configure to automatically override
--with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
(because we know the kernel helpers are available). That causes an ABI
change for a GCC built for --target=armv5t-*-linux* so I didn't do
that by default. Packagers who want that can use the --with option
explicitly to build a libstdc++ with atomic refcounting that can be
used on any armv5t or later CPU, allowing users to choose a newer
-march for their own code. (Until my patch that didn't work, because
Option 1) on its own seems to have no advantage over what I committed.
Option 1a) is still too easy for users to get wrong and get UB, only
now it's all their fault not my fault. That doesn't really help users.
Option 1b) makes it harder for users to get wrong (they have to mix
CUs built by different toolchains, not just built with different
-march options). I'm not sure there's much benefit to being able to
control the library policy separately from the user policy though.
Option 2) makes the failures happen earlier (when linking, not
runtime) but doesn't actually make anything work that failed
previously.
Option 3) is not my call to make. If the ARM maintainers want to
change the default older arm-linux targets I have no objections.
Jonathan Wakely
2018-11-28 14:09:30 UTC
Permalink
Post by Richard Biener
Post by Jonathan Wakely
Post by Jonathan Wakely
Post by Richard Biener
Post by Jonathan Wakely
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
Would it be possible to have both in libstdc++ and with differnet mangling of
different kind ensure compatibility between different user CUs? Or is
that too awkward for the user to get right?
It would mean duplicating a lot more code, which is already duplicated
once for the std::string ABI, so we'd have four permuations!
It still wouldn't ensure compatibility between different user CUs,
only between any user CU and any build of libstdc++. Different user
CUs would still disagree on the ABI of the types, and so couldn't pass
them between CUs. I see no advantage to supporting that for the
std::filesystem library (unlike std::string and std::iostream, which
are probably used in the majority of CUs).
I do not want to get to the point where every type in libstdc++ exists
multiple times and you select some combination via command-line flags.
It's already becoming unmanageable with multiple std::string and long
double ABIs.
Also, in practice, I don't see a need. The common cases where this bug
arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
mean that you can always use "atomic". For bare metal ARM toolchains
you probably don't want to be mixing CUs built against different
versions of libstdc++ anyway. You have your toolchain for the board,
and you use that. If it is configured to use "atomic", then that's
what you get. If it's configured to use "mutex", then that's what you
get. You probably don't want to use a toolchain configured for a
different board, but you could use --with-libstdcxx-lock-policy to
ensure a consistent policy across those toolchains if needed.
It's also possible for problems to arise on i386. If you build GCC for
i386 then it will choose "mutex" and be incompatible with a libstdc++
built for i486 or later. In that case, you could use the option
--with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
then you'd need to link to libatomic to provide the ops, as there are
no kernel helpers for i386 in libgcc).
I think the default should be part of the platform ABI somehow,
like the i386 incompatibility should better not arise.
What do you suggest to ensure that?

Just imake i386 default to the "atomic" lock policy, and require anybody
silly enough to configure for i386 to either link to libatomic, or
explicitly use --with-libstdcxx-lock-policy=mutex if they really want
to be incompatible with default i486 builds?
Post by Richard Biener
I suppose
libstdc++ configury doesn't read in gcc/config.gcc but does it
have sth similar where it's easy to see defaults for plaforms?
There's configure.host but I'm not sure this is related at all.
No, I don't think we do. When there are target-specific things
hardcoded it's spread out around acinclude.m4 and configure.ac
Richard Biener
2018-11-29 13:51:11 UTC
Permalink
Post by Jonathan Wakely
Post by Richard Biener
Post by Jonathan Wakely
Post by Jonathan Wakely
Post by Richard Biener
Post by Jonathan Wakely
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI changes. For
example, objects compiled with -march=armv7 will use atomics to
synchronize reference counts, and objects compiled with -march=armv5t
will use a mutex. That means the shared_ptr control block will have a
different layout in different objects, causing ODR violations and
undefined behaviour. This was the root cause of PR libstdc++/42734 as
well as PR libstdc++/67843.
The solution is to decide on the lock policy at build time, when
libstdc++ is configured. The configure script checks for the
availability of the necessary atomic built-ins for the target and fixes
that choice permanently. Different -march flags used to compile user
code will not cause changes to the lock policy. This results in an ABI
change for certain compilations, but only where there was already an ABI
incompatibility between the libstdc++.so library and objects built with
an incompatible -march option. In general, this means a more stable ABI
that isn't silently altered when -march flags make addition atomic ops
available.
To force a target to use "atomic" or "mutex" the new configure option
--with-libstdcxx-lock-policy can be used.
In order to turn ODR violations into linker errors, the uses of
shared_ptr in filesystem directory iterators have been replaced
with __shared_ptr, and explicit instantiations are declared. This
ensures that object files using those types cannot link to libstdc++
libs unless they use the same lock policy.
Would it be possible to have both in libstdc++ and with differnet mangling of
different kind ensure compatibility between different user CUs? Or is
that too awkward for the user to get right?
It would mean duplicating a lot more code, which is already duplicated
once for the std::string ABI, so we'd have four permuations!
It still wouldn't ensure compatibility between different user CUs,
only between any user CU and any build of libstdc++. Different user
CUs would still disagree on the ABI of the types, and so couldn't pass
them between CUs. I see no advantage to supporting that for the
std::filesystem library (unlike std::string and std::iostream, which
are probably used in the majority of CUs).
I do not want to get to the point where every type in libstdc++ exists
multiple times and you select some combination via command-line flags.
It's already becoming unmanageable with multiple std::string and long
double ABIs.
Also, in practice, I don't see a need. The common cases where this bug
arises are limited to 32-bit ARM, and for arm-linux the kernel helpers
mean that you can always use "atomic". For bare metal ARM toolchains
you probably don't want to be mixing CUs built against different
versions of libstdc++ anyway. You have your toolchain for the board,
and you use that. If it is configured to use "atomic", then that's
what you get. If it's configured to use "mutex", then that's what you
get. You probably don't want to use a toolchain configured for a
different board, but you could use --with-libstdcxx-lock-policy to
ensure a consistent policy across those toolchains if needed.
It's also possible for problems to arise on i386. If you build GCC for
i386 then it will choose "mutex" and be incompatible with a libstdc++
built for i486 or later. In that case, you could use the option
--with-libstdcxx-lock-policy=atomic to force the use of "atomic" (and
then you'd need to link to libatomic to provide the ops, as there are
no kernel helpers for i386 in libgcc).
I think the default should be part of the platform ABI somehow,
like the i386 incompatibility should better not arise.
What do you suggest to ensure that?
Just imake i386 default to the "atomic" lock policy, and require anybody
silly enough to configure for i386 to either link to libatomic, or
explicitly use --with-libstdcxx-lock-policy=mutex if they really want
to be incompatible with default i486 builds?
Something like that, yes.
Post by Jonathan Wakely
Post by Richard Biener
I suppose
libstdc++ configury doesn't read in gcc/config.gcc but does it
have sth similar where it's easy to see defaults for plaforms?
There's configure.host but I'm not sure this is related at all.
No, I don't think we do. When there are target-specific things
hardcoded it's spread out around acinclude.m4 and configure.ac
Hans-Peter Nilsson
2018-11-28 23:08:36 UTC
Permalink
Date: Tue, 27 Nov 2018 23:25:55 +0000
This resolves a longstanding issue where the lock policy for shared_ptr
reference counting depends on compilation options when the header is
included, so that different -march options can cause ABI
changes.
[...]
Thank you for pushing through and bringing this to a closure!

brgds, H-P
Loading...