[atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

[atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example

Adam Romanek
Hi Boosters!

I'd like to draw your attention to an issue with Boost.Atomic that I
encountered when using ThreadSanitizer. The problem is that when using
the code shown in "reference counting" example [1] from the docs, the
ThreadSanitizer reports a data race. Everything is described in detail
in a StackOverflow thread [2] so I don't want to repeat all that here.

Please help me determine whether it's a false positive, an issue in
Boost.Atomic or a wrong usage of the code.

Additionally, I don't understand this:

Write of size 1 at 0x7d040000f7f0 by thread T2:
     #0 operator delete(void*) <null>:0 (a.out+0x00000004738b)

What kind of write does ThreadSanitizer might refer to?

WBR,
Adam Romanek

[1]
http://www.boost.org/doc/libs/1_55_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.example_reference_counters
[2]
http://stackoverflow.com/questions/24446561/threadsanitizer-reports-data-race-on-operator-deletevoid-when-using-embedde
_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: [atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example

Niall Douglas
On 29 Jun 2014 at 21:16, Adam Romanek wrote:

> I'd like to draw your attention to an issue with Boost.Atomic that I
> encountered when using ThreadSanitizer. The problem is that when using
> the code shown in "reference counting" example [1] from the docs, the
> ThreadSanitizer reports a data race. Everything is described in detail
> in a StackOverflow thread [2] so I don't want to repeat all that here.
>
> Please help me determine whether it's a false positive, an issue in
> Boost.Atomic or a wrong usage of the code.

I'd try replacing boost::atomic with std::atomic and see if the same
thing happens. Oh, and use a recent libstdc++. If it's only happening
in Boost, report it as a bug.

> Additionally, I don't understand this:
>
> Write of size 1 at 0x7d040000f7f0 by thread T2:
>      #0 operator delete(void*) <null>:0 (a.out+0x00000004738b)
>
> What kind of write does ThreadSanitizer might refer to?

ThreadSanitiser tracks bounds of change, so if only one byte of
change overlaps the raced bounds then it'll get reported as one byte.

tl;dr I wouldn't worry about it. ThreadSanitiser has many weirdness
due to its design and straight out bugs. I am currently awaiting
clang 3.5 release because 3.4 won't fully backtrace when libstdc++'s
reference counted std::string has false positived a race, and I have
dozens of workaround suppressions for it. Roll on libstdc++ 4.10 and
its new std::string implementation!!!

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/



_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users

SMime.p7s (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example

Adam Romanek
On 30.06.2014 19:07, Niall Douglas wrote:

> On 29 Jun 2014 at 21:16, Adam Romanek wrote:
>
>> I'd like to draw your attention to an issue with Boost.Atomic that I
>> encountered when using ThreadSanitizer. The problem is that when using
>> the code shown in "reference counting" example [1] from the docs, the
>> ThreadSanitizer reports a data race. Everything is described in detail
>> in a StackOverflow thread [2] so I don't want to repeat all that here.
>>
>> Please help me determine whether it's a false positive, an issue in
>> Boost.Atomic or a wrong usage of the code.
>
> I'd try replacing boost::atomic with std::atomic and see if the same
> thing happens. Oh, and use a recent libstdc++. If it's only happening
> in Boost, report it as a bug.
>

The same happens with std::atomic from both libstdc++ [1] and libc++ [2]
(I should be using pretty recent versions of these as I'm on Ubuntu
14.04). So this is not a Boost-only issue.

>> Additionally, I don't understand this:
>>
>> Write of size 1 at 0x7d040000f7f0 by thread T2:
>>       #0 operator delete(void*) <null>:0 (a.out+0x00000004738b)
>>
>> What kind of write does ThreadSanitizer might refer to?
>
> ThreadSanitiser tracks bounds of change, so if only one byte of
> change overlaps the raced bounds then it'll get reported as one byte.
>
> tl;dr I wouldn't worry about it. ThreadSanitiser has many weirdness
> due to its design and straight out bugs. I am currently awaiting
> clang 3.5 release because 3.4 won't fully backtrace when libstdc++'s
> reference counted std::string has false positived a race, and I have
> dozens of workaround suppressions for it. Roll on libstdc++ 4.10 and
> its new std::string implementation!!!
 >
 > Niall

I'm not that worried. I just wanted to understand this situation and
determine whether I need to correct my code somehow or I should just add
a new suppression.

Thanks,

WBR,
Adam Romanek

[1] http://pastebin.com/56TLxpSR
[2] http://pastebin.com/STPbwbTS

_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: [atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example

Niall Douglas
On 1 Jul 2014 at 9:12, Adam Romanek wrote:

> > I'd try replacing boost::atomic with std::atomic and see if the same
> > thing happens. Oh, and use a recent libstdc++. If it's only happening
> > in Boost, report it as a bug.
>
> The same happens with std::atomic from both libstdc++ [1] and libc++ [2]
> (I should be using pretty recent versions of these as I'm on Ubuntu
> 14.04). So this is not a Boost-only issue.

In this situation I'd report it to the clang thread sanitiser bug
tracker. The chances are low that both Boost and a recent STL are
simultaneously wrong.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/



_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users

SMime.p7s (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example

Adam Romanek
In reply to this post by Adam Romanek
On 01.07.2014 10:49, Andrey Semashev wrote:

> On Tue, Jul 1, 2014 at 12:41 PM, Adam Romanek wrote:
>> On 01.07.2014 10:36, Andrey Semashev wrote:
>>>
>>> On Tue, Jul 1, 2014 at 12:20 PM, Andrey Semashev
>>> wrote:
>>>>
>>>> On Tue, Jul 1, 2014 at 11:20 AM, Adam Romanek
>>>> wrote:
>>>>>
>>>>> Andrey,
>>>>>
>>>>> If you don't mind, please have a look at the conversation that I started
>>>>> on
>>>>> boost-users list [1] a few days back. Since you're the maintainer of
>>>>> Boost.Atomic I'd like to know your opinion on this issue. Or maybe I
>>>>> should
>>>>> repost this on boost-dev list?
>>>>
>>>>
>>>> I don't follow users list, mostly because I can't even keep up with
>>>> the developers list. So it's better to reach me on the dev list.
>>>>
>>>> The code in the StackOverflow topic indeed contains a potential
>>>> problem. The fetch_sub(release) operation prevents previous operations
>>>> from sinking past it and atomic_thread_fence(acquire) prevents the
>>>> following operations from floating above it. However, these two
>>>> barriers themselves can be reordered creating a window where
>>>> operations before fetch_sub(release) and after
>>>> atomic_thread_fence(acquire) may overlap. I believe, this is permitted
>>>> on both the compiler and hardware levels. This doesn't mean that's
>>>> what is actually happening and being flagged by TSan; it is still
>>>> possible that some TSan bug is exposed. The fact that replacing
>>>> atomic_thread_fence(acquire) with fetch_add(acquire) speaks in favor
>>>> of the bug suspicion.
>>>
>>>
>>> Actually, no, fetch_add(acquire) is significantly different from the
>>> fence. It operates on the same object as the previous
>>> fetch_sub(release), so it must observe its effects and cannot be
>>> reordered.
>>>
>>
>> After reading your first response I thought you're suggesting that the
>> "reference counting" example in Boost.Atomic docs is broken. But hopefully
>> we agree that it's fine, right? :)
>
> The example in the docs is indeed not correct. I believe, it should be
> corrected to use acq_rel memory order in intrusive_ptr_release. The
> fence can be removed then. I'll fix it as soon as I have some time.
>
>> So I guess this is a false positive from TSan, as suggested by others on SO.
>> I'll ask this on TSan's mailing list, lets see what they say.
>
> Ok, let me know if there are other considerations from TSan devs.
>

See above for my private conversation with Andrey Semashev on this topic
(he does not follow the boost-users mailing list).

Andrey says that the "reference counting example" from Boost.Atomic docs
is broken. What do you think about this?

WBR,
Adam Romanek

_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users
Reply | Threaded
Open this post in threaded view
|

Re: [atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example

Niall Douglas
On 1 Jul 2014 at 13:34, Adam Romanek wrote:

> See above for my private conversation with Andrey Semashev on this topic
> (he does not follow the boost-users mailing list).
>
> Andrey says that the "reference counting example" from Boost.Atomic docs
> is broken. What do you think about this?

I wouldn't be surprised. Few have access to non-Intel hardware for
testing, and all Intel CPUs always acquire loads and always release
stores. That makes use of atomics with anything but
memory_order_seq_cst superfluous on Intel at the CPU level.

Additionally, modern CPUs do very well with memory_order_seq_cst
across the board, even an ARM Cortex A15. If I ever find myself
tempted to not use memory_order_seq_cst before benchmarking I usually
slap myself hard. The only real problem with its use is that
compilers basically turn off optimisation around them, so you'd
rather not have them in tight loops if possible.

The ONLY thing which might penalise this approach in the future is
memory transactions - I can see memory_order_seq_cst would have to
abort more frequently than other orderings. But until such hardware
capability is here, I really wouldn't worry. If you see problem code
not using memory_order_seq_cst, change it to that and bypass the
hassle.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/ 
http://ie.linkedin.com/in/nialldouglas/



_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users

SMime.p7s (8K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [atomic] ThreadSanitizer reports a data race when using the code from "reference counting" example

Adam Romanek
On 01.07.2014 14:15, Niall Douglas wrote:

> On 1 Jul 2014 at 13:34, Adam Romanek wrote:
>
>> See above for my private conversation with Andrey Semashev on this topic
>> (he does not follow the boost-users mailing list).
>>
>> Andrey says that the "reference counting example" from Boost.Atomic docs
>> is broken. What do you think about this?
>
> I wouldn't be surprised. Few have access to non-Intel hardware for
> testing, and all Intel CPUs always acquire loads and always release
> stores. That makes use of atomics with anything but
> memory_order_seq_cst superfluous on Intel at the CPU level.

Actually Andrey analyzed this example once again and he said he was
wrong. Let me quote him:

"After studying the standard it looks like I was wrong after all. The
paragraph 1.9/14 gives the guarantee that the compiler does not
reorder fetch_sub and the fence, so the example in the docs is
correct."

I also got confirmation from a ThreadSanitizer developer that this is a
false positive as "tsan ignores stand-alone memory fences ATM" [1].

WBR,
Adam Romanek

[1]
https://groups.google.com/d/msg/thread-sanitizer/dZ3QiQE49ns/j7rKGHg08foJ
_______________________________________________
Boost-users mailing list
[hidden email]
http://lists.boost.org/mailman/listinfo.cgi/boost-users