Skip to content

Fix data race reported by ThreadSanitizer in caching pool#3897

Merged
nanangizz merged 2 commits intomasterfrom
cachingpool-datarace
Mar 26, 2024
Merged

Fix data race reported by ThreadSanitizer in caching pool#3897
nanangizz merged 2 commits intomasterfrom
cachingpool-datarace

Conversation

@nanangizz
Copy link
Copy Markdown
Member

@nanangizz nanangizz commented Mar 22, 2024

There is a report of data race:

 WARNING: ThreadSanitizer: data race (pid=13933)
  Write of size 8 at 0x726800000680 by main thread (mutexes: read M0):
    #0 cpool_on_block_alloc
    #1 default_block_alloc
    #2 pj_pool_create_block
    #3 pj_pool_allocate_find

  Previous write of size 8 at 0x726800000680 by thread T4 (mutexes: write M1, write M2):
    #0 cpool_on_block_free
    #1 default_block_free
    #2 reset_pool
    #3 pj_pool_destroy_int

After investigation, cpool_on_block_alloc() & cpool_on_block_free() are only used for updating fields cp.used_size & cp.peak_used_size which are for informational/logging purpose only. While updated without mutex protection, the information is potentially inaccurate. So I think deprecating those fields & the caching pool callbacks is safe and is able to fix the data race issue.

Thanks to Taisto Qvist for the report.

cp->factory.on_block_free = &cpool_on_block_free;

/* Deprecated, these callbacks are only used for updating cp.used_size &
* cp.peak_used_size which are no longer used.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are still used in:

  • unit tests such as pjsip-perf.c and pjsip/src/test/test.c
  • pjturn-srv/main.c
    Although mostly for logging purposes only.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right! Then I guess our options:

  • change caching pool's mutex type to recursive mutex, but it may have some impact on performance (e.g: here), affected functions: pj_pool_create/release() (less impact because less frequently used?), pj_pool_alloc()/reset() (more impact, depends on initial & growing capacity). Perhaps avoid this approach as the fields are for logging purpose.
  • leave it as is with possibility of data race, which may reduce accuracy of those fields, and trigger semi 'false alarm' on analyzing tools.
  • use pj_mutex_try_lock(), when fails keep going without lock, assuming the failure is caused by having been locked by upstream/caller, e.g: pj_pool_create/release()
  • ...?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the usage is mostly for logging and tests only, I suppose just comment/remove the code that use the deprecated fields?

@nanangizz nanangizz merged commit 478aeb9 into master Mar 26, 2024
@nanangizz nanangizz deleted the cachingpool-datarace branch March 26, 2024 08:20
dshamaev-intermedia added a commit to intermedia-net/pjproject that referenced this pull request Apr 4, 2024
* Add missing openssl SECLEVEL=0 support (pjsip#3890)

Previous SECLEVEL support allowed for levels 1-5.
However, openssl defines levels 0-5. [1]

Recent openssl versions (3.0+) have moved previous
popular ciphers/key lengths (i.e. RSA1024withSHA1)
into level 0, so it is now a reasonable choice to use.

Add support for level 0.

[1] https://www.openssl.org/docs/man3.2/man3/SSL_CTX_set_security_level.html

* Enable Late Offer Answer Mode (LOAM) feature in the pjsua (pjsip#3869)

* Fix warnings for 32-bit compiler and misc fixes. (pjsip#3896)

* Add some missing unlocks (pjsip#3893)

* Prevent race condition in DTLS media stop (pjsip#3901)

* Fix data race reported by ThreadSanitizer in caching pool (pjsip#3897)

* Fixed Metal renderer memory leak (pjsip#3909)

* Fixed DTLS clock stoppage race (pjsip#3905)

* Improve IP address change IPv4 <-> IPv6 (pjsip#3910)

---------

Co-authored-by: naf <naf@sdf.org>
Co-authored-by: Goodicus <15110766+goodicus@users.noreply.github.com>
Co-authored-by: Amilcar Ubiera <chopin952@gmail.com>
Co-authored-by: Santiago De la Cruz <51337247+xhit@users.noreply.github.com>
Co-authored-by: sauwming <ming@teluu.com>
Co-authored-by: Nanang Izzuddin <nanang@teluu.com>
Co-authored-by: dshamaev-intermedia <105777082+dshamaev-intermedia@users.noreply.github.com>
dshamaev-intermedia added a commit to intermedia-net/pjproject that referenced this pull request Jun 12, 2024
* Add missing openssl SECLEVEL=0 support (pjsip#3890)

Previous SECLEVEL support allowed for levels 1-5.
However, openssl defines levels 0-5. [1]

Recent openssl versions (3.0+) have moved previous
popular ciphers/key lengths (i.e. RSA1024withSHA1)
into level 0, so it is now a reasonable choice to use.

Add support for level 0.

[1] https://www.openssl.org/docs/man3.2/man3/SSL_CTX_set_security_level.html

* Enable Late Offer Answer Mode (LOAM) feature in the pjsua (pjsip#3869)

* Fix warnings for 32-bit compiler and misc fixes. (pjsip#3896)

* Add some missing unlocks (pjsip#3893)

* Prevent race condition in DTLS media stop (pjsip#3901)

* Fix data race reported by ThreadSanitizer in caching pool (pjsip#3897)

* Fixed Metal renderer memory leak (pjsip#3909)

* Fixed DTLS clock stoppage race (pjsip#3905)

* Improve IP address change IPv4 <-> IPv6 (pjsip#3910)

* pjsua_acc: Fix warnings for comparison between ‘pjsua_nat64_opt’ and ‘enum pjsua_ipv6_use’ (pjsip#3915)

* Fix to ext_fmts accessed out of stack scope. (pjsip#3916)

* Add check in siprtp sample app for inactive audio media (pjsip#3927)

* Add function to initialize MediaFormat audio & video (pjsip#3925)

* Fixed incorrect SDP buffer length calculation (pjsip#3924)

* Support Push Notification in iOS sample app (pjsip#3913)

* Fixed PJSUA2 API to get/set Opus config (pjsip#3935)

* Fix bad address length check in pj_ioqueue_sendto(). (pjsip#3941)

* Fix warning of uninitialized value in fuzz-crypto (pjsip#3946)

* Print log on successful send (pjsip#3942)

* Fixed CI Mac build failure (pjsip#3947)

* Update Android JNI audio dev to use 16bit PCM only (pjsip#3945)

* Add TLS/SSL backend: Windows Schannel (pjsip#3867)

* pjsip_find_msg: Log warning if Content-Length field not found (pjsip#3960)

* Fix audiodev index (pjsip#3962)

* Fix assertion on call hangup from DTMF callback (pjsip#3970)

* Fix yaml error in github feature template (pjsip#3972)

* Fix version string in Python setup (pjsip#3976)

* Prevent pjmedia_codec_param.info.enc_ptime_denum division by zero in stream (pjsip#3975)

---------

Co-authored-by: naf <naf@sdf.org>
Co-authored-by: Goodicus <15110766+goodicus@users.noreply.github.com>
Co-authored-by: Amilcar Ubiera <chopin952@gmail.com>
Co-authored-by: Santiago De la Cruz <51337247+xhit@users.noreply.github.com>
Co-authored-by: sauwming <ming@teluu.com>
Co-authored-by: Nanang Izzuddin <nanang@teluu.com>
Co-authored-by: dshamaev-intermedia <105777082+dshamaev-intermedia@users.noreply.github.com>
Co-authored-by: CI Bot <noreply@intermedia.com>
Co-authored-by: Pau Espin Pedrol <pespin.shar@gmail.com>
Co-authored-by: Riza Sulistyo <trengginas@users.noreply.github.com>
Co-authored-by: Andreas Peldszus <andreas.peldszus@posteo.de>
BarryYin pushed a commit to BarryYin/pjproject that referenced this pull request Feb 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants