Skip to content

Fix memory leak in pyext when Selectable is returned to Python#343

Merged
lguohan merged 2 commits intosonic-net:masterfrom
pavel-shirshov:pavelsh/fix_memory_leak
May 27, 2020
Merged

Fix memory leak in pyext when Selectable is returned to Python#343
lguohan merged 2 commits intosonic-net:masterfrom
pavel-shirshov:pavelsh/fix_memory_leak

Conversation

@pavel-shirshov
Copy link
Contributor

@pavel-shirshov pavel-shirshov commented May 22, 2020

We have a memory leak in pyext. As result ledd, lddpd, and pmon could use a lot of memory.
I was able to extract the memory leak patter in the following testcase:
You need to have pympler package installed.

# server.py
from swsscommon import swsscommon
from pympler.tracker import SummaryTracker

def main():
  tracker = SummaryTracker()
  appl_db = swsscommon.DBConnector(0, "localhost", 6379, 0)
  sel = swsscommon.Select()
  sst = swsscommon.SubscriberStateTable(appl_db, "TABLE")
  sel.addSelectable(sst)
  for _ in xrange(10000):
    state, c = sel.select(1000)
  tracker.print_diff()

main()

With the following client code
client.py

from swsscommon import swsscommon


app_db = swsscommon.DBConnector(swsscommon.APPL_DB, "localhost", 6379, 0)
t = swsscommon.Table(app_db, "TABLE")
def z(t, st):
  fvs = swsscommon.FieldValuePairs([("status", st)])
  t.set("123", fvs)

for i in xrange(10000000000000):
  z(t, "up")
  z(t, "down")

You can see the following output from the server.py:

                                       types |   # objects |   total size
============================================ | =========== | ============
                                        dict |       10006 |      2.67 MB
            swsscommon.swsscommon.Selectable |       10000 |    625.00 KB
                                        list |        1270 |    128.94 KB
                                         str |        1270 |     71.11 KB
                                         int |         147 |      3.45 KB
                                        type |           1 |    872     B
                                       tuple |           5 |    360     B
                          wrapper_descriptor |           4 |    320     B
                           member_descriptor |           2 |    144     B
                                        code |           1 |    128     B
                       function (store_info) |           1 |    120     B
                                        cell |           2 |    112     B
                                     weakref |           1 |     88     B
                           getset_descriptor |           1 |     72     B
  swsscommon.swsscommon.SubscriberStateTable |           1 |     64     B

You can see from the output that 10000 dicts and swsscommon.swsscommon.Selectable were leaked. It is the same as number of iterations in the select loop.

With the fix the output is the following:

                                       types |   # objects |   total size
============================================ | =========== | ============
                                        list |        1270 |    128.94 KB
                                         str |        1270 |     71.11 KB
                                         int |         147 |      3.45 KB
                                        dict |           7 |      2.66 KB
                                        type |           1 |    872     B
                                       tuple |           5 |    360     B
                          wrapper_descriptor |           4 |    320     B
                           member_descriptor |           2 |    144     B
                                        code |           1 |    128     B
                       function (store_info) |           1 |    120     B
                                        cell |           2 |    112     B
                                     weakref |           1 |     88     B
                           getset_descriptor |           1 |     72     B
  swsscommon.swsscommon.SubscriberStateTable |           1 |     64     B
            swsscommon.swsscommon.Selectable |           1 |     64     B

The test is added. It requires sonic-net/sonic-buildimage#4652
The test result if we have memory leak

root@05ad98abefdd:/src/sonic-swss-common# cat check.txt
============================= test session starts ==============================
platform linux2 -- Python 2.7.13, pytest-3.0.6, py-1.4.32, pluggy-0.4.0 -- /usr/bin/python
cachedir: .cache
rootdir: /src/sonic-swss-common, inifile:
collecting ... collected 7 items

tests/test_redis_ut.py::test_ProducerTable PASSED
tests/test_redis_ut.py::test_ProducerStateTable PASSED
tests/test_redis_ut.py::test_Table PASSED
tests/test_redis_ut.py::test_SubscriberStateTable PASSED
tests/test_redis_ut.py::test_Notification PASSED
tests/test_redis_ut.py::test_DBConnectorRedisClientName PASSED
tests/test_redis_ut.py::test_SelectMemoryLeak FAILED

=================================== FAILURES ===================================
____________________________ test_SelectMemoryLeak _____________________________

    def test_SelectMemoryLeak():
        N = 50000
        def table_set(t, state):
            fvs = swsscommon.FieldValuePairs([("status", state)])
            t.set("123", fvs)

        def generator_SelectMemoryLeak():
            app_db = swsscommon.DBConnector("APPL_DB", 0, True)
            t = swsscommon.Table(app_db, "TABLE")
            for i in xrange(N/2):
                table_set(t, "up")
                table_set(t, "down")

        tracker = SummaryTracker()
        appl_db = swsscommon.DBConnector("APPL_DB", 0, True)
        sel = swsscommon.Select()
        sst = swsscommon.SubscriberStateTable(appl_db, "TABLE")
        sel.addSelectable(sst)
        thr = Thread(target=generator_SelectMemoryLeak)
        thr.daemon = True
        thr.start()
        time.sleep(5)
        for _ in xrange(N):
            state, c = sel.select(1000)
        diff = tracker.diff()
        cases = []
        for name, count, _ in diff:
            if count >= N:
                cases.append("%s - %d objects for %d repeats" % (name, count, N))
        thr.join()
>       assert not cases
E       assert not ['swsscommon.swsscommon.Selectable - 50000 objects for 50000 repeats', 'dict - 50009 objects for 50000 repeats']

tests/test_redis_ut.py:124: AssertionError
===================== 1 failed, 6 passed in 13.43 seconds ======================

qiluo-msft
qiluo-msft previously approved these changes May 23, 2020
}
temp = SWIG_NewPointerObj(*$1, SWIGTYPE_p_swss__Selectable, SWIG_POINTER_OWN);
PyList_Append($result, temp);
temp = SWIG_NewPointerObj(*$1, SWIGTYPE_p_swss__Selectable, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for fix this memory leak issue. can you describe the nature of this fix? what causes the memory leak?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First line I just changed SWIG_POINTER_OWN to 0, because we don't own the pointer. (in my logic). We just return the pointer which was in the subribers list.
http://www.swig.org/Doc3.0/SWIGDocumentation.html#Chicken_collection
So we don't want to garbage collect the pointer, which still in the subscribers list.
The second change. I found it in the multiple examples.
https://github.com/swig/swig/blob/rel-3.0.11/Lib/python/pyrun.swg#L120

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean by "in my logic“? so, the first one is addressing the bug. I assume.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check this thread. https://sourceforge.net/p/swig/mailman/message/27419191/

it seems to me the root cause of memory is not SWIG_POINTER_OWN, but PyList_Append. I can see that https://github.com/swig/swig/blob/rel-3.0.11/Lib/python/pyrun.swg#L134 did the refcount decrement.

can you keep SWIG_POINTER_OWN, and only change to SWIG_Python_AppendOutput, and check if it solves the memory leak?

I looked around the SWIG_POINTER_OWN, do not have much luck in understanding its usage, seems related to finalizer.

Copy link
Contributor Author

@pavel-shirshov pavel-shirshov May 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lguohan
This typemap

%typemap(argout) swss::Selectable ** {
    PyObject* temp = NULL;
    if (!PyList_Check($result)) {
        temp = $result;
        $result = PyList_New(1);
        PyList_SetItem($result, 0, temp);
    }
//    temp = SWIG_NewPointerObj(*$1, SWIGTYPE_p_swss__Selectable, SWIG_POINTER_OWN);
//    PyList_Append($result, temp);
//    temp = SWIG_NewPointerObj(*$1, SWIGTYPE_p_swss__Selectable, 0);
//    SWIG_Python_AppendOutput($result, temp);
    temp = SWIG_NewPointerObj(*$1, SWIGTYPE_p_swss__Selectable, SWIG_POINTER_OWN);
    SWIG_Python_AppendOutput($result, temp);
}

gives me a crash. As I think because python garbage collects returned Selectable. which we have only one:

root@05ad98abefdd:/src/sonic-swss-common# pytest -s -v tests
================================================================== test session starts ==================================================================
platform linux2 -- Python 2.7.13, pytest-3.0.6, py-1.4.32, pluggy-0.4.0 -- /usr/bin/python
cachedir: .cache
rootdir: /src/sonic-swss-common, inifile:
collected 7 items

tests/test_redis_ut.py::test_ProducerTable PASSED
tests/test_redis_ut.py::test_ProducerStateTable PASSED
tests/test_redis_ut.py::test_Table PASSED
tests/test_redis_ut.py::test_SubscriberStateTable *** Error in `/usr/bin/python': free(): invalid pointer: 0x000055cae23be25f ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bfb)[0x7ffbb2441bfb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76fc6)[0x7ffbb2447fc6]
/lib/x86_64-linux-gnu/libc.so.6(+0x7780e)[0x7ffbb244880e]
/usr/lib/x86_64-linux-gnu/libhiredis.so.0.14(redisFree+0x24)[0x7ffbb087a264]
/usr/lib/x86_64-linux-gnu/libswsscommon.so.0(_ZN4swss11DBConnectorD1Ev+0x10)[0x7ffbb02f6140]
/usr/lib/x86_64-linux-gnu/libswsscommon.so.0(_ZN4swss11RedisSelectD0Ev+0x28)[0x7ffbb031cf08]
/usr/lib/python2.7/dist-packages/swsscommon/_swsscommon.so(+0x2ac60)[0x7ffbb0573c60]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyObject_CallFunctionObjArgs+0x169)[0x55cadec4f799]
/usr/lib/python2.7/dist-packages/swsscommon/_swsscommon.so(+0x15100)[0x7ffbb055e100]
/usr/bin/python(+0x1351b6)[0x55cadec941b6]
/usr/bin/python(PyEval_EvalCodeEx+0x60c)[0x55cadec58dcc]
/usr/bin/python(+0x116778)[0x55cadec75778]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyEval_EvalFrameEx+0x7170)[0x55cadec613a0]
/usr/bin/python(PyEval_EvalCodeEx+0x235)[0x55cadec589f5]
/usr/bin/python(+0x1165be)[0x55cadec755be]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyEval_EvalFrameEx+0x7170)[0x55cadec613a0]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalCodeEx+0x235)[0x55cadec589f5]
/usr/bin/python(+0x1165be)[0x55cadec755be]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(+0x12ce1e)[0x55cadec8be1e]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyEval_CallObjectWithKeywords+0x30)[0x55cadec64a10]
/usr/bin/python(PyInstance_New+0x7c)[0x55cadec789ec]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyEval_EvalFrameEx+0x612f)[0x55cadec6035f]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalCodeEx+0x235)[0x55cadec589f5]
/usr/bin/python(PyEval_EvalFrameEx+0x6948)[0x55cadec60b78]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalCodeEx+0x235)[0x55cadec589f5]
/usr/bin/python(+0x116778)[0x55cadec75778]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(+0x12ce1e)[0x55cadec8be1e]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(+0x18b827)[0x55cadecea827]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyEval_EvalFrameEx+0x612f)[0x55cadec6035f]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalCodeEx+0x235)[0x55cadec589f5]
/usr/bin/python(+0x1165be)[0x55cadec755be]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyEval_EvalFrameEx+0x7170)[0x55cadec613a0]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalCodeEx+0x235)[0x55cadec589f5]
/usr/bin/python(+0x1165be)[0x55cadec755be]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(+0x12ce1e)[0x55cadec8be1e]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyEval_CallObjectWithKeywords+0x30)[0x55cadec64a10]
/usr/bin/python(PyInstance_New+0x7c)[0x55cadec789ec]
/usr/bin/python(PyObject_Call+0x43)[0x55cadec470c3]
/usr/bin/python(PyEval_EvalFrameEx+0x612f)[0x55cadec6035f]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalCodeEx+0x235)[0x55cadec589f5]
/usr/bin/python(PyEval_EvalFrameEx+0x6948)[0x55cadec60b78]
/usr/bin/python(PyEval_EvalFrameEx+0x5f1f)[0x55cadec6014f]
/usr/bin/python(PyEval_EvalCodeEx+0x235)[0x55cadec589f5]
======= Memory map: ========
55cadeb5f000-55cadee81000 r-xp 00000000 fd:00 2285114                    /usr/bin/python2.7
55cadf081000-55cadf083000 r--p 00322000 fd:00 2285114                    /usr/bin/python2.7
55cadf083000-55cadf0fa000 rw-p 00324000 fd:00 2285114                    /usr/bin/python2.7
55cadf0fa000-55cadf11d000 rw-p 00000000 00:00 0
55cae05a4000-55cae2bab000 rw-p 00000000 00:00 0                          [heap]
7ffba8000000-7ffba8021000 rw-p 00000000 00:00 0
7ffba8021000-7ffbac000000 ---p 00000000 00:00 0
7ffbaeedf000-7ffbaeef5000 r-xp 00000000 fd:00 2246245                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffbaeef5000-7ffbaf0f4000 ---p 00016000 fd:00 2246245                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffbaf0f4000-7ffbaf0f5000 r--p 00015000 fd:00 2246245                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffbaf0f5000-7ffbaf0f6000 rw-p 00016000 fd:00 2246245                    /lib/x86_64-linux-gnu/libgcc_s.so.1
7ffbaf0f6000-7ffbaf268000 r-xp 00000000 fd:00 2247260                    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
7ffbaf268000-7ffbaf468000 ---p 00172000 fd:00 2247260                    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
7ffbaf468000-7ffbaf472000 r--p 00172000 fd:00 2247260                    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
7ffbaf472000-7ffbaf474000 rw-p 0017c000 fd:00 2247260                    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
7ffbaf474000-7ffbaf478000 rw-p 00000000 00:00 0
7ffbaf478000-7ffbaf498000 r-xp 00000000 fd:00 2267185                    /lib/x86_64-linux-gnu/libnl-3.so.200.26.0
7ffbaf498000-7ffbaf697000 ---p 00020000 fd:00 2267185                    /lib/x86_64-linux-gnu/libnl-3.so.200.26.0
7ffbaf697000-7ffbaf699000 r--p 0001f000 fd:00 2267185                    /lib/x86_64-linux-gnu/libnl-3.so.200.26.0
7ffbaf699000-7ffbaf69a000 rw-p 00021000 fd:00 2267185                    /lib/x86_64-linux-gnu/libnl-3.so.200.26.0
7ffbaf69a000-7ffbaf715000 r-xp 00000000 fd:00 2267590                    /usr/lib/x86_64-linux-gnu/libnl-route-3.so.200.26.0
7ffbaf715000-7ffbaf914000 ---p 0007b000 fd:00 2267590                    /usr/lib/x86_64-linux-gnu/libnl-route-3.so.200.26.0
7ffbaf914000-7ffbaf917000 r--p 0007a000 fd:00 2267590                    /usr/lib/x86_64-linux-gnu/libnl-route-3.so.200.26.0
7ffbaf917000-7ffbaf91c000 rw-p 0007d000 fd:00 2267590                    /usr/lib/x86_64-linux-gnu/libnl-route-3.so.200.26.0
7ffbaf91c000-7ffbaf91f000 rw-p 00000000 00:00 0
7ffbaf91f000-7ffbaf935000 r-xp 00000000 fd:00 2267464                    /usr/lib/x86_64-linux-gnu/libnl-nf-3.so.200.26.0
7ffbaf935000-7ffbafb35000 ---p 00016000 fd:00 2267464                    /usr/lib/x86_64-linux-gnu/libnl-nf-3.so.200.26.0
7ffbafb35000-7ffbafb36000 r--p 00016000 fd:00 2267464                    /usr/lib/x86_64-linux-gnu/libnl-nf-3.so.200.26.0
7ffbafb36000-7ffbafb38000 rw-p 00017000 fd:00 2267464                    /usr/lib/x86_64-linux-gnu/libnl-nf-3.so.200.26.0
7ffbafb38000-7ffbafb3d000 r-xp 00000000 fd:00 2267391                    /lib/x86_64-linux-gnu/libnl-genl-3.so.200.26.0
7ffbafb3d000-7ffbafd3d000 ---p 00005000 fd:00 2267391                    /lib/x86_64-linux-gnu/libnl-genl-3.so.200.26.0
7ffbafd3d000-7ffbafd3e000 r--p 00005000 fd:00 2267391                    /lib/x86_64-linux-gnu/libnl-genl-3.so.200.26.0
7ffbafd3e000-7ffbafd3f000 rw-p 00006000 fd:00 2267391                    /lib/x86_64-linux-gnu/libnl-genl-3.so.200.26.0
7ffbafd3f000-7ffbb0034000 r-xp 00000000 fd:00 2290456                    /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
7ffbb0034000-7ffbb0234000 ---p 002f5000 fd:00 2290456                    /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
7ffbb0234000-7ffbb0236000 r--p 002f5000 fd:00 2290456                    /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
7ffbb0236000-7ffbb02ad000 rw-p 002f7000 fd:00 2290456                    /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
7ffbb02ad000-7ffbb02d1000 rw-p 00000000 00:00 0
7ffbb02d1000-7ffbb0346000 r-xp 00000000 fd:00 2267558                    /usr/lib/x86_64-linux-gnu/libswsscommon.so.0.0.0
7ffbb0346000-7ffbb0545000 ---p 00075000 fd:00 2267558                    /usr/lib/x86_64-linux-gnu/libswsscommon.so.0.0.0
7ffbb0545000-7ffbb0547000 r--p 00074000 fd:00 2267558                    /usr/lib/x86_64-linux-gnu/libswsscommon.so.0.0.0
7ffbb0547000-7ffbb0548000 rw-p 00076000 fd:00 2267558                    /usr/lib/x86_64-linux-gnu/libswsscommon.so.0.0.0
7ffbb0548000-7ffbb0549000 rw-p 00000000 00:00 0
7ffbb0549000-7ffbb05be000 r-xp 00000000 fd:00 2290849                    /usr/lib/python2.7/dist-packages/swsscommon/_swsscommon.so.0.0.0
7ffbb05be000-7ffbb07bd000 ---p 00075000 fd:00 2290849                    /usr/lib/python2.7/dist-packages/swsscommon/_swsscommon.so.0.0.0
7ffbb07bd000-7ffbb07be000 r--p 00074000 fd:00 2290849                    /usr/lib/python2.7/dist-packages/swsscommon/_swsscommon.so.0.0.0
7ffbb07be000-7ffbb07c5000 rw-p 00075000 fd:00 2290849                    /usr/lib/python2.7/dist-packages/swsscommon/_swsscommon.so.0.0.0
7ffbb07c5000-7ffbb0845000 rw-p 00000000 00:00 0
7ffbb0875000-7ffbb0884000 r-xp 00000000 fd:00 2267112                    /usr/lib/x86_64-linux-gnu/libhiredis.so.0.14
7ffbb0884000-7ffbb0a83000 ---p 0000f000 fd:00 2267112                    /usr/lib/x86_64-linux-gnu/libhiredis.so.0.14
7ffbb0a83000-7ffbb0a84000 r--p 0000e000 fd:00 2267112                    /usr/lib/x86_64-linux-gnu/libhiredis.so.0.14
7ffbb0a84000-7ffbb0a85000 rw-p 0000f000 fd:00 2267112                    /usr/lib/x86_64-linux-gnu/libhiredis.so.0.14
7ffbb0a85000-7ffbb0a95000 r-xp 00000000 fd:00 2285365                    /usr/lib/python2.7/lib-dynload/_testcapi.x86_64-linux-gnu.so
7ffbb0a95000-7ffbb0c94000 ---p 00010000 fd:00 2285365                    /usr/lib/python2.7/lib-dynload/_testcapi.x86_64-linux-gnu.so
7ffbb0c94000-7ffbb0c95000 r--p 0000f000 fd:00 2285365                    /usr/lib/python2.7/lib-dynload/_testcapi.x86_64-linux-gnu.so
7ffbb0c95000-7ffbb0c97000 rw-p 00010000 fd:00 2285365                    /usr/lib/python2.7/lib-dynload/_testcapi.x86_64-linux-gnu.so
7ffbb0c97000-7ffbb0d97000 rw-p 00000000 00:00 0
7ffbb0d97000-7ffbb0d9b000 r-xp 00000000 fd:00 2285380                    /usr/lib/python2.7/lib-dynload/termios.x86_64-linux-gnu.so
7ffbb0d9b000-7ffbb0f9a000 ---p 00004000 fd:00 2285380                    /usr/lib/python2.7/lib-dynload/termios.x86_64-linux-gnu.so
7ffbb0f9a000-7ffbb0f9b000 r--p 00003000 fd:00 2285380                    /usr/lib/python2.7/lib-dynload/termios.x86_64-linux-gnu.so
7ffbb0f9b000-7ffbb0f9d000 rw-p 00004000 fd:00 2285380                    /usr/lib/python2.7/lib-dynload/termios.x86_64-linux-gnu.so
7ffbb0f9d000-7ffbb0fdd000 rw-p 00000000 00:00 0
7ffbb0fdd000-7ffbb0fed000 r-xp 00000000 fd:00 2285359                    /usr/lib/python2.7/lib-dynload/_json.x86_64-linux-gnu.so
7ffbb0fed000-7ffbb11ec000 ---p 00010000 fd:00 2285359                    /usr/lib/python2.7/lib-dynload/_json.x86_64-linux-gnu.so
7ffbb11ec000-7ffbb11ed000 r--p 0000f000 fd:00 2285359                    /usr/lib/python2.7/lib-dynload/_json.x86_64-linux-gnu.so
7ffbb11ed000-7ffbb11ee000 rw-p 00010000 fd:00 2285359                    /usr/lib/python2.7/lib-dynload/_json.x86_64-linux-gnu.so
7ffbb11ee000-7ffbb15ae000 rw-p 00000000 00:00 0
7ffbb15e8000-7ffbb1768000 rw-p 00000000 00:00 0
7ffbb1768000-7ffbb177f000 r-xp 00000000 fd:00 2285364                    /usr/lib/python2.7/lib-dynload/_ssl.x86_64-linux-gnu.so
7ffbb177f000-7ffbb197e000 ---p 00017000 fd:00 2285364                    /usr/lib/python2.7/lib-dynload/_ssl.x86_64-linux-gnu.so
7ffbb197e000-7ffbb197f000 r--p 00016000 fd:00 2285364                    /usr/lib/python2.7/lib-dynload/_ssl.x86_64-linux-gnu.so
7ffbb197f000-7ffbb1983000 rw-p 00017000 fd:00 2285364                    /usr/lib/python2.7/lib-dynload/_ssl.x86_64-linux-gnu.so
7ffbb1983000-7ffbb1bef000 r-xp 00000000 fd:00 2255577                    /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
7ffbb1bef000-7ffbb1dee000 ---p 0026c000 fd:00 2255577                    /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
7ffbb1dee000-7ffbb1e0c000 r--p 0026b000 fd:00 2255577                    /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
7ffbb1e0c000-7ffbb1e1a000 rw-p 00289000 fd:00 2255577                    /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
7ffbb1e1a000-7ffbb1e1d000 rw-p 00000000 00:00 0
7ffbb1e1d000-7ffbb1e80000 r-xp 00000000 fd:00 2255578                    /usr/lib/x86_64-linux-gnu/libssl.so.1.1
7ffbb1e80000-7ffbb207f000 ---p 00063000 fd:00 2255578                    /usr/lib/x86_64-linux-gnu/libssl.so.1.1
7ffbb207f000-7ffbb2083000 r--p 00062000 fd:00 2255578                    /usr/lib/x86_64-linux-gnu/libssl.so.1.1
7ffbb2083000-7ffbb2089000 rw-p 00066000 fd:00 2255578                    /usr/lib/x86_64-linux-gnu/libssl.so.1.1
7ffbb2089000-7ffbb208e000 r-xp 00000000 fd:00 2285357                    /usr/lib/python2.7/lib-dynload/_hashlib.x86_64-linux-gnu.so
7ffbb208e000-7ffbb228e000 ---p 00005000 fd:00 2285357                    /usr/lib/python2.7/lib-dynload/_hashlib.x86_64-linux-gnu.so
7ffbb228e000-7ffbb228f000 r--p 00005000 fd:00 2285357                    /usr/lib/python2.7/lib-dynload/_hashlib.x86_64-linux-gnu.so
7ffbb228f000-7ffbb2290000 rw-p 00006000 fd:00 2285357                    /usr/lib/python2.7/lib-dynload/_hashlib.x86_64-linux-gnu.so
7ffbb2290000-7ffbb23d1000 rw-p 00000000 00:00 0
7ffbb23d1000-7ffbb2566000 r-xp 00000000 fd:00 2246221                    /lib/x86_64-linux-gnu/libc-2.24.so
7ffbb2566000-7ffbb2766000 ---p 00195000 fd:00 2246221                    /lib/x86_64-linux-gnu/libc-2.24.so
7ffbb2766000-7ffbb276a000 r--p 00195000 fd:00 2246221                    /lib/x86_64-linux-gnu/libc-2.24.so
7ffbb276a000-7ffbb276c000 rw-p 00199000 fd:00 2246221                    /lib/x86_64-linux-gnu/libc-2.24.so
7ffbb276c000-7ffbb2770000 rw-p 00000000 00:00 0
7ffbb2770000-7ffbb2873000 r-xp 00000000 fd:00 2246255                    /lib/x86_64-linux-gnu/libm-2.24.so
7ffbb2873000-7ffbb2a72000 ---p 00103000 fd:00 2246255                    /lib/x86_64-linux-gnu/libm-2.24.so
7ffbb2a72000-7ffbb2a73000 r--p 00102000 fd:00 2246255                    /lib/x86_64-linux-gnu/libm-2.24.so
7ffbb2a73000-7ffbb2a74000 rw-p 00103000 fd:00 2246255                    /lib/x86_64-linux-gnu/libm-2.24.so
7ffbb2a74000-7ffbb2a8d000 r-xp 00000000 fd:00 2246440                    /lib/x86_64-linux-gnu/libz.so.1.2.8
7ffbb2a8d000-7ffbb2c8c000 ---p 00019000 fd:00 2246440                    /lib/x86_64-linux-gnu/libz.so.1.2.8
7ffbb2c8c000-7ffbb2c8d000 r--p 00018000 fd:00 2246440                    /lib/x86_64-linux-gnu/libz.so.1.2.8
7ffbb2c8d000-7ffbb2c8e000 rw-p 00019000 fd:00 2246440                    /lib/x86_64-linux-gnu/libz.so.1.2.8
7ffbb2c8e000-7ffbb2c90000 r-xp 00000000 fd:00 2246397                    /lib/x86_64-linux-gnu/libutil-2.24.so
7ffbb2c90000-7ffbb2e8f000 ---p 00002000 fd:00 2246397                    /lib/x86_64-linux-gnu/libutil-2.24.so
7ffbb2e8f000-7ffbb2e90000 r--p 00001000 fd:00 2246397                    /lib/x86_64-linux-gnu/libutil-2.24.so
7ffbb2e90000-7ffbb2e91000 rw-p 00002000 fd:00 2246397                    /lib/x86_64-linux-gnu/libutil-2.24.so
7ffbb2e91000-7ffbb2e94000 r-xp 00000000 fd:00 2246237                    /lib/x86_64-linux-gnu/libdl-2.24.so
7ffbb2e94000-7ffbb3093000 ---p 00003000 fd:00 2246237                    /lib/x86_64-linux-gnu/libdl-2.24.so
7ffbb3093000-7ffbb3094000 r--p 00002000 fd:00 2246237                    /lib/x86_64-linux-gnu/libdl-2.24.so
7ffbb3094000-7ffbb3095000 rw-p 00003000 fd:00 2246237                    /lib/x86_64-linux-gnu/libdl-2.24.so
7ffbb3095000-7ffbb30ad000 r-xp 00000000 fd:00 2246324                    /lib/x86_64-linux-gnu/libpthread-2.24.so
7ffbb30ad000-7ffbb32ac000 ---p 00018000 fd:00 2246324                    /lib/x86_64-linux-gnu/libpthread-2.24.so
7ffbb32ac000-7ffbb32ad000 r--p 00017000 fd:00 2246324                    /lib/x86_64-linux-gnu/libpthread-2.24.so
7ffbb32ad000-7ffbb32ae000 rw-p 00018000 fd:00 2246324                    /lib/x86_64-linux-gnu/libpthread-2.24.so
7ffbb32ae000-7ffbb32b2000 rw-p 00000000 00:00 0
7ffbb32b2000-7ffbb32d5000 r-xp 00000000 fd:00 2246201                    /lib/x86_64-linux-gnu/ld-2.24.so
7ffbb32da000-7ffbb33da000 rw-p 00000000 00:00 0
7ffbb340b000-7ffbb34cf000 rw-p 00000000 00:00 0
7ffbb34d4000-7ffbb34d5000 rw-p 00000000 00:00 0
7ffbb34d5000-7ffbb34d6000 r--p 00023000 fd:00 2246201                    /lib/x86_64-linux-gnu/ld-2.24.so
7ffbb34d6000-7ffbb34d7000 rw-p 00024000 fd:00 2246201                    /lib/x86_64-linux-gnu/ld-2.24.so
7ffbb34d7000-7ffbb34d8000 rw-p 00000000 00:00 0
7fffc342c000-7fffc344d000 rw-p 00000000 00:00 0                          [stack]
7fffc3513000-7fffc3516000 r--p 00000000 00:00 0                          [vvar]
7fffc3516000-7fffc3518000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted (core dumped)
``

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lguohan I read swig documentation in many places + some source code, but not everything. As I understood from the documentation SWIG needs to have a mechanism to garbage collect objects, which are created inside of C++ method, were given outside, and it is user responsibility to remove the objects. To address such objects with python swig introduced *_OWN flag. So by documentation and sources I read I think we definitely don't need *_OWN flag, because we don't want the selectabl object to be removed, after we receive it from Select.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why I wrote "In my logic", because I didn't dive deep enough for me when I investigated this issue. I didn't run gdb or read stacktrace. I just did series of experiments and implemented the refcnt by myself. But I still don't understand the whole code. I don't understand when we can get $result which are not python list? I still don't understand that.

%typemap(argout) swss::Selectable ** {
    PyObject* temp = NULL;
    if (!PyList_Check($result)) {
        temp = $result;
        $result = PyList_New(1);
        PyList_SetItem($result, 0, temp);
    }

Copy link
Contributor

@lguohan lguohan May 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the compiled code in swsscommon_wrap.cpp, it looks like result will not be PyList.

  arg3 = static_cast< int >(val3);
  result = (int)(arg1)->select(arg2,arg3);
  resultobj = SWIG_From_int(static_cast< int >(result));
  {
    PyObject* temp = NULL;
    if (!PyList_Check(resultobj)) {
      temp = resultobj;
      resultobj = PyList_New(1);
      PyList_SetItem(resultobj, 0, temp);
    }
    temp = SWIG_NewPointerObj(*arg2, SWIGTYPE_p_swss__Selectable, SWIG_POINTER_OWN);
    PyList_Append(resultobj, temp);
  }
  return resultobj;
fail:
  return NULL;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right, swig should not own the pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lguohan
Sorry. I wasn't clear enough. At the first, when I started working with the problem and I reproduce the issue, I found that Selectable object is leaking. Then I found the code, which maps from c++ to python. Play with the code to make sure that this code caused the issue.
Then I played with the code and found that I need to modify ref_cnt. I played with refcnt, and I got crashes. I read documentation and found that *_OWN flag could be the problem. I checked the documentation, sources and another projects and tried with 0 instead of OWN flag and found that it works and I don't see the issue anymore.
But the code I had was ugly, I can't explain it. After that I start looking in to different examples on github and found swig_python_append_output. I didn't find the document for the function but I found the source code and it did the same as my ugly code. So I replace my code with the functions and got this patch with only two lines replaced (one of the partially).
After that I tested the solution thrously for some time.

@lguohan
Copy link
Contributor

lguohan commented May 24, 2020

can you turn your sample code into unit test? we have test_redis_ut.py in tests folder.

@lguohan
Copy link
Contributor

lguohan commented May 27, 2020

retest this please

@lguohan lguohan merged commit 5a32636 into sonic-net:master May 27, 2020
lguohan pushed a commit that referenced this pull request May 27, 2020
We have a memory leak in pyext. As result ledd, lddpd, and pmon could use a lot of memory.

I read swig documentation in many places + some source code, but not everything.
As I understood from the documentation SWIG needs to have a mechanism to garbage
collect objects, which are created inside of C++ method, were given outside, and
it is user responsibility to remove the objects. To address such objects with
python swig introduced *_OWN flag. So by documentation and sources I read I think
we definitely don't need *_OWN flag, because we don't want the selectabl object
to be removed, after we receive it from Select.
lguohan pushed a commit that referenced this pull request May 27, 2020
We have a memory leak in pyext. As result ledd, lddpd, and pmon could use a lot of memory.

I read swig documentation in many places + some source code, but not everything. As I understood from the documentation SWIG needs to have a mechanism to garbage collect objects, which are created inside of C++ method, were given outside, and it is user responsibility to remove the objects. To address such objects with python swig introduced *_OWN flag. So by documentation and sources I read I think we definitely don't need *_OWN flag, because we don't want the selectabl object to be removed, after we receive it from Select.
@pavel-shirshov pavel-shirshov deleted the pavelsh/fix_memory_leak branch May 27, 2020 22:00
prgeor pushed a commit to prgeor/sonic-swss-common that referenced this pull request Feb 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants