Report disconnection event immediately when hanging up a call#2582
Report disconnection event immediately when hanging up a call#2582
Conversation
| * will be immediately reported to app and the call slot can be reused. | ||
| * The only exception is if media transport creation is in progress | ||
| * during call setup. In this case, the hangup will be pending upon | ||
| * the completion of the media transport. |
There was a problem hiding this comment.
For the exception condition, will the app still be notified immediately?
There was a problem hiding this comment.
No, the app will be notified upon completion of media transport.
There was a problem hiding this comment.
Why is that?
Not sure if this is relevant. With trickle ICE, media transport completion may be reported immediately, while internally ICE is actually still gathering candidates. So for this case, which completion will the pjsua-lib wait for?
There was a problem hiding this comment.
It is the current behavior of call hangup, i.e. to delay the hangup until media transport creation completes:
https://github.com/pjsip/pjproject/blob/master/pjsip/src/pjsua-lib/pjsua_call.c#L2760
The reason this behavior is not changed is because of the following complexity:
In order to support immediate hangup, we move the current pjsua_var.call instance into another variable pjsua_var.hangup_call and handle the hangup process from this new variable. For media related callbacks however, since currently we are using &call.media[i] as the user data, this will point to the old call media instance. It is possible to overcome this issue, by:
- Creating the user data in the heap (using dialog's pool?), perhaps a struct of { dialog and media index }.
- Modifying
pjsua_media.ccallbacks to use this new user data.
But there is still the complexity of race condition, i.e. since the media transport creation is async, there is a possibility that the media callback will happen at the same time with call hangup. This may require us to hold dialog lock in all media related callbacks. So I'm not sure if the potentially major modifications required is worth the advantage, i.e. to be able to immediately hangup the call when async med tp creation is in progress in the beginning of the call (again, note that sync media reinit in the middle of the call should be unaffected by this).
As for trickle ICE, I believe it will be upon the report of completion, because the hangup will proceed in pjsua_call.c's on_make/incoming_call_med_tp_complete(). I'm not sure if this can cause any complications though, since trickle ICE is a new feature.
There was a problem hiding this comment.
As the ticket is about immediate hangup notification and handle the rest in background, I am thinking that media transport creation completion should be part of things to be done in background.
Re: &call.media[i] user data in media callbacks, have not checked the patch details, but perhaps the problem raises because the call instance may now be reused, right? What if it cannot be reused until the hangup process completed, e.g: simply add a flag hangingup in pjsua_call and alloc_call_id() filters out call slot with that flag set. This may be a completely different approach compared to this current patch (which seems to move the hanging up call instance to pjsua_var.hangup_call, cmiiw).
Re: trickle ICE, from PJSUA point of view, create_ice_media_transport() will set call_med->tp_ready to PJ_SUCCESS (instead of PJ_EPENDING as in current behavior). So perhaps PJSUA will try to destroy the call_med->tp immediately on call hangup, may be safe though.
There was a problem hiding this comment.
A few notes when trying to implement the new approach:
In general, there's a problem with the requirement in PJSUA-LIB should report disconnection event immediately after pjsua_call_hangup() is called #1049: cease any PJSUA call related callback once call is declared to be disconnected. More about this below.
on_stream_destroyed()will be called after the DISCONNECTEDon_call_state(). But then, so is the current behaviour. Except that since we now report the DISCONNECTED state much earlier, the delay can be longer between the two callbacks. Trying to callon_stream_destroyed()earlier poses two problems:
- The media will already be deactivated upon call disconnection, resulting in call dump not showing any useful end-of-call statistics. But this is minor and the workaround is probably to do the call dump before the media deinit.
Ic.
- The bigger problem is the async media transport creation, so no matter what, we still need to wait for it to complete.
If on_stream_created() is not called yet (currently this should be the case when med tp is not completed yet), we can skip `on_stream_destroyed()' perhaps?
- How about
on_call_tsx_state()? In Callback on_tsx_state_changed() not invoked on call disconnection #1627 (Callback on_tsx_state_changed() not invoked on call disconnection), the workaround is to call it beforeon_call_state(DISCONNECTED), but since we now report the DISCONNECTED state earlier inpjsua_call_hangup(), we don't have thepjsip_event.
Perhaps we can invoke the callback using user event?
There was a problem hiding this comment.
- The media will already be deactivated upon call disconnection, resulting in call dump not showing any useful end-of-call statistics. But this is minor and the workaround is probably to do the call dump before the media deinit.
Calling pjsua_call_dump() in the application's on_stream_destroyed() seems to solve this issue.
If
on_stream_created()is not called yet (currently this should be the case when med tp is not completed yet), we can skip `on_stream_destroyed()' perhaps?
Yes.
- How about
on_call_tsx_state()? In Callback on_tsx_state_changed() not invoked on call disconnection #1627 (Callback on_tsx_state_changed() not invoked on call disconnection), the workaround is to call it beforeon_call_state(DISCONNECTED), but since we now report the DISCONNECTED state earlier inpjsua_call_hangup(), we don't have thepjsip_event.
Perhaps we can invoke the callback using user event?
on_call_tsx_state() also has pjsip_transaction * parameter. Passing NULL may cause application to crash.
There was a problem hiding this comment.
on_call_tsx_state()also haspjsip_transaction *parameter. Passing NULL may cause application to crash.
Ah okay.
For immediate hangup, perhaps we should just skip on_call_tsx_state() altogether as any tsx state change, if any, happens in background. And the docs should explicitly specify this.
There was a problem hiding this comment.
About calling pjsua_call_dump() in on_stream_destroyed().
With this change, if we receive the hang up, the call_dump appears as "DISCONNCTD", but if we are the ones who hang up the call, it appears as "CONFIRMED".
Before, in on_call_state, in all cases the call dump appears as "DISCONNCTD".
There was a problem hiding this comment.
Will continue this discussion in the relevant PR #2600 instead.
|
Closing this. Used a different approach in #2600. |
To continue the work in #1049 and close it.
Add a new API
pjsua_call_hangup2()which allows user the option to immediately hangup a call. The call slot can then be reused almost immediately while the hangup process itself will be handled in the background. The only exception is if the asynchronous media transport creation is in progress during initial call setup. In this case, the hangup will be pending upon the completion of the media transport. Note that any media transport (re-)creation, such as during reinit media, once the call has been confirmed will not pend the hangup process, since the media tp recreation will be synchronous while acquiring dialog lock.Application will not receive any PJSUA call related callback once call is declared to be disconnected.
Retry creating and sending BYE after a period of
CALL_HANGUP_RETRY_INTERVALif it fails. If this keeps happening afterCALL_HANGUP_MAX_RETRYtimes, we will hard-terminate the call, i.e. cleanup the call without sending any BYE to the remote.