Skip to content

Report disconnection event immediately when hanging up a call#2582

Closed
sauwming wants to merge 2 commits intomasterfrom
call-hangup
Closed

Report disconnection event immediately when hanging up a call#2582
sauwming wants to merge 2 commits intomasterfrom
call-hangup

Conversation

@sauwming
Copy link
Copy Markdown
Member

To continue the work in #1049 and close it.

  • Add a new API pjsua_call_hangup2() which allows user the option to immediately hangup a call. The call slot can then be reused almost immediately while the hangup process itself will be handled in the background. The only exception is if the asynchronous media transport creation is in progress during initial call setup. In this case, the hangup will be pending upon the completion of the media transport. Note that any media transport (re-)creation, such as during reinit media, once the call has been confirmed will not pend the hangup process, since the media tp recreation will be synchronous while acquiring dialog lock.

  • Application will not receive any PJSUA call related callback once call is declared to be disconnected.

  • Retry creating and sending BYE after a period of CALL_HANGUP_RETRY_INTERVAL if it fails. If this keeps happening after CALL_HANGUP_MAX_RETRY times, we will hard-terminate the call, i.e. cleanup the call without sending any BYE to the remote.

* will be immediately reported to app and the call slot can be reused.
* The only exception is if media transport creation is in progress
* during call setup. In this case, the hangup will be pending upon
* the completion of the media transport.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the exception condition, will the app still be notified immediately?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the app will be notified upon completion of media transport.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that?

Not sure if this is relevant. With trickle ICE, media transport completion may be reported immediately, while internally ICE is actually still gathering candidates. So for this case, which completion will the pjsua-lib wait for?

Copy link
Copy Markdown
Member Author

@sauwming sauwming Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the current behavior of call hangup, i.e. to delay the hangup until media transport creation completes:
https://github.com/pjsip/pjproject/blob/master/pjsip/src/pjsua-lib/pjsua_call.c#L2760

The reason this behavior is not changed is because of the following complexity:
In order to support immediate hangup, we move the current pjsua_var.call instance into another variable pjsua_var.hangup_call and handle the hangup process from this new variable. For media related callbacks however, since currently we are using &call.media[i] as the user data, this will point to the old call media instance. It is possible to overcome this issue, by:

  • Creating the user data in the heap (using dialog's pool?), perhaps a struct of { dialog and media index }.
  • Modifying pjsua_media.c callbacks to use this new user data.

But there is still the complexity of race condition, i.e. since the media transport creation is async, there is a possibility that the media callback will happen at the same time with call hangup. This may require us to hold dialog lock in all media related callbacks. So I'm not sure if the potentially major modifications required is worth the advantage, i.e. to be able to immediately hangup the call when async med tp creation is in progress in the beginning of the call (again, note that sync media reinit in the middle of the call should be unaffected by this).

As for trickle ICE, I believe it will be upon the report of completion, because the hangup will proceed in pjsua_call.c's on_make/incoming_call_med_tp_complete(). I'm not sure if this can cause any complications though, since trickle ICE is a new feature.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the ticket is about immediate hangup notification and handle the rest in background, I am thinking that media transport creation completion should be part of things to be done in background.

Re: &call.media[i] user data in media callbacks, have not checked the patch details, but perhaps the problem raises because the call instance may now be reused, right? What if it cannot be reused until the hangup process completed, e.g: simply add a flag hangingup in pjsua_call and alloc_call_id() filters out call slot with that flag set. This may be a completely different approach compared to this current patch (which seems to move the hanging up call instance to pjsua_var.hangup_call, cmiiw).

Re: trickle ICE, from PJSUA point of view, create_ice_media_transport() will set call_med->tp_ready to PJ_SUCCESS (instead of PJ_EPENDING as in current behavior). So perhaps PJSUA will try to destroy the call_med->tp immediately on call hangup, may be safe though.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few notes when trying to implement the new approach:

  • In general, there's a problem with the requirement in PJSUA-LIB should report disconnection event immediately after pjsua_call_hangup() is called #1049: cease any PJSUA call related callback once call is declared to be disconnected. More about this below.

  • on_stream_destroyed() will be called after the DISCONNECTED on_call_state(). But then, so is the current behaviour. Except that since we now report the DISCONNECTED state much earlier, the delay can be longer between the two callbacks. Trying to call on_stream_destroyed() earlier poses two problems:

    • The media will already be deactivated upon call disconnection, resulting in call dump not showing any useful end-of-call statistics. But this is minor and the workaround is probably to do the call dump before the media deinit.

Ic.

  • The bigger problem is the async media transport creation, so no matter what, we still need to wait for it to complete.

If on_stream_created() is not called yet (currently this should be the case when med tp is not completed yet), we can skip `on_stream_destroyed()' perhaps?

Perhaps we can invoke the callback using user event?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The media will already be deactivated upon call disconnection, resulting in call dump not showing any useful end-of-call statistics. But this is minor and the workaround is probably to do the call dump before the media deinit.

Calling pjsua_call_dump() in the application's on_stream_destroyed() seems to solve this issue.

If on_stream_created() is not called yet (currently this should be the case when med tp is not completed yet), we can skip `on_stream_destroyed()' perhaps?

Yes.

  • How about on_call_tsx_state()? In Callback on_tsx_state_changed() not invoked on call disconnection #1627 (Callback on_tsx_state_changed() not invoked on call disconnection), the workaround is to call it before on_call_state(DISCONNECTED), but since we now report the DISCONNECTED state earlier in pjsua_call_hangup(), we don't have the pjsip_event.
    Perhaps we can invoke the callback using user event?

on_call_tsx_state() also has pjsip_transaction * parameter. Passing NULL may cause application to crash.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on_call_tsx_state() also has pjsip_transaction * parameter. Passing NULL may cause application to crash.

Ah okay.
For immediate hangup, perhaps we should just skip on_call_tsx_state() altogether as any tsx state change, if any, happens in background. And the docs should explicitly specify this.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About calling pjsua_call_dump() in on_stream_destroyed().
With this change, if we receive the hang up, the call_dump appears as "DISCONNCTD", but if we are the ones who hang up the call, it appears as "CONFIRMED".
Before, in on_call_state, in all cases the call dump appears as "DISCONNCTD".

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will continue this discussion in the relevant PR #2600 instead.

@sauwming
Copy link
Copy Markdown
Member Author

sauwming commented Dec 4, 2020

Closing this. Used a different approach in #2600.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PJSUA-LIB should report disconnection event immediately after pjsua_call_hangup() is called

3 participants