Skip to content

Conversation

@mattbit
Copy link
Member

@mattbit mattbit commented Aug 21, 2023

No description provided.

@linear
Copy link

linear bot commented Aug 21, 2023

GSK-1589 Infinite loop in analytics (mixpanel)

When api.mixpanel.com is not reachable:

  • analytics makes millions of connection attempts in the background
  • analytics threads block the python process exit
  • analytics eats lots of CPU after just importing giskard, and forever in the background as it desperately tries to connect to mixpanel

How to reproduce: block connections to api.mixpanel.com and run import giskard.

Issues

analytics_method uses analytics.track but analytics.track is wrapped by analytics_method. This generates an infinite loop.

One solution can be moving the current track method to a private _track, then define the public track as:

    analytics_method
    def track(self, event_name, properties=None, meta=None, force=False):
        return self._track(event_name, properties=properties, meta=meta, force=force)

So that the _track can still be used without the analytics_method wrapping. Thus in the decorator we can call the unwrapped _track

try:
    analytics._track("tracking error", {"error": str(e)})
except BaseException:  # NOSONAR
    pass

This avoids falling in the infinite loop.

But there is a second problem: we launch analytics._track in a thread, so the try-catch does not really work (the call will return immediately, the thread will run in the background and has no access to the context anymore). The thread exceptions will be handled by threading.excepthook that we currently overwrite to track the errors, causing again an infinite loop…

The whole threading code should be revised. Maybe the higher level concurrent.futures.ThreadPoolExecutor can provide a better way to handle this (with futures collection at shutdown inside the GiskardAnalyticsCollector instance).

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

77.8% 77.8% Coverage
0.0% 0.0% Duplication

@Inokinoki Inokinoki changed the title Analytics infinite loop [GSK-1589] Analytics infinite loop Sep 4, 2023
@Hartorn Hartorn force-pushed the task/GSK-1589-analytics-infinite-loop branch from b0e6203 to f4a95e1 Compare September 28, 2023 08:19
@Hartorn Hartorn marked this pull request as ready for review September 28, 2023 08:21
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@Hartorn Hartorn self-assigned this Sep 28, 2023
@Hartorn
Copy link
Member

Hartorn commented Sep 28, 2023

Using report error filter out the exception only when it's Giskard one.
If connection to mix panel fails, it will not raise a giskard exception so it will no be caught again

Copy link
Member Author

@mattbit mattbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Hartorn ok for me, just ensure you have the pre-commit hook installed to run black & ruff.

(Can’t approve because I contributed to the PR)



def _report_error(e):
def _report_error(e, error_type = "python error"):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code style

@Hartorn Hartorn merged commit 41df70b into main Sep 28, 2023
@Hartorn Hartorn deleted the task/GSK-1589-analytics-infinite-loop branch September 28, 2023 09:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants