Writes to MongoDB in actionrunner are slow and CPU intensive with large documents #4798

@jdmeyer3

SUMMARY

When an action returns ~4MB of data, the actionrunner takes ~20 seconds to write each document to MongoDB. During the writes, CPU utilization spikes to 100%.

STACKSTORM VERSION

St2 v3.1.0

OS: CentOS 7.7.1908 Kernel 3.10.0-957.el7.x86_64
Kubernetes v1.14.1
Docker 19.03.2
Base Docker Image: CentOS 7.6.1810 (custom image)

Steps to reproduce the problem

In a Python 3 pack, create a Python action with the following code:

#!/usr/bin/env python3

import json

import requests
from st2common.runners.base_action import Action


class RequestLargeDataset(Action):
    """
    Retrieves ~4MB of json data
    """

    def run(self, **kwargs):
        dataset = requests.get('https://data.ct.gov/api/views/rybz-nyjw/rows.json?accessType=DOWNLOAD')
        data = json.loads(dataset.text)
        return data
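
For reproducing without network access, a payload of similar size can be generated locally. This is a minimal sketch, not part of the original report: `make_payload` and its parameters are hypothetical names, and the row shape is only a rough imitation of a rows-style JSON dataset.

```python
import json


def make_payload(target_bytes=4 * 1024 * 1024, row_width=20):
    """Build a nested dict whose JSON encoding is at least target_bytes long.

    Hypothetical helper: the row layout is an assumption, not the real dataset.
    """
    rows = []
    size = 0
    i = 0
    while size < target_bytes:
        row = [i, "row-%d" % i] + ["value"] * row_width
        rows.append(row)
        # +2 approximates the ", " separator json.dumps puts between rows.
        size += len(json.dumps(row)) + 2
        i += 1
    return {"meta": {"rows": len(rows)}, "data": rows}


payload = make_payload()
encoded_size = len(json.dumps(payload))
print("rows: %d, encoded size: %.2f MB"
      % (len(payload["data"]), encoded_size / 1024.0 / 1024.0))
```

Returning `make_payload()` from the action's `run` method instead of the downloaded data should exercise the same write path, assuming the slowdown depends on document size rather than on this specific dataset.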

Expected Results

Low CPU utilization, with the result returned within a relatively short time (under 5 seconds).

Actual Results

CPU utilization spiked to 100% for 20-30 seconds.

I added some log statements around st2actions.container.base.py:296:

    def _update_status(self, liveaction_id, status, result, context):
        try:
            LOG.error("updating liveaction")
            LOG.debug('Setting status: %s for liveaction: %s', status, liveaction_id)
            liveaction_db, state_changed = self._update_live_action_db(
                liveaction_id, status, result, context)
            LOG.error("done updating liveaction")
        except Exception as e:
            LOG.exception(
                'Cannot update liveaction '
                '(id: %s, status: %s, result: %s).' % (
                    liveaction_id, status, result)
            )
            raise e

        try:
            LOG.error('updating execution')
            executions.update_execution(liveaction_db, publish=state_changed)
            LOG.error('done updating execution')
            extra = {'liveaction_db': liveaction_db}
            LOG.debug('Updated liveaction after run', extra=extra)
        except Exception as e:
            LOG.exception(
                'Cannot update action execution for liveaction '
                '(id: %s, status: %s, result: %s).' % (
                    liveaction_id, status, result)
            )
            raise e

        return liveaction_db

With that in place, I get logs like this:

2019-10-02 18:17:23,854 ERROR [-] updating liveaction
2019-10-02 18:17:28,963 ERROR [-] done updating liveaction
2019-10-02 18:17:28,964 ERROR [-] updating execution
2019-10-02 18:17:33,268 ERROR [-] done updating execution

In this sample, updating the liveaction takes about 5 seconds and updating the execution takes about another 4 seconds.
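
The per-step durations can be checked by subtracting the timestamps in the sample log lines. The sketch below does only that arithmetic. `parse_ts` and `elapsed_seconds` are illustrative helper names, and the fixed 23-character timestamp prefix is an assumption based on the lines shown.

```python
from datetime import datetime

# The four sample log lines from the report, copied verbatim.
LOG_LINES = """\
2019-10-02 18:17:23,854 ERROR [-] updating liveaction
2019-10-02 18:17:28,963 ERROR [-] done updating liveaction
2019-10-02 18:17:28,964 ERROR [-] updating execution
2019-10-02 18:17:33,268 ERROR [-] done updating execution
""".splitlines()


def parse_ts(line):
    # The first 23 characters hold the timestamp, e.g. "2019-10-02 18:17:23,854".
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def elapsed_seconds(start_line, end_line):
    return (parse_ts(end_line) - parse_ts(start_line)).total_seconds()


liveaction_s = elapsed_seconds(LOG_LINES[0], LOG_LINES[1])
execution_s = elapsed_seconds(LOG_LINES[2], LOG_LINES[3])

print("liveaction update: %.3f s" % liveaction_s)  # 5.109 s
print("execution update:  %.3f s" % execution_s)   # 4.304 s
```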

Additionally, I ran cProfile against the actionrunner and saw the following:
[image: cProfile output]

It looks like the CPU is spending most of its time building the mongoengine document object. This may be related to MongoEngine/mongoengine#1230.
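
For anyone wanting to collect a comparable profile, here is a minimal sketch of profiling a single call with the standard library's cProfile and pstats. `build_document` is a hypothetical stand-in workload, not StackStorm or mongoengine code, and `profile_call` is an illustrative helper name. In a real run, the profiled callable would be the code path that performs the update.

```python
import cProfile
import io
import pstats


def build_document(payload):
    """Hypothetical stand-in for the expensive step (not StackStorm code)."""
    return {key: [str(item) for item in values] for key, values in payload.items()}


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


# Synthetic input: 100 fields with 1000 integers each.
payload = {"field_%d" % i: list(range(1000)) for i in range(100)}
result, report = profile_call(build_document, payload)
print(report)
```

Sorting by cumulative time puts the outermost expensive calls at the top, which makes it easier to see whether time is going into object construction or into the database round trip.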
