Find the latency in async mode #481

@aniketmaurya

🐛 Bug

There appears to be a fixed latency when processing concurrent requests: the total time for 1 request is nearly the same as for 1000 concurrent requests.

1 request: 6.7 seconds

1000 concurrent requests: 7.122 seconds
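
One way to check whether similar totals like these come from a fixed per-request wait rather than from compute is to time a stand-in handler under both loads. The sketch below does not use LitServe at all: `fake_handler`, `run_batch`, `measure`, and the `FIXED_DELAY_S` value are hypothetical names and numbers chosen for illustration, and the fixed delay is only an assumed cause, not a confirmed one.

```python
import asyncio
import time

# Hypothetical fixed per-request overhead; not a measured LitServe value.
FIXED_DELAY_S = 0.05


async def fake_handler(request_id: int) -> float:
    """Stand-in for a server call: waits a fixed delay, returns its own latency."""
    start = time.perf_counter()
    await asyncio.sleep(FIXED_DELAY_S)
    return time.perf_counter() - start


async def run_batch(n: int) -> tuple[float, float]:
    """Fire n requests concurrently; return (total wall time, mean latency)."""
    start = time.perf_counter()
    latencies = await asyncio.gather(*(fake_handler(i) for i in range(n)))
    total = time.perf_counter() - start
    return total, sum(latencies) / n


def measure(n: int) -> tuple[float, float]:
    """Synchronous wrapper so the measurement can be called from plain code."""
    return asyncio.run(run_batch(n))


if __name__ == "__main__":
    for n in (1, 1000):
        total, mean_latency = measure(n)
        print(f"{n} request(s): total={total:.3f}s, mean latency={mean_latency:.3f}s")
```

If the real server shows the same pattern (total time roughly flat as the request count grows, with per-request latency close to the total), the time is probably spent in a fixed wait rather than in per-request work. Reporting per-request latency alongside the total makes that distinction visible.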

cc: @bhimrazy

Labels: bug, help wanted