
Conversation

@aniketmaurya (Collaborator) commented May 25, 2025

What does this PR do?

Users will be able to run multiple APIs in a single server.

Part of #492

Example

from transformers import pipeline
import litserve as ls

class SentimentAnalysisAPI(ls.LitAPI):
    def setup(self, device):
        self.model = pipeline("sentiment-analysis", model="stevhliu/my_awesome_model", device=device)

    def decode_request(self, request: dict):
        return request["text"]

    def predict(self, text):
        return self.model(text)[0]

class TextGenerationAPI(ls.LitAPI):
    def setup(self, device):
        self.generator = pipeline("text-generation", model="gpt2", device=device)

    def decode_request(self, request: dict):
        return request["prompt"]

    def predict(self, prompt):
        return self.generator(prompt)[0]["generated_text"]

if __name__ == "__main__":
    sentiment_api = SentimentAnalysisAPI(api_path="/classify-text")
    chat_api = TextGenerationAPI(api_path="/v1/chat/completions")

    server = ls.LitServer([sentiment_api, chat_api])
    server.run(port=8000)
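With the server above running, each LitAPI is served at its own path and can be queried independently. A minimal client sketch using requests, assuming the paths and payload keys from the example above:

import requests

# Each endpoint decodes its own request schema (see decode_request above)
resp = requests.post("http://localhost:8000/classify-text", json={"text": "LitServe is great!"})
print(resp.json())

resp = requests.post("http://localhost:8000/v1/chat/completions", json={"prompt": "Once upon a time"})
print(resp.json())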
Before submitting
  • Was this discussed/agreed via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

- Introduced LitAPIV2 class for enhanced API configuration with parameters like api_path, stream, loop, and spec.
- Refactored LitServerV2 to accept LitAPI or a dictionary of LitAPI instances.
- Removed deprecated max_batch_size and batch_timeout parameters from LitServerV2.
- Updated pre_setup logic to handle streaming and specifications more effectively.
- Improved error handling for api_path and loop parameters.
- Adjusted response handling and logging for better clarity and maintainability.
- Enhanced LitAPI constructor to include api_path, stream, loop, and spec parameters.
- Updated LitServer to utilize the new LitAPI structure, ensuring proper handling of request_timeout and pre_setup logic.
- Improved error handling for api_path and loop parameters.
- Adjusted response handling and logging for better clarity and maintainability.
- Added a test for request timeout configuration in SlowLitAPI.
- Changed loop attribute to a private variable (_loop) with a property for lazy initialization.
- Implemented logic to retrieve the default loop when _loop is set to "auto".
- Updated test assertions to reference the correct attribute for response_queue_id in LitAPI.
- Added a check to ensure api_path starts with '/' and raises a ValueError for invalid paths.
- Updated test to reference the correct api_path attribute in LitServer for accurate assertions.
- Introduced a migration warning for deprecated parameters (api_path, stream, loop, spec) in LitServer.
- Updated LitAPI constructor to include new parameters for better configuration.
- Changed spec attribute to a private variable (_spec) in LitAPI for improved encapsulation.
- Enhanced error handling and logging for clarity during initialization.
- Adjusted response handling to accommodate new structure and maintain backward compatibility.
… request timeout configuration in test_batch.py.
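Taken together, the constructor and property changes listed above amount to roughly the following sketch. This is illustrative only: get_default_loop is a hypothetical stand-in for whatever helper resolves the default loop.

class LitAPI:
    def __init__(self, api_path="/predict", stream=False, loop="auto", spec=None):
        # Invalid paths fail fast at construction time
        if not api_path.startswith("/"):
            raise ValueError(f"api_path must start with '/', got {api_path!r}")
        self._api_path = api_path
        self.stream = stream
        self._loop = loop  # resolved lazily via the property below
        self._spec = spec  # private for encapsulation, as noted above

    @property
    def loop(self):
        # "auto" is replaced with the default loop on first access
        if self._loop == "auto":
            self._loop = get_default_loop()  # hypothetical helper
        return self._loop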
…ement

- Introduced _LitAPIConnector class to handle multiple LitAPI instances and streamline pre-setup and request timeout configuration.
- Updated LitServer to use _LitAPIConnector for managing LitAPI instances, enhancing readability and maintainability.
- Removed redundant checks and improved error handling related to request timeouts and streaming capabilities.
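The connector's implementation is not shown in this thread; a minimal sketch of the idea, assuming it simply normalizes the input and fans operations out over every LitAPI instance:

class _LitAPIConnector:
    def __init__(self, lit_apis):
        # Accept a single LitAPI or a list of them
        self.lit_apis = lit_apis if isinstance(lit_apis, list) else [lit_apis]

    def pre_setup(self):
        for lit_api in self.lit_apis:
            lit_api.pre_setup()

    def set_request_timeout(self, timeout):
        for lit_api in self.lit_apis:
            lit_api.request_timeout = timeout

    def set_logger_queue(self, queue):
        # Hypothetical per-API hook for logger queue assignment
        for lit_api in self.lit_apis:
            lit_api.set_logger_queue(queue)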
@aniketmaurya aniketmaurya changed the title Refactor LitServer to utilize _LitAPIConnector for improved API management Support multiple LitAPIs for inference process May 25, 2025
@aniketmaurya aniketmaurya changed the title Support multiple LitAPIs for inference process Support multiple LitAPIs for inference process and endpoints May 25, 2025
Base automatically changed from aniket/multiple-endpoints to main May 26, 2025 14:47
…of optional specifications

- Updated pre_setup methods in LitAPI and related loops to accept an optional spec parameter, defaulting to the instance's spec if not provided.
- Enhanced the pre_setup logic in _LitAPIConnector and LitServer to streamline initialization and improve readability.
- Removed redundant spec handling in various loop implementations, ensuring consistency across the codebase.
- Updated the wrap_litserve_start function to initialize workers_setup_status and request_queue using the transport manager.
- Modified launch_inference_worker to accept lit_api as a parameter, ensuring better integration with the server's API specifications.
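A hedged sketch of the signature change described above, where the spec argument falls back to the instance's own spec (the spec.pre_setup hook is an assumption):

def pre_setup(self, spec=None):
    # Default to the spec configured on this LitAPI instance
    spec = spec if spec is not None else self._spec
    if spec is not None:
        # Let the spec hook into this API before workers start
        spec.pre_setup(self)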
… API integration

- Introduced _init_manager method in LitServer to streamline worker setup and enhance readability.
- Updated run_batched_loop and __call__ methods in BatchedLoop to utilize lit_api.spec, ensuring better integration with API specifications.
- Adjusted test assertions to reflect changes in launch_inference_worker, improving test accuracy and maintainability.
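A rough sketch of what an _init_manager helper like the one above might centralize; the exact fields are assumptions:

import multiprocessing as mp

class LitServer:
    def _init_manager(self):
        # Create shared multiprocessing state in one place
        manager = mp.Manager()
        self.workers_setup_status = manager.dict()
        self.request_queue = manager.Queue()
        return manager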
… instance

- Modified the instantiation of LitServer in the test to pass the OpenAIEmbeddingSpec directly to TestEmbedAPI, improving clarity and ensuring proper configuration for the test case.
@ddiddi commented May 27, 2025

Thank you for your efforts! @aniketmaurya @bhimrazy

…nt and API integration

- Removed unnecessary manager return from launch_inference_worker in LitServer, simplifying the function's interface.
- Updated wrap_litserve_start to reflect changes in launch_inference_worker, enhancing clarity in worker initialization.
- Adjusted inference worker setup to utilize endpoint-specific identifiers for better tracking of worker statuses.
- Enhanced LitSpec setup method to connect response and request queues directly from the server, improving integration.
- Changed data_streamer to a static method in LitServer, enhancing its accessibility without requiring an instance.
- Updated references in OpenAISpec to utilize the new static method, improving code clarity and consistency in accessing the data streamer functionality.
@aniketmaurya aniketmaurya requested a review from Copilot May 27, 2025 10:10
Copilot AI (Contributor) left a comment

Pull Request Overview

Supports running multiple LitAPI instances per server by introducing a connector class, refactoring how timeouts, logging, and endpoints are wired, and updating tests to match the new method signatures.

  • Added _LitAPIConnector to centralize pre-setup, timeout checks, and logger queue assignment for one or more LitAPI instances.
  • Refactored LitServer to delegate setup, worker launching, and endpoint registration through the connector, and updated queue/timeout handling.
  • Updated tests in several modules to align with the new signatures (e.g., removing lit_spec args, changing launch_inference_worker parameters).

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

Summary per file:
  • tests/test_openai_embedding.py: Updated LitServer instantiation to pass spec into the API.
  • tests/test_loops.py: Removed deprecated lit_spec and stream parameters from loops.
  • tests/test_litapi.py: Swapped launch_inference_worker(1) for _init_manager + new arg.
  • tests/test_lit_server.py: Adjusted assertions to expect launch_inference_worker(server.lit_api).
  • tests/test_batch.py: Dropped redundant lit_api_mock from run_batched_loop call.
  • src/litserve/utils.py: Reworked wrap_litserve_start to use the new manager/queue setup.
  • src/litserve/specs/openai_embedding.py: Replaced self._server references with direct spec fields.
  • src/litserve/specs/openai.py: Unified stream vs. non-stream endpoint queue access via spec fields.
  • src/litserve/specs/base.py: Initialized response_buffer, request_queue, response_queue_id on specs.
  • src/litserve/server.py: Introduced _LitAPIConnector; overhauled pre-setup, worker launch, endpoint registration.
  • src/litserve/loops/streaming_loops.py: Adjusted loop signatures to remove external lit_spec arg.
  • src/litserve/loops/simple_loops.py: Moved lit_spec defaulting inside loop logic and removed stream arg.
  • src/litserve/loops/loops.py: Updated inference_worker to derive lit_spec and remove old args.
  • src/litserve/loggers.py: Changed logger setup to use litapi_connector.set_logger_queue.
  • src/litserve/api.py: Added default pass in setup, adjusted spec type hints.
Comments suppressed due to low confidence (2)

tests/test_lit_server.py:249

  • [nitpick] There’s no test covering the multi-API scenario. Add a test that initializes LitServer with a list of LitAPI instances to verify all are launched correctly.
server.launch_inference_worker.assert_called_with(server.lit_api)

src/litserve/loops/streaming_loops.py:39

  • [nitpick] The lit_spec parameter is positioned at the end here but appeared earlier in other loop signatures. For consistency, align parameter ordering across loop methods.
def run_streaming_loop(self, lit_api: LitAPI, request_queue: Queue, transport: MessageTransport, callback_runner: CallbackRunner, lit_spec: Optional[LitSpec] = None):

pre-commit-ci bot and others added 6 commits May 27, 2025 12:12
- Introduced a dedicated request queue for each unique LitAPI endpoint in LitServer, improving request handling.
- Updated launch_inference_worker to utilize the new request queue structure, enhancing clarity and maintainability.
- Modified wrap_litserve_start to support multiple endpoints, ensuring proper initialization of inference workers for each API.
- Improved code readability by replacing direct queue references with a helper method to retrieve the appropriate request queue.
- Changed api_path attribute in LitAPI to a private variable and added a property for better encapsulation.
- Updated LitServer to retrieve request queues using the new api_path property, enhancing clarity in request handling.
- Initialized api_path in LitSpec subclasses to set default endpoint paths, ensuring consistent API routing.
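A rough sketch of the per-endpoint queue wiring these commits describe; queue creation details are assumptions, and lit_apis stands for the list of registered LitAPI instances:

from multiprocessing import Manager

manager = Manager()
# One request queue per unique endpoint path
request_queues = {lit_api.api_path: manager.Queue() for lit_api in lit_apis}

def _get_request_queue(api_path):
    # Helper that replaces direct queue references throughout the server
    return request_queues[api_path]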
codecov bot commented May 27, 2025

Codecov Report

Attention: Patch coverage is 95.73171% with 7 lines in your changes missing coverage. Please review.

Project coverage is 86%. Comparing base (dbec61b) to head (630cffe).
Report is 1 commit behind head on main.

Additional details and impacted files
@@         Coverage Diff         @@
##           main   #513   +/-   ##
===================================
  Coverage    86%    86%           
===================================
  Files        37     37           
  Lines      2418   2505   +87     
===================================
+ Hits       2080   2165   +85     
- Misses      338    340    +2     

@aniketmaurya aniketmaurya marked this pull request as ready for review May 27, 2025 14:49
@Borda (Collaborator) commented May 27, 2025

TODOs:

  • Detect duplicate endpoint paths
  • Don't allow mixing streaming and non-streaming endpoints

Is this for this PR or the next one?

- Added a private method _detect_path_collision to LitServer to check for duplicate api_path values among registered LitAPI instances.
- Introduced tests to validate behavior when multiple endpoints share the same path, ensuring proper error handling for reserved paths.
- Introduced a private method _check_mixed_streaming_configuration to validate that all endpoints have a consistent streaming configuration.
- Added a test to ensure proper error handling when mixed streaming settings are provided for multiple endpoints, raising a ValueError for inconsistent configurations.
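Both checks named above are straightforward scans over the registered APIs. A hedged sketch, with the method bodies as assumptions (the real methods are private members of LitServer):

def _detect_path_collision(lit_apis):
    # Raise when two endpoints share the same path
    seen = set()
    for lit_api in lit_apis:
        if lit_api.api_path in seen:
            raise ValueError(f"api_path {lit_api.api_path!r} is already in use")
        seen.add(lit_api.api_path)

def _check_mixed_streaming_configuration(lit_apis):
    # All endpoints must agree on streaming; mixing raises a ValueError
    if len({lit_api.stream for lit_api in lit_apis}) > 1:
        raise ValueError("All endpoints must use the same streaming configuration")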
@aniketmaurya (Collaborator, Author) commented

These were TODOs in this PR itself, @Borda! I just updated to resolve them.

@aniketmaurya aniketmaurya requested a review from Copilot May 27, 2025 15:55
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR refactors the LitServe codebase to support running multiple LitAPI instances on a single server, improving how endpoints are registered and how inference workers are launched. Key changes include updating test cases to pass specs via the LitAPI constructor, centralizing API endpoint management using a new _LitAPIConnector, and removing redundant parameters in loop methods to rely directly on lit_api.spec.

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Summary per file:
  • tests/test_openai_embedding.py: Updated LitServer instantiation to pass spec via the LitAPI constructor.
  • tests/test_multiple_endpoints.py: Added tests for handling multiple endpoints with distinct API paths.
  • tests/test_loops.py: Adjusted loop function calls to remove deprecated spec parameters.
  • src/litserve/*: Refactored server, API, and loop components to support multiple LitAPIs and improve endpoint registration.
Comments suppressed due to low confidence (1)

src/litserve/api.py:92

  • [nitpick] The introduction of the _api_path attribute along with the api_path property (which defers to spec.api_path when available) could lead to ambiguity. Consider consolidating the source of truth for the API path to avoid confusion during future maintenance.
self._api_path = api_path
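The dual source of truth flagged here can be pictured with a sketch like this (simplified; the getattr fallback is an assumption):

class LitAPI:
    @property
    def api_path(self):
        # Defer to the spec's path when a spec is attached; otherwise use
        # the value passed to the constructor
        if self._spec is not None and getattr(self._spec, "api_path", None):
            return self._spec.api_path
        return self._api_path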

@aniketmaurya aniketmaurya merged commit 46720e4 into main May 27, 2025
21 checks passed
@aniketmaurya aniketmaurya deleted the endpoint-2 branch May 27, 2025 17:44
@ywh-my commented Jun 3, 2025

Thanks for your hard work! I cannot find the branch "endpoint-2". Are you still developing it?

@aniketmaurya (Collaborator, Author) commented

hi @ywh-my, it has been merged into the main branch. You can install the latest version of LitServe to use this feature. Here are the docs: https://lightning.ai/docs/litserve/features/multiple-endpoints

@ywh-my commented Jun 4, 2025

Thank you!!!!
