Feature/bert inference #2
Conversation
Pull Request Overview
This PR adds a new inference benchmark (412.language-bert) that performs sentence classification using a compact BERT model served via ONNX Runtime. The benchmark is added to the serverless benchmarks suite and integrates with the existing testing infrastructure. Additionally, the PR updates the benchmark data repository URL to a forked version.
- Added new BERT-based language inference benchmark for sentence classification (a minimal sketch of the inference flow appears after this list)
- Updated benchmark data repository references from spcl/serverless-benchmarks-data to McLavish/serverless-benchmarks-data-dphpc
- Enhanced mypy configuration to ignore missing imports for docker submodules
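For orientation, here is a minimal sketch of the kind of tokenize-and-classify flow such a benchmark runs. The file paths, input tensor names, and sample sentence are illustrative assumptions, not the PR's actual values; exported BERT models commonly expose `input_ids`, `attention_mask`, and `token_type_ids` inputs.

```python
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer

# Hypothetical artifact paths; the real benchmark downloads and unpacks its own model archive.
tokenizer = Tokenizer.from_file("model/tokenizer.json")
session = ort.InferenceSession("model/bert.onnx", providers=["CPUExecutionProvider"])

encoding = tokenizer.encode("Serverless BERT inference in a nutshell.")
feeds = {
    # Input names depend on how the model was exported to ONNX.
    "input_ids": np.array([encoding.ids], dtype=np.int64),
    "attention_mask": np.array([encoding.attention_mask], dtype=np.int64),
    "token_type_ids": np.array([encoding.type_ids], dtype=np.int64),
}
logits = session.run(None, feeds)[0]  # shape: (1, num_classes)
print("predicted class:", int(np.argmax(logits, axis=-1)[0]))
```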
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| sebs/regression.py | Adds "412.language-bert" to the Python benchmarks list for regression testing |
| install.py | Updates git clone URL to forked benchmark data repository |
| .gitmodules | Updates submodule URL to forked benchmark data repository |
| docs/benchmarks.md | Adds documentation entry and description for the new language inference benchmark |
| benchmarks/400.inference/412.language-bert/python/requirements.txt* | Defines Python dependencies (numpy, onnxruntime-gpu, tokenizers) for multiple Python versions |
| benchmarks/400.inference/412.language-bert/python/package.sh | Provides packaging script for stripping unnecessary files and handling torch dependencies |
| benchmarks/400.inference/412.language-bert/python/init.sh | Initialization script (no additional setup required) |
| benchmarks/400.inference/412.language-bert/python/function.py | Main benchmark implementation with model loading, tokenization, and inference logic |
| benchmarks/400.inference/412.language-bert/input.py | Input generation and file upload utilities for the benchmark |
| benchmarks/400.inference/412.language-bert/config.json | Benchmark configuration specifying timeout, memory, languages, and required modules (illustrative shape sketched below the table) |
| .mypy.ini | Adds ignore rule for docker.* submodules to prevent type checking errors |
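As a point of reference for the config.json row above, SeBS benchmark configurations take roughly the following shape; the keys come from the row's own description, while the values (and the "storage" module) are illustrative placeholders, not the settings shipped in this PR:

```json
{
  "timeout": 120,
  "memory": 2048,
  "languages": ["python"],
  "modules": ["storage"]
}
```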
```python
with tarfile.open(archive_path, "r:gz") as tar:
    tar.extractall(MODEL_DIRECTORY)
```
Copilot AI commented on Nov 5, 2025:
Using tar.extractall() without validation is vulnerable to path traversal attacks. Malicious archives could extract files outside the intended directory. Use tar.extractall(MODEL_DIRECTORY, filter='data') (Python 3.12+) or manually validate each member's path before extraction for older Python versions.
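A minimal sketch of the mitigation the review describes, assuming the snippet's `archive_path` and `MODEL_DIRECTORY`; the helper name is hypothetical. It prefers the stdlib `'data'` filter and falls back to manual path validation on interpreters that predate it:

```python
import os
import tarfile

def safe_extractall(archive_path: str, dest: str) -> None:
    with tarfile.open(archive_path, "r:gz") as tar:
        try:
            # Python 3.12+ (and security backports): the 'data' filter rejects
            # absolute paths, "..", escaping symlinks, and special files.
            tar.extractall(dest, filter="data")
        except TypeError:
            # Older interpreters lack the filter argument: validate that every
            # member resolves to a path under the destination before extracting.
            root = os.path.realpath(dest)
            for member in tar.getmembers():
                target = os.path.realpath(os.path.join(dest, member.name))
                if os.path.commonpath([root, target]) != root:
                    raise RuntimeError(f"Blocked path traversal: {member.name}")
            tar.extractall(dest)

# usage: safe_extractall(archive_path, MODEL_DIRECTORY)
```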
I don't care
| if "CUDAExecutionProvider" not in available: | ||
| raise RuntimeError(f"CUDAExecutionProvider unavailable (have: {available})") | ||
|
|
||
| _session = ort.InferenceSession(onnx_path, providers=["CUDAExecutionProvider"]) |
Copilot AI commented on Nov 5, 2025:
The code requires CUDAExecutionProvider but the benchmark uses onnxruntime-gpu. This creates a hard dependency on GPU availability, which may fail in CPU-only environments. Consider falling back to CPUExecutionProvider if CUDA is unavailable, or document this GPU requirement clearly in the benchmark configuration. The existing 411.image-recognition benchmark uses CPU-only inference for broader compatibility.
| if "CUDAExecutionProvider" not in available: | |
| raise RuntimeError(f"CUDAExecutionProvider unavailable (have: {available})") | |
| _session = ort.InferenceSession(onnx_path, providers=["CUDAExecutionProvider"]) | |
| if "CUDAExecutionProvider" in available: | |
| providers = ["CUDAExecutionProvider"] | |
| print("Using CUDAExecutionProvider for ONNX Runtime inference.") | |
| else: | |
| providers = ["CPUExecutionProvider"] | |
| print("CUDAExecutionProvider unavailable, falling back to CPUExecutionProvider for ONNX Runtime inference.") | |
| _session = ort.InferenceSession(onnx_path, providers=providers) |
I don't care
```diff
@@ -0,0 +1,3 @@
+numpy==1.24.4
+onnxruntime-gpu==1.16.3
```
Copilot AI commented on Nov 5, 2025:
Using onnxruntime-gpu requires CUDA dependencies and may not be compatible with all serverless environments. Consider using onnxruntime (the CPU version) for better portability across different cloud platforms, or provide separate CPU and GPU requirement files. The check on lines 63-64 of function.py enforces the GPU requirement, but many serverless platforms don't provide GPU access by default.
Suggested change:

```diff
-onnxruntime-gpu==1.16.3
+onnxruntime==1.16.3
```
I don't care
```bash
TORCH_DIR=".python_packages/lib/site-packages/torch"
if [ -d "$1/${TORCH_DIR}" ]; then
    cd $1
    zip -qr torch.zip ${TORCH_DIR}
    rm -rf ${TORCH_DIR}
    cd ${CUR_DIR}
    echo "Torch-zipped size $(du -sh $1 | cut -f1)"
fi
```
Copilot AI commented on Nov 5, 2025:
This benchmark packages torch despite not using PyTorch in its requirements or implementation (uses ONNX Runtime instead). The torch packaging logic appears to be copied from the 411.image-recognition benchmark but is unnecessary here. Consider removing lines 28-35 to avoid confusion and reduce package size.
Suggested change (remove the block):

```diff
-TORCH_DIR=".python_packages/lib/site-packages/torch"
-if [ -d "$1/${TORCH_DIR}" ]; then
-    cd $1
-    zip -qr torch.zip ${TORCH_DIR}
-    rm -rf ${TORCH_DIR}
-    cd ${CUR_DIR}
-    echo "Torch-zipped size $(du -sh $1 | cut -f1)"
-fi
```
NO
```diff
 # clone
 else:
-    execute(f"git clone https://github.com/spcl/serverless-benchmarks-data.git {data_dir}")
+    execute(f"git clone https://github.com/McLavish/serverless-benchmarks-data-dphpc.git {data_dir}")
```
Copilot AI commented on Nov 5, 2025:
The repository URL change from spcl/serverless-benchmarks-data to McLavish/serverless-benchmarks-data-dphpc suggests this is a fork for a specific project (dphpc). Using a personal fork in production code may cause maintenance issues if the fork becomes outdated or unavailable. Consider using the official repository or clearly documenting why this fork is necessary.
| execute(f"git clone https://github.com/McLavish/serverless-benchmarks-data-dphpc.git {data_dir}") | |
| execute(f"git clone https://github.com/spcl/serverless-benchmarks-data.git {data_dir}") |
who
```diff
 [submodule "benchmarks-data"]
 	path = benchmarks-data
-	url = https://github.com/spcl/serverless-benchmarks-data.git
+	url = https://github.com/McLavish/serverless-benchmarks-data-dphpc.git
```
Copilot AI commented on Nov 5, 2025:
The submodule URL change to a personal fork (McLavish/serverless-benchmarks-data-dphpc) may cause synchronization issues with the main project. If this fork contains essential data for the new benchmark, consider contributing it back to the upstream repository or documenting the dependency clearly in the project README.
asked
First implementation of an LLM Language GPU Benchmark