fix: resolve issue #131 - test Improve get_dataframe stub to return empty cu#132
fix: resolve issue #131 - test Improve get_dataframe stub to return empty cu#132mangodxd wants to merge 1 commit into
Conversation
…aframe stub to return empty cu
Reviewer's guide (collapsed on small PRs)Reviewer's GuideAdjusts tensor data handling in the ModelMesh parser so that missing raw tensor data now results in an empty pandas DataFrame instead of unimplemented errors, resolving issue #131. File-Level Changes
Assessment against linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
📝 WalkthroughWalkthroughThe parser's Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~5 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (3 warnings)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The
_get_tensor_datafunction is annotated to returnnp.ndarray, but now sometimes returns apd.DataFrame; consider either preserving a consistent return type (e.g., an emptynp.ndarray) or updating type hints and all call sites to avoid subtle type bugs. - Swallowing the
NotImplementedErrorand returning an empty structure may mask real issues; consider at least logging a warning or using a more explicit error/return path so callers can distinguish between 'no data' and 'unsupported/raw contents not implemented'.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `_get_tensor_data` function is annotated to return `np.ndarray`, but now sometimes returns a `pd.DataFrame`; consider either preserving a consistent return type (e.g., an empty `np.ndarray`) or updating type hints and all call sites to avoid subtle type bugs.
- Swallowing the `NotImplementedError` and returning an empty structure may mask real issues; consider at least logging a warning or using a more explicit error/return path so callers can distinguish between 'no data' and 'unsupported/raw contents not implemented'.
## Individual Comments
### Comment 1
<location path="src/service/data/modelmesh_parser.py" line_range="174" />
<code_context>
+ return ModelMeshPayloadParser._extract_tensor_data(tensor, tensor_type)
+ elif hasattr(request_obj, "raw_input_contents") and request_obj.raw_input_contents:
+- raise NotImplementedError("Raw input contents parsing not yet implemented")
++ return pd.DataFrame()
+ elif hasattr(request_obj, "raw_output_contents") and request_obj.raw_output_contents:
+- raise NotImplementedError("Raw output contents parsing not yet implemented")
</code_context>
<issue_to_address>
**issue (bug_risk):** Returning an empty DataFrame may mask missing or malformed input data.
This path used to raise `NotImplementedError`, clearly signaling that raw input contents were unsupported. Returning `pd.DataFrame()` turns that into a silent no-op, making it impossible to distinguish between “no data” and “unsupported/malformed input,” and also conflicts with the function’s `np.ndarray` return type. Please either keep an explicit exception (or a more specific one), or introduce logging and a distinct sentinel (e.g., `None`) so callers can reliably detect this case.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| return ModelMeshPayloadParser._extract_tensor_data(tensor, tensor_type) | ||
| elif hasattr(request_obj, "raw_input_contents") and request_obj.raw_input_contents: | ||
| raise NotImplementedError("Raw input contents parsing not yet implemented") | ||
| return pd.DataFrame() |
There was a problem hiding this comment.
issue (bug_risk): Returning an empty DataFrame may mask missing or malformed input data.
This path used to raise NotImplementedError, clearly signaling that raw input contents were unsupported. Returning pd.DataFrame() turns that into a silent no-op, making it impossible to distinguish between “no data” and “unsupported/malformed input,” and also conflicts with the function’s np.ndarray return type. Please either keep an explicit exception (or a more specific one), or introduce logging and a distinct sentinel (e.g., None) so callers can reliably detect this case.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/service/data/modelmesh_parser.py`:
- Around line 173-176: The fallback branches in _get_tensor_data currently
return pd.DataFrame() for request_obj.raw_input_contents and
raw_output_contents, causing a return-type regression because callers (e.g., the
code that calls .reshape(shape) on _get_tensor_data's result) expect an
np.ndarray; change those returns to return an empty numpy array (e.g.,
np.array([]) or np.empty(0)) so the function conforms to its np.ndarray
annotation and reshape calls won't raise AttributeError, and ensure numpy (np)
is imported/available in modelmesh_parser.py; keep the checks on
hasattr(request_obj, "raw_input_contents") and "raw_output_contents" as-is, only
replacing pd.DataFrame() with an appropriate np.ndarray.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d41e7a51-4511-4b85-a4b8-d5666c9f8d36
📒 Files selected for processing (2)
ai_fix.patchsrc/service/data/modelmesh_parser.py
| elif hasattr(request_obj, "raw_input_contents") and request_obj.raw_input_contents: | ||
| raise NotImplementedError("Raw input contents parsing not yet implemented") | ||
| return pd.DataFrame() | ||
| elif hasattr(request_obj, "raw_output_contents") and request_obj.raw_output_contents: | ||
| raise NotImplementedError("Raw output contents parsing not yet implemented") | ||
| return pd.DataFrame() |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify the type mismatch and ndarray-only usage around the changed logic.
rg -n -C3 'def _get_tensor_data|return pd\.DataFrame\(\)|_get_tensor_data\(|\.reshape\(' src/service/data/modelmesh_parser.py
# Expected:
# - `_get_tensor_data` raw-content branches return `pd.DataFrame()`
# - Callers invoke `.reshape(...)` on the returned value
# This confirms the contract/runtime mismatch.Repository: trustyai-explainability/trustyai-service
Length of output: 1694
🏁 Script executed:
# Check if DataFrame.reshape exists and how it differs from numpy
python3 << 'EOF'
import pandas as pd
import numpy as np
# Check if DataFrame has reshape
df = pd.DataFrame()
print("DataFrame has reshape:", hasattr(df, 'reshape'))
# Check numpy array reshape
arr = np.array([])
print("ndarray has reshape:", hasattr(arr, 'reshape'))
# Test what happens if we try to call reshape on DataFrame
try:
df.reshape((1, 1))
print("DataFrame.reshape() works")
except AttributeError as e:
print(f"DataFrame.reshape() raises AttributeError: {e}")
EOFRepository: trustyai-explainability/trustyai-service
Length of output: 203
🏁 Script executed:
# Search for all uses of _get_tensor_data return value
rg -n '_get_tensor_data' src/service/data/modelmesh_parser.pyRepository: trustyai-explainability/trustyai-service
Length of output: 372
🏁 Script executed:
# Check if there are any other uses or patterns that might assume DataFrame type
rg -n -B2 -A5 '_get_tensor_data\(' src/service/data/modelmesh_parser.pyRepository: trustyai-explainability/trustyai-service
Length of output: 1106
🏁 Script executed:
# Check imports at top of file to confirm pd and np usage
head -30 src/service/data/modelmesh_parser.py | grep -E "^import|^from"Repository: trustyai-explainability/trustyai-service
Length of output: 210
🏁 Script executed:
# Verify the complete context of both conditional branches
sed -n '169,178p' src/service/data/modelmesh_parser.pyRepository: trustyai-explainability/trustyai-service
Length of output: 693
Return-type regression in raw-content fallback
Lines 174 and 176 return pd.DataFrame() from _get_tensor_data, but the method is annotated to return np.ndarray and callers on lines 147 and 195 invoke .reshape(shape) on the result. Since pd.DataFrame does not have a .reshape() method, this causes an AttributeError at runtime when raw-content paths are exercised.
Proposed fix
- elif hasattr(request_obj, "raw_input_contents") and request_obj.raw_input_contents:
- return pd.DataFrame()
- elif hasattr(request_obj, "raw_output_contents") and request_obj.raw_output_contents:
- return pd.DataFrame()
+ elif hasattr(request_obj, "raw_input_contents") and request_obj.raw_input_contents:
+ return np.array([])
+ elif hasattr(request_obj, "raw_output_contents") and request_obj.raw_output_contents:
+ return np.array([])- if len(shape) > 1:
+ if len(shape) > 1 and getattr(data, "size", 0) > 0:
data = data.reshape(shape)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/service/data/modelmesh_parser.py` around lines 173 - 176, The fallback
branches in _get_tensor_data currently return pd.DataFrame() for
request_obj.raw_input_contents and raw_output_contents, causing a return-type
regression because callers (e.g., the code that calls .reshape(shape) on
_get_tensor_data's result) expect an np.ndarray; change those returns to return
an empty numpy array (e.g., np.array([]) or np.empty(0)) so the function
conforms to its np.ndarray annotation and reshape calls won't raise
AttributeError, and ensure numpy (np) is imported/available in
modelmesh_parser.py; keep the checks on hasattr(request_obj,
"raw_input_contents") and "raw_output_contents" as-is, only replacing
pd.DataFrame() with an appropriate np.ndarray.
Added error handling to prevent unhandled exceptions.
Changes
src/service/data/modelmesh_parser.pyTesting
Test with the reproduction steps from the issue to confirm the exception is now caught.
Fixes #131
Summary by Sourcery
Bug Fixes:
Summary by CodeRabbit