Conversation

@srikartondapu srikartondapu commented Oct 7, 2025

Summary

  • Add generic KServe detector actions supporting any model format
  • Support sequence classification, token classification, and binary detectors
  • Enable parallel detector execution via dynamic registry
  • Add KServeDetectorConfig to rails config schema
  • Configuration-driven detector management (no code rebuilds required)

Description

This PR introduces a generic integration framework for KServe-hosted detection models in NeMo Guardrails. The implementation enables teams to add, remove, or configure safety detectors purely through ConfigMap updates - eliminating the need for code changes or container rebuilds.

Key capabilities:

  • Format-agnostic parsing: Automatically handles binary classification, sequence classification, and token classification responses
  • Parallel execution: All configured detectors run simultaneously for minimal latency
  • Dynamic registry: Detectors are discovered from the kserve_detectors configuration at runtime
  • Production-validated: Tested on OpenShift with Granite Guardian (toxicity), jailbreak classifier, and Piiranha (PII detection)

Implementation:

  • Added nemoguardrails/library/kserve_detector/actions.py with generic detector actions
  • Modified nemoguardrails/rails/llm/config.py to support KServeDetectorConfig class and detector registry

This addresses the operational overhead of deploying custom detectors by making detector management configuration-driven rather than code-driven.

Related Issue(s)

Addresses the need for simplified custom detector integration without deployment overhead.

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable (will add README in PR comments).
  • I've added tests if applicable.
  • @RobGeada, @m-misiura

m-misiura and others added 12 commits July 23, 2025 14:27
…lity/GHAction

CI: Add publishing action for server image
…lity/nemo-server

added a Containerfile for the nemo server
…lity/containerfile-server

Ensure providers can be switched on/off at build time
- Add generic KServe detector actions supporting any model format
- Support sequence classification, token classification, and binary detectors
- Enable parallel detector execution via dynamic registry
- Add KServeDetectorConfig to rails config schema
- Configuration-driven detector management (no code rebuilds required)
```python
default=30,
description="HTTP request timeout in seconds"
)
detector_type: str = Field(
```

Can we remove the detector_type and risk_type fields, and instead just use the key of the kserve_detectors dict instead?

Author

Removed detector_type field - detector name now comes from dictionary key. Kept risk_type as optional since it's needed to distinguish system errors from content violations and allows semantic risk categorization.

- `timeout`: HTTP timeout in seconds
- `detector_type`: Detector identifier
- `risk_type`: Risk classification type
- `invert_logic`: Score inversion for reversed semantics

I think we'll need some more complex/more flexible logic for defining what is and is not a flagged response.

So, for example. we could have:

  1. Regressors <- either 0 or 1 could be the safe direction
  2. Boolean Classifiers <- either True or False could be safe
  3. Multi-class classifiers
    • No predicted classes -> no detection, some predicted classes -> detection
    • Class $SAFE is predicted -> no detection, any class $NOTSAFE -> detection

So I think we need to have some way of exposing this to the user. You can check out how we handled it in the existing guardrails detectors framework: https://github.com/trustyai-explainability/guardrails-detectors/blob/main/detectors/huggingface/detector.py#L228

Collaborator

I agree with @RobGeada; we should also avoid hardcoding safe labels like it is the case now

Author

Implemented safe_labels approach. Checks both integer indices and string labels, supports all classifier types (binary, multi-class, regressors). All hardcoded assumptions removed.
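For reference, a configuration along these lines could express the per-detector safe labels (a sketch only: `kserve_detectors`, `safe_labels`, `risk_type`, and `threshold` appear in this thread, but the endpoint URLs and exact nesting here are illustrative assumptions):

```yaml
rails:
  config:
    kserve_detectors:
      toxicity:
        endpoint: http://toxicity-detector.example.svc/v1/models/toxicity:predict  # hypothetical
        safe_labels: [0]        # class 0 = non-toxic
        threshold: 0.5
        risk_type: toxicity
      jailbreak:
        endpoint: http://jailbreak-detector.example.svc/v1/models/jailbreak:predict  # hypothetical
        safe_labels: [0]
        threshold: 0.5
        risk_type: jailbreak
```

Note how the detector name comes from the dictionary key, as agreed above, rather than from a `detector_type` field.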

```yaml
model:
  modelFormat:
    name: huggingface
  image: kserve/huggingfaceserver:v0.13.0
```
Collaborator

@m-misiura m-misiura Oct 8, 2025

is this a Docker image? if so, is there any chance of using an image from quay?

Author

@srikartondapu srikartondapu Oct 13, 2025

Changed everything to use image from Quay
Mirrored to quay.io/rh-ee-stondapu/huggingfaceserver:v0.14.0. All detectors now use Quay to avoid Docker Hub rate limits. Also upgraded from v0.13.0 to v0.14.0.

Deploy the three KServe detectors:

```bash
oc apply -f toxicity-detector.yaml
```

Collaborator

@m-misiura m-misiura Oct 8, 2025

I think this configuration file depends on some kind of ServingRuntime which is missing; when following the docs as described I get `No runtime found to support specified framework/version Reason: NoSupportingRuntime`

This configuration might be of some use

```yaml
---
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: huggingface-hap-runtime
  annotations:
    openshift.io/display-name: HuggingFace HAP ServingRuntime for KServe
    opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
  labels:
    opendatahub.io/dashboard: 'true'
spec:
  multiModel: false
  supportedModelFormats:
    - autoSelect: true
      name: huggingface
  containers:
    - name: kserve-container
      image: quay.io/rh-ee-mmisiura/kserve/huggingfaceserver:latest-gpu
      args:
        - --model_name=hap
        - --model_id=ibm-granite/granite-guardian-hap-38m
        - --return_probabilities
        - --backend=huggingface
      ports:
        - containerPort: 8080
          protocol: TCP
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-hap-detector
  annotations:
    security.opendatahub.io/enable-auth: 'true'
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      runtime: huggingface-hap-runtime
      resources:
        limits:
          cpu: '1'
          memory: 2Gi
          nvidia.com/gpu: '1'
        requests:
          cpu: '1'
          memory: 2Gi
          nvidia.com/gpu: '1'
---
```

Author

Added HuggingFace ServingRuntime deployment as Step 1 in documentation. This will resolve this issue when trying to replicate the results

```yaml
configMap:
  name: nemo-production-config
```

### vLLM Deployment (LLM Inference)
Collaborator

why is the generation model served using the deployment instead of serving runtime + isvc pattern?

Author

I was having problems with latest versions of vllm which led me to use Deployment. Now, I have migrated to InferenceService using Red Hat's official image (registry.redhat.io/rhaiis/vllm-cuda-rhel9:3) with PVC-based model storage. Addresses both Docker Hub and consistency concerns.

```yaml
node.kubernetes.io/instance-type: g4dn.2xlarge
containers:
- name: vllm
  image: vllm/vllm-openai:v0.4.2
```
Collaborator

@m-misiura m-misiura Oct 8, 2025

it would be good to replace it with an image not from a docker registry; in addition this is an old image and it would be worthwhile upgrading to a newer vllm version

Author

using latest Red Hat's official image (registry.redhat.io/rhaiis/vllm-cuda-rhel9:3)

```yaml
cpu: "4"
memory: "8Gi"
```

### NeMo Server Deployment
Collaborator

I think the NeMo Server deployment is missing a Service definition, so `oc expose service nemo-guardrails-server -n kserve-hfdetector` will not work later on

Author

Added Service definition to nemo-deployment.yml. Now deploys Deployment + Service together.
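A minimal Service along these lines would make the later `oc expose` step work (a sketch: the selector label and port values are assumptions and must match the container port declared in the Deployment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nemo-guardrails-server
spec:
  selector:
    app: nemo-guardrails-server   # assumed label; must match the Deployment's pod template
  ports:
    - port: 8000                  # assumed; match the NeMo server's container port
      targetPort: 8000
      protocol: TCP
```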

Expose the service externally:

```bash
oc expose service nemo-guardrails-server -n kserve-hfdetector
```

Collaborator

this will only work if there is a Service definition in the nemo-deployment.yaml

Author

Added Service definition to nemo-deployment.yml. Now deploys Deployment + Service together.


## Testing

Replace YOUR_ROUTE with your NeMo Guardrails route URL.
Collaborator

consider adding this command to get the route
```bash
YOUR_ROUTE="http://$(oc get routes nemo-guardrails-server -o jsonpath='{.spec.host}')"
```

Author

@srikartondapu srikartondapu Oct 13, 2025

Added route extraction command and updated all curl examples to use $YOUR_ROUTE variable for easy testing.

### Test 1: Safe Content (Should Pass)

```bash
curl -X POST http://YOUR_ROUTE/v1/chat/completions \
```

Collaborator

and if you were to store the route in the variable, you could do:
```bash
curl -X POST $YOUR_ROUTE/v1/chat/completions \
```

Author

Added route extraction command and updated all curl examples to use $YOUR_ROUTE variable for easy testing.

```python
if len(unique_labels) == 1:
    return True, 0.0, "SAFE"

background_labels = {0, max(unique_labels)} if max(unique_labels) > 10 else {0}
```
Collaborator

I think this could be fragile; consider having this specification explicit

Author

Removed the fragile background_labels guessing logic entirely. All safe classes are now explicitly configured via the safe_labels field in the detector configuration.

```python
if isinstance(prediction, dict):
    score = prediction.get("score", 0.0)
    label = prediction.get("label", "UNKNOWN")
    is_safe = score < 0.5 or label.lower() in ["safe", "non_toxic", "label_0"]
```
Collaborator

this is unlikely to work with models using different label names

Author

Removed hardcoded label name checks. Now uses the configurable safe_labels list which supports any label naming convention (both string labels and integer class IDs).

```python
    response_data, threshold, detector_type, risk_type, invert_logic
)

except Exception as e:
```
Collaborator

would this treat e.g. network timeouts the same as content violations? if so, this might need to be reconsidered

Author

System errors now use risk_type: 'system_error', score: 0.0, and separate unavailable_detectors tracking. Clearly distinguished from content violations in both code and user messages.
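The distinction can be sketched as follows. The field names follow this thread (`risk_type: "system_error"`, `score: 0.0`, `unavailable_detectors`), but the result shapes and the `aggregate` helper are illustrative assumptions, not the PR's actual code:

```python
# Hypothetical sketch: keep system errors out of the blocking list so an
# unreachable detector is reported separately rather than as a violation.

def aggregate(results):
    # content violations: the detector answered and flagged the input
    blocking = [
        r for r in results
        if r.get("risk_type") != "system_error" and not r.get("allowed", True)
    ]
    # system errors: timeouts, connection failures, etc. -- tracked separately
    unavailable = [r["detector"] for r in results
                   if r.get("risk_type") == "system_error"]
    return {
        "allowed": not blocking,
        "blocking": blocking,
        "unavailable_detectors": unavailable,
    }
```

Whether a request with unavailable detectors should ultimately be allowed (fail-open) or blocked (fail-closed) is a policy choice the sketch deliberately leaves out.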

```python
return True, 0.0, "SAFE"

# Structured entity dicts
if isinstance(prediction[0], dict):
```
Collaborator

I have concerns this might not deal with the kserve probability distributions in the token classification case
For example, this is an output from the pii detector

```bash
curl -X POST http://localhost:8080/v1/models/pii-detector:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": ["My SSN is 123-45-6789 and my email is [email protected]"]}'
```
> > {"predictions":[[{"0":0.0012,"1":0.0001,"2":0.0001,"3":0.0006,"4":0.0002,"5":0.0003,"6":0.0002,"7":0.0004,"8":0.0007,"9":0.0001,"10":0.9858,"11":0.0001,"12":0.0003,"13":0.0072,"14":0.0004,"15":0.0003,"16":0.0002,"17":0.002},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.9995,"11":0.0,"12":0.0,"13":0.0003,"14":0.0,"15":0.0,"16":0.0,"17":0.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.9996,"11":0.0,"12":0.0,"13":0.0003,"14":0.0,"15":0.0,"16":0.0,"17":0.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.9996,"11":0.0,"12":0.0,"13":0.0003,"14":0.0,"15":0.0,"16":0.0,"17":0.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.9995,"11":0.0,"12":0.0,"13":0.0003,"14":0.0,"15":0.0,"16":0.0,"17":0.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0
.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":0.9999},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":0.9999},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0001,"16":0.0,"17":0.9999},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0003,"1":0.0007,"2":0.0001,"3":0.0002,"4":0.0001,"5":0.0001,"6":0.0005,"7":0.0005,"8":0.0001,"9":0.0001,"10":0.0004,"11":0.0001,"12":0.0004,"13":0.0001,"14":0.0004,"15":0.0056,"16":0.0002,"17":0.9902}]]}$     

Collaborator

the same concern might be the case in the context of sequence classification

```bash
curl -X POST http://localhost:8080/v1/models/jailbreak-detector:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": ["How do I bypass security controls?"]}'
```

which returns:

```
{"predictions":[{"0":0.994,"1":0.006}]}
```

Author

Added full probability distribution support for both token and sequence classification. The parser now handles per-token probability dicts (finds max class per token) and sequence probability dicts (checks each class against safe_labels with threshold). Also fixed nested list unwrapping for token classification responses.
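The two accepted shapes can be sketched roughly like this. The function names, threshold default, and boolean return convention are illustrative assumptions, not the PR's actual code; the input formats follow the KServe examples above:

```python
def sequence_is_safe(dist, safe_labels, threshold=0.5):
    """Sequence classification: one class->probability dict per input,
    e.g. {"0": 0.994, "1": 0.006}. Flag if any non-safe class meets
    the threshold."""
    return not any(
        prob >= threshold
        for class_id, prob in ((int(k), v) for k, v in dist.items())
        if class_id not in safe_labels
    )

def tokens_are_safe(token_dists, safe_labels):
    """Token classification: one probability dict per token. Flag if any
    token's argmax class is not in safe_labels."""
    for dist in token_dists:
        top_class = int(max(dist, key=dist.get))
        if top_class not in safe_labels:
            return False
    return True
```

With the jailbreak example above (`{"0": 0.994, "1": 0.006}` and `safe_labels: [0]`), the unsafe class 1 stays well below the threshold, so the input would pass.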

Collaborator

@m-misiura m-misiura left a comment

Before merging:

  1. consider tidying up openshift manifests based on my comments (there are missing resource definitions)
  2. actions.py would benefit from being more aligned with the output of KServe predictions and from being more flexible about what is a safe label and what is an unsafe label

…ssues

- Implemented configurable safe_labels for flexible detection logic

- Removed hardcoded assumptions (detector_type, invert_logic, background labels)

- Distinguished system errors from content violations (risk_type: system_error)

- Updated config schema and removed redundant fields

- Added Service definition to NeMo deployment

- Migrated to Quay images to avoid Docker Hub rate limiting

- Updated documentation with complete deployment guide

**Components:**
- **NeMo Guardrails** (CPU) - Orchestration and flow control
- **KServe Detectors** (CPU) - Toxicity, jailbreak, PII, HAP detection
Collaborator

@m-misiura m-misiura Oct 14, 2025

based on this, a user might get the impression that this can only work with toxicity, jailbreak, PII, and HAP detectors; it would be prudent to say that these are for demo purposes only

Author

Fixed - clarified that the listed detectors are examples only. Updated text to indicate the integration works with any HuggingFace sequence or token classification model, not just the specific detectors shown in the guide.

Collaborator

@m-misiura m-misiura left a comment

Thanks for addressing some of the earlier feedback. I think it would be a good idea to:

  1. consider the KServe API and decide what exactly should be supported (in my view, if a KServe predictor is not set up to return labels and their scores, it should probably not be supported, as it won't be possible to readily support e.g. custom thresholds)

My recommended action would be to review the API contracts and change the code in accordance with this. Useful questions to consider here would be:

  • do I want to support both v1 and v2 inference protocols?
  • if not, why not?
  • what is the actual request payload and response payload that needs to be supported?

At some stage you might want to consider other APIs, such as the Detectors API, but at first focus on only one of them. If you would like to focus on the Detectors API instead of the KServe API, that's fine, but it would require a greater overhaul

  2. based on step 1, tidy up actions.py as it seems to be overgeneralised in some cases and potentially contains redundant code; making it simpler might also make it more maintainable in the future
  3. address typos in the docs and make them more generalisable (i.e. remove namespace references from the manifests, fix typo(s) in the ServingRuntime configuration, and remove unnecessary image references from the ISVCs)

… API contract

- Require --return_probabilities flag for all detectors (enable threshold-based filtering)
- Add softmax transformation to handle both logits and probabilities
- Restrict to KServe V1 protocol only (remove V2 support)
- Support only 2 response formats: sequence classification and token classification dicts
- Remove 5 unsupported format handlers (integers, entities, single values, booleans, named labels)
- Change safe_labels type from List[Union[int, str]] to List[int]
- Add --backend=huggingface flag to ServingRuntime
- Standardize all detector YAMLs to use args format
- Remove unnecessary image references from InferenceServices
- Add API contract documentation explaining V1-only decision
- Update detection flow to clarify HTTP/HTTPS support
- Remove namespace hardcoding from all examples
```python
if isinstance(prediction, dict) and all(
    str(k).isdigit() or isinstance(k, int) for k in prediction.keys()
):
    # Convert logits to probabilities if needed
```
Collaborator

@m-misiura m-misiura Oct 16, 2025

I am not sure if this is really needed since when --return_probabilities is set, KServe already applies softmax before returning the response; see e.g. this

Subsequently, it should be fine to remove the softmax function entirely (L21-26) and the check on (L72-75)

Author

Re: softmax - Keeping it for now. KServe v0.14/v0.15 return logits despite --return_probabilities flag (bug fixed in unreleased v0.16). The softmax detection logic handles both logits and probabilities defensively, ensuring compatibility across KServe versions. Will remove once v0.16.0 stable is released.
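A defensive conversion along those lines might look like this (a sketch only, under the assumption that any distribution containing negatives or not summing to roughly 1 is logits; the function name is illustrative):

```python
import math

def ensure_probabilities(dist):
    """Defensively convert a class->score dict to probabilities.
    If the values already look like a probability distribution
    (non-negative, summing to ~1), pass them through unchanged;
    otherwise treat them as logits and apply softmax."""
    values = list(dist.values())
    if all(v >= 0.0 for v in values) and abs(sum(values) - 1.0) < 1e-3:
        return dict(dist)
    m = max(values)  # subtract the max for numerical stability
    exps = {k: math.exp(v - m) for k, v in dist.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}
```

This makes the parser agnostic to whether the KServe version honoured `--return_probabilities`, at the cost of a heuristic that could misread an unusual logit vector that happens to look like probabilities.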


```python
# Sequence classification - probability/logit distributions
# Format: {"0": 0.994, "1": 0.006} or {"0": 1.12, "1": -1.53}
if isinstance(prediction, dict) and all(
```
Collaborator

is this check actually useful? after JSON deserialisation, keys will always be strings?

Author

Fixed - removed the redundant isinstance(k, int) check. JSON deserialization always produces string keys.

```python
detected_classes = []

for class_id_key, prob in prediction.items():
    class_id = int(class_id_key) if isinstance(class_id_key, str) else class_id_key
```
Collaborator

@m-misiura m-misiura Oct 16, 2025

if keys are always strings, then the else is redundant

Author

Fixed - simplified to class_id = int(class_id_key) since JSON keys are always strings. Removed the redundant else clause.

```python
if isinstance(user_message, dict):
    user_message = user_message.get("content", "")

if not user_message.strip():
```
Collaborator

does this mean empty messages are fast-forwarded to the LLM? if so, what is the justification for not passing such messages through the detectors?

Author

Fixed - removed empty message bypass. Empty messages now go through the full detector pipeline before reaching the LLM.

```python
config = context.get("config")

if not config:
    return {"allowed": False, "reason": "No configuration"}
```
Collaborator

this uses reason, but e.g. on L359, it uses "error" -- I think there could be an API inconsistency, which should be standardised

Author

Fixed - standardized all error returns to use "reason" field for API consistency.

```python
content_blocks = []
allowing = []

for i, result in enumerate(results):
```
Collaborator

what would be the behaviour of this function in case the ordering assumption breaks (i.e. dict keys iteration no longer matches execution order)?

Author

Fixed - replaced index-based matching with explicit tuples (detector_name, task). Detector names are now retrieved from the tuple rather than assuming dict key order matches result order, eliminating the fragile ordering dependency.
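The tuple-based pairing can be sketched like this (`fake_detector_call` and `run_detectors` are illustrative stand-ins for the real KServe HTTP call and the PR's action, not its actual code):

```python
import asyncio

async def fake_detector_call(name, text):
    # stand-in for the real KServe HTTP call
    return {"detector": name, "blocked": "attack" in text}

async def run_detectors(detector_configs, text):
    """Pair each detector name with its coroutine explicitly, so results
    can never be mismatched even if dict iteration order and result order
    were to diverge somewhere downstream."""
    tasks = [(name, fake_detector_call(name, text)) for name in detector_configs]
    # asyncio.gather preserves argument order, and each result is re-joined
    # to its name through the tuple rather than by positional assumption
    results = await asyncio.gather(*(coro for _, coro in tasks))
    return {name: result for (name, _), result in zip(tasks, results)}
```

Usage: `asyncio.run(run_detectors({"toxicity": {}, "jailbreak": {}}, "some text"))` returns a name-keyed dict regardless of which detector finishes first.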

@@ -0,0 +1,432 @@
"""
Collaborator

@m-misiura m-misiura Oct 16, 2025

Generally, I think this would benefit from adding Pydantic response models to outline the API contract; this is not currently the case as the file uses untyped dicts. Using Pydantic response models would make the API easier to discover, e.g. which fields are required and which are optional

additionally, you may want to consider adding some kind of Pydantic validation to better handle e.g. empty messages like on L239

Author

Fixed - added Pydantic models (DetectorResult and AggregatedDetectorResult) with explicit Field definitions to make the API contract discoverable. All fields are clearly marked as required or optional with defaults.
Re: Pydantic validation for empty messages - didn't add this as we're now passing empty messages through the detector pipeline (as discussed), rather than validating/blocking them at the input level.
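The shape of that contract can be sketched as follows. The model names (`DetectorResult`, `AggregatedDetectorResult`) and some field names come from this thread, but the exact fields and defaults are assumptions; stdlib dataclasses stand in here for the Pydantic models the PR actually uses, which add validation on top of this structure:

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional

@dataclass
class DetectorResult:
    """Illustrative per-detector result shape (the PR uses Pydantic Fields)."""
    allowed: bool
    detector: str
    score: float = 0.0
    risk_type: Optional[str] = None
    reason: Optional[str] = None

@dataclass
class AggregatedDetectorResult:
    """Illustrative aggregate across all configured detectors."""
    allowed: bool
    blocking: List[dict] = field(default_factory=list)
    unavailable_detectors: List[str] = field(default_factory=list)
```

As the author notes below, the validated objects are converted back to plain dicts (Pydantic's `.dict()`, or `asdict` here) before being handed to the Colang runtime.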

```python
}


async def _call_kserve_endpoint(endpoint: str, text: str, timeout: int) -> Dict[str, Any]:
```
Collaborator

does this create async session for every request? is this correct?

Author

Fixed - now using a shared aiohttp.ClientSession created at module load and reused across all detector calls. This eliminates per-request session overhead while maintaining per-request timeout flexibility.

```python
if not kserve_detectors:
    return {"allowed": True, "reason": "No detectors configured"}

log.info(f"Running {len(kserve_detectors)} detectors: {list(kserve_detectors.keys())}")
```
Collaborator

how does synchronous logging work in the async context?

Author

Re: sync logging - Since Python's logging module uses buffered I/O and given our logging frequency is low (per user message), the event loop impact is negligible. I have used standard logging to stay consistent with NeMo Guardrails' own async action implementations.

Core Improvements:
- Add Pydantic response models for clear API contracts and type validation
- Implement shared aiohttp session to eliminate per-request overhead
- Remove empty message bypass to ensure all input goes through detector pipeline

Code Quality Fixes:
- Standardize all error responses to use 'reason' field
- Fix fragile dict ordering with explicit task-name tupling
- Remove redundant JSON key type checks (keys are always strings post-deserialization)
- Simplify class ID conversions by removing unreachable branches

Note: Softmax logic retained for KServe v0.14/v0.15 compatibility (bug fixed in v0.16)
```python
    return f"Input blocked by {det['detector']} detector (risk: {det['risk_type']}, score: {det['score']:.2f})"

# Multiple detectors blocked
detector_names = [d['detector'] for d in blocking]
```
Collaborator

dict access might be unsafe as you might lose Pydantic validation?

Author

Fixed. Dict access is intentional and safe. We use Pydantic models internally for validation during object creation (DetectorResult(...)), then convert to plain dicts via .dict() because NeMo's Colang runtime requires plain data structures and cannot access Pydantic object attributes. Validation occurs before conversion, so no safety is lost. This pattern matches other NeMo actions.

```python
    config: Optional[Any] = None,
    detector_type: str = "toxicity",
    **kwargs
) -> Dict[str, Any]:
```
Collaborator

are these type hints reflective of the Pydantic data models?

Author

Fixed. Type hints are accurate. Functions return Dict[str, Any] which correctly reflects the return value after .dict() conversion. We create Pydantic objects internally for validation, then convert to plain dicts for Colang compatibility. The type hints truthfully describe what gets returned (dict), not the internal Pydantic models.

```python
DEFAULT_TIMEOUT = 30

# Shared HTTP session for all detector calls
_http_session = aiohttp.ClientSession()
```
Collaborator

wouldn't this be considered a resource leak as the session would be created at module import time? additionally, this connection seems to remain open indefinitely?

Author

Fixed. Changed from module-level _http_session = aiohttp.ClientSession() to lazy initialization with asyncio lock inside _call_kserve_endpoint() to avoid import-time side effects and resource leaks.

@@ -0,0 +1,444 @@
"""
Generic KServe Detector Integration for NeMo Guardrails
Collaborator:

this name is not reflective of the fact that this is only compatible with a specific type of huggingface server -- this should be reflected throughout the code

Author:

Fixed. Updated file docstring and function comments from "Generic KServe" to "KServe HuggingFace Detector Integration" to reflect that this integration requires KServe HuggingFace runtime with --return_probabilities flag.

kind: PersistentVolumeClaim
metadata:
name: phi3-model-pvc
namespace: <your-namespace>
Collaborator:

I would remove these references altogether

Author:

Fixed. Removed all namespace: references from YAML examples in documentation.

value: production
- name: OPENAI_API_KEY
value: sk-dummy-key
- name: SAFE_LABELS
Collaborator:

why are safe labels specified in both configmap and deployment? it would be much better to have it only in the configmap

Author:

Fixed. Removed SAFE_LABELS environment variable from Deployment YAML. Safe labels are now configured only in ConfigMap per-detector settings.
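As a hedged illustration of the ConfigMap-only approach (the detector name, endpoint, and label values below are hypothetical), a per-detector entry carrying its own safe labels might look like:

```yaml
kserve_detectors:
  hap-detector:
    endpoint: http://hap-detector.example.svc/v1/models/hap-detector:predict
    safe_labels: ["LABEL_0", "safe"]
    timeout: 30
```

Keeping the labels here, rather than in a Deployment env var, means a label change is a ConfigMap update with no pod redeploy.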

default=30,
description="HTTP request timeout in seconds"
)
risk_type: Optional[str] = Field(
Collaborator:

I am not sure how valuable this field is; perhaps, to simplify the config, it could be removed

Author:

Fixed. Removed risk_type field from both KServeDetectorConfig and DetectorResult. The field was redundant since detector name already identifies the risk category, and system errors are distinguished by label="ERROR".

claimName: phi3-model-pvc
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
Collaborator:

I think it would be useful to show how to configure this with authenticated services too, here is a potential example

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: hap-detector
  annotations:
    serving.knative.openshift.io/enablePassthrough: "true"
    sidecar.istio.io/inject: "true"
    sidecar.istio.io/rewriteAppHTTPProbers: "true"
    serving.kserve.io/deploymentMode: RawDeployment
    security.opendatahub.io/enable-auth: "true"
spec:
  predictor:
    minReplicas: 1
    maxReplicas: 2
    model:
      modelFormat:
        name: huggingface
      args:
        - --model_name=hap-detector
        - --model_id=ibm-granite/granite-guardian-hap-38m
        - --task=sequence_classification
      resources:
        requests:
          cpu: "1"
          memory: "2Gi"
        limits:
          cpu: "2"
          memory: "4Gi"

Author:

Fixed. Added Authentication section to documentation with InferenceService auth annotations example and NeMo token configuration. Also added auth annotations to all detector deployment YAMLs.

persistentVolumeClaim:
claimName: phi3-model-pvc
nodeSelector:
node.kubernetes.io/instance-type: g4dn.2xlarge
Collaborator:

I would remove any instances to node instance-type in these manifests

Author:

Fixed. Removed all nodeSelector blocks referencing node instance types from documentation YAML examples.

- name: nemo-guardrails
image: quay.io/rh-ee-stondapu/trustyai-nemo:latest
imagePullPolicy: Always
env:
Collaborator (@m-misiura, Oct 21, 2025):

it would be useful to also show how to add authentication to e.g. detectors in here

Author:

Fixed. Added KSERVE_API_KEY environment variable example to NeMo deployment documentation and showed per-detector api_key configuration in ConfigMap examples.


api_key = os.getenv("KSERVE_API_KEY")
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
Collaborator:

should detector endpoints only accept the same bearer token for authentication?

Author:

Fixed. Added api_key: Optional[str] field to KServeDetectorConfig for per-detector authentication. Updated _call_kserve_endpoint() to use detector-specific token with fallback to global KSERVE_API_KEY env var. Supports per-detector, global, and unauthenticated patterns.
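The precedence described in this reply (per-detector key, then global `KSERVE_API_KEY`, else unauthenticated) can be sketched as below. `build_auth_headers` is an illustrative helper name, not the PR's actual function; the env var name matches the thread.

```python
import os
from typing import Dict, Optional


def build_auth_headers(detector_api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, resolving the bearer token by precedence:
    per-detector key -> global KSERVE_API_KEY env var -> no auth."""
    headers = {"Content-Type": "application/json"}
    token = detector_api_key or os.getenv("KSERVE_API_KEY")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

A mix of authenticated and unauthenticated detectors then works without any per-call branching in the action code.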

Collaborator (@m-misiura) left a review comment:

This is getting closer!

Next, if you can, please focus on:

  • updating the user guide to demonstrate how to set this up with authenticated services if possible, as well as removing references to namespaces and node instance-types in the manifests
  • addressing comments pertaining to actions.py (e.g. inconsistent type hints, shared HTTP session resource leak) and config.py (e.g. the necessity of certain fields)
  • adding some unit and perhaps integration tests

- Removed risk_type field (redundant with detector name)
- Implemented per-detector authentication with global fallback
- Added lazy session initialization with lock (fix resource leak)
- Fixed naming to reflect HuggingFace-only compatibility
- Added authentication documentation with ODH example
- Removed namespace and node-type references from docs
- Added 19 unit/integration tests (all passing)
- Added auth annotations to detector deployment YAMLs
@m-misiura force-pushed the develop branch 2 times, most recently from b7d6750 to 603bf72 on November 10, 2025 16:54