Conversation

@srikartondapu srikartondapu commented Oct 7, 2025

Summary

  • Add generic KServe detector actions supporting any model format
  • Support sequence classification, token classification, and binary detectors
  • Enable parallel detector execution via dynamic registry
  • Add KServeDetectorConfig to rails config schema
  • Configuration-driven detector management (no code rebuilds required)

Description

This PR introduces a generic integration framework for KServe-hosted detection models in NeMo Guardrails. The implementation enables teams to add, remove, or configure safety detectors purely through ConfigMap updates - eliminating the need for code changes or container rebuilds.

Key capabilities:

  • Format-agnostic parsing: Automatically handles binary classification, sequence classification, and token classification responses
  • Parallel execution: All configured detectors run simultaneously for minimal latency
  • Dynamic registry: Detectors are discovered from the kserve_detectors configuration at runtime
  • Production-validated: Tested on OpenShift with Granite Guardian (toxicity), jailbreak classifier, and Piiranha (PII detection)

Implementation:

  • Added nemoguardrails/library/kserve_detector/actions.py with generic detector actions
  • Modified nemoguardrails/rails/llm/config.py to support KServeDetectorConfig class and detector registry

This addresses the operational overhead of deploying custom detectors by making detector management configuration-driven rather than code-driven.

Related Issue(s)

Addresses the need for simplified custom detector integration without deployment overhead.

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable (will add README in PR comments).
  • I've added tests if applicable.
  • @RobGeada, @m-misiura

m-misiura and others added 12 commits July 23, 2025 14:27
…lity/GHAction

CI: Add publishing action for server image
…lity/nemo-server

added a Containerfile for the nemo server
…lity/containerfile-server

Ensure providers can be switched on/off at build time
- Add generic KServe detector actions supporting any model format
- Support sequence classification, token classification, and binary detectors
- Enable parallel detector execution via dynamic registry
- Add KServeDetectorConfig to rails config schema
- Configuration-driven detector management (no code rebuilds required)
```python
default=30,
description="HTTP request timeout in seconds"
)
detector_type: str = Field(
```

Can we remove the detector_type and risk_type fields, and instead just use the key of the kserve_detectors dict instead?

Author

Removed detector_type field - detector name now comes from dictionary key. Kept risk_type as optional since it's needed to distinguish system errors from content violations and allows semantic risk categorization.

- `timeout`: HTTP timeout in seconds
- `detector_type`: Detector identifier
- `risk_type`: Risk classification type
- `invert_logic`: Score inversion for reversed semantics

I think we'll need some more complex/more flexible logic for defining what is and is not a flagged response.

So, for example. we could have:

  1. Regressors <- either 0 or 1 could be the safe direction
  2. Boolean Classifiers <- either True or False could be safe
  3. Multi-class classifiers
    • No predicted classes -> no detection, some predicted classes -> detection
    • Class $SAFE is predicted -> no detection, any class $NOTSAFE -> detection

So I think we need to have some way of exposing this to the user. You can check out how we handled it in the existing guardrails detectors framework: https://github.com/trustyai-explainability/guardrails-detectors/blob/main/detectors/huggingface/detector.py#L228

Collaborator

I agree with @RobGeada; we should also avoid hardcoding safe labels like it is the case now

Author

Implemented safe_labels approach. Checks both integer indices and string labels, supports all classifier types (binary, multi-class, regressors). All hardcoded assumptions removed.
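For reference, a configuration along these lines could express the per-detector safe labels (a sketch only: `kserve_detectors`, `safe_labels`, `risk_type`, and `threshold` appear in this thread, but the endpoint URLs and exact nesting here are illustrative assumptions):

```yaml
rails:
  config:
    kserve_detectors:
      toxicity:
        endpoint: http://toxicity-detector.example.svc/v1/models/toxicity:predict  # hypothetical
        safe_labels: [0]        # class 0 = non-toxic
        threshold: 0.5
        risk_type: toxicity
      jailbreak:
        endpoint: http://jailbreak-detector.example.svc/v1/models/jailbreak:predict  # hypothetical
        safe_labels: [0]
        threshold: 0.5
        risk_type: jailbreak
```

Note how the detector name comes from the dictionary key, as agreed above, rather than from a `detector_type` field.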

```yaml
model:
  modelFormat:
    name: huggingface
  image: kserve/huggingfaceserver:v0.13.0
```
Collaborator

@m-misiura m-misiura Oct 8, 2025

is this a Docker image? if so, is there any chance of using an image from quay?

Author

@srikartondapu srikartondapu Oct 13, 2025

Changed everything to use image from Quay
Mirrored to quay.io/rh-ee-stondapu/huggingfaceserver:v0.14.0. All detectors now use Quay to avoid Docker Hub rate limits. Also upgraded from v0.13.0 to v0.14.0.

Deploy the three KServe detectors:

```bash
oc apply -f toxicity-detector.yaml
```

Collaborator

@m-misiura m-misiura Oct 8, 2025

I think this configuration file depends on some kind of ServingRuntime which is missing; when following the docs as described I get `No runtime found to support specified framework/version Reason: NoSupportingRuntime`

This configuration might be of some use

```yaml
---
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: huggingface-hap-runtime
  annotations:
    openshift.io/display-name: HuggingFace HAP ServingRuntime for KServe
    opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
  labels:
    opendatahub.io/dashboard: 'true'
spec:
  multiModel: false
  supportedModelFormats:
    - autoSelect: true
      name: huggingface
  containers:
    - name: kserve-container
      image: quay.io/rh-ee-mmisiura/kserve/huggingfaceserver:latest-gpu
      args:
        - --model_name=hap
        - --model_id=ibm-granite/granite-guardian-hap-38m
        - --return_probabilities
        - --backend=huggingface
      ports:
        - containerPort: 8080
          protocol: TCP
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-hap-detector
  annotations:
    security.opendatahub.io/enable-auth: 'true'
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      runtime: huggingface-hap-runtime
      resources:
        limits:
          cpu: '1'
          memory: 2Gi
          nvidia.com/gpu: '1'
        requests:
          cpu: '1'
          memory: 2Gi
          nvidia.com/gpu: '1'
---
```

Author

Added HuggingFace ServingRuntime deployment as Step 1 in documentation. This will resolve this issue when trying to replicate the results

```yaml
configMap:
  name: nemo-production-config
```

### vLLM Deployment (LLM Inference)
Collaborator

why is the generation model served using the deployment instead of serving runtime + isvc pattern?

Author

I was having problems with latest versions of vllm which led me to use Deployment. Now, I have migrated to InferenceService using Red Hat's official image (registry.redhat.io/rhaiis/vllm-cuda-rhel9:3) with PVC-based model storage. Addresses both Docker Hub and consistency concerns.

```yaml
node.kubernetes.io/instance-type: g4dn.2xlarge
containers:
- name: vllm
  image: vllm/vllm-openai:v0.4.2
```
Collaborator

@m-misiura m-misiura Oct 8, 2025

it would be good to replace it with an image not from a docker registry; in addition this is an old image and it would be worthwhile upgrading to a newer vllm version

Author

using latest Red Hat's official image (registry.redhat.io/rhaiis/vllm-cuda-rhel9:3)

```yaml
cpu: "4"
memory: "8Gi"
```

### NeMo Server Deployment
Collaborator

I think the NeMo Server deployment is missing a Service definition, so `oc expose service nemo-guardrails-server -n kserve-hfdetector` will not work later on

Author

Added Service definition to nemo-deployment.yml. Now deploys Deployment + Service together.
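A minimal Service along these lines would make the later `oc expose` step work (a sketch: the selector label and port values are assumptions and must match the container port declared in the Deployment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nemo-guardrails-server
spec:
  selector:
    app: nemo-guardrails-server   # assumed label; must match the Deployment's pod template
  ports:
    - port: 8000                  # assumed; match the NeMo server's container port
      targetPort: 8000
      protocol: TCP
```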

Expose the service externally:

```bash
oc expose service nemo-guardrails-server -n kserve-hfdetector
```

Collaborator

this will only work if there is a Service definition in the nemo-deployment.yaml

Author

Added Service definition to nemo-deployment.yml. Now deploys Deployment + Service together.


## Testing

Replace YOUR_ROUTE with your NeMo Guardrails route URL.
Collaborator

consider adding this command to get the route
```bash
YOUR_ROUTE="http://$(oc get routes nemo-guardrails-server -o jsonpath='{.spec.host}')"
```

Author

@srikartondapu srikartondapu Oct 13, 2025

Added route extraction command and updated all curl examples to use $YOUR_ROUTE variable for easy testing.

### Test 1: Safe Content (Should Pass)

```bash
curl -X POST http://YOUR_ROUTE/v1/chat/completions \
```

Collaborator

and if you were to store the route in the variable, you could do:
```bash
curl -X POST $YOUR_ROUTE/v1/chat/completions \
```

Author

Added route extraction command and updated all curl examples to use $YOUR_ROUTE variable for easy testing.

```python
if len(unique_labels) == 1:
    return True, 0.0, "SAFE"

background_labels = {0, max(unique_labels)} if max(unique_labels) > 10 else {0}
```
Collaborator

I think this could be fragile; consider having this specification explicit

Author

Removed the fragile background_labels guessing logic entirely. All safe classes are now explicitly configured via the safe_labels field in the detector configuration.

```python
if isinstance(prediction, dict):
    score = prediction.get("score", 0.0)
    label = prediction.get("label", "UNKNOWN")
    is_safe = score < 0.5 or label.lower() in ["safe", "non_toxic", "label_0"]
```
Collaborator

this is unlikely to work with models using different label names

Author

Removed hardcoded label name checks. Now uses the configurable safe_labels list which supports any label naming convention (both string labels and integer class IDs).

```python
    response_data, threshold, detector_type, risk_type, invert_logic
)

except Exception as e:
```
Collaborator

would this treat e.g. network timeouts the same as content violations? if so, this might need to be reconsidered

Author

System errors now use risk_type: 'system_error', score: 0.0, and separate unavailable_detectors tracking. Clearly distinguished from content violations in both code and user messages.
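The distinction can be sketched as follows. The field names follow this thread (`risk_type: "system_error"`, `score: 0.0`, `unavailable_detectors`), but the result shapes and the `aggregate` helper are illustrative assumptions, not the PR's actual code:

```python
# Hypothetical sketch: keep system errors out of the blocking list so an
# unreachable detector is reported separately rather than as a violation.

def aggregate(results):
    # content violations: the detector answered and flagged the input
    blocking = [
        r for r in results
        if r.get("risk_type") != "system_error" and not r.get("allowed", True)
    ]
    # system errors: timeouts, connection failures, etc. -- tracked separately
    unavailable = [r["detector"] for r in results
                   if r.get("risk_type") == "system_error"]
    return {
        "allowed": not blocking,
        "blocking": blocking,
        "unavailable_detectors": unavailable,
    }
```

Whether a request with unavailable detectors should ultimately be allowed (fail-open) or blocked (fail-closed) is a policy choice the sketch deliberately leaves out.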

```python
return True, 0.0, "SAFE"

# Structured entity dicts
if isinstance(prediction[0], dict):
```
Collaborator

I have concerns this might not deal with the kserve probability distributions in the token classification case
For example, this is an output from the pii detector

```bash
curl -X POST http://localhost:8080/v1/models/pii-detector:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": ["My SSN is 123-45-6789 and my email is [email protected]"]}'
```
> > {"predictions":[[{"0":0.0012,"1":0.0001,"2":0.0001,"3":0.0006,"4":0.0002,"5":0.0003,"6":0.0002,"7":0.0004,"8":0.0007,"9":0.0001,"10":0.9858,"11":0.0001,"12":0.0003,"13":0.0072,"14":0.0004,"15":0.0003,"16":0.0002,"17":0.002},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.9995,"11":0.0,"12":0.0,"13":0.0003,"14":0.0,"15":0.0,"16":0.0,"17":0.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.9996,"11":0.0,"12":0.0,"13":0.0003,"14":0.0,"15":0.0,"16":0.0,"17":0.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.9996,"11":0.0,"12":0.0,"13":0.0003,"14":0.0,"15":0.0,"16":0.0,"17":0.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.9995,"11":0.0,"12":0.0,"13":0.0003,"14":0.0,"15":0.0,"16":0.0,"17":0.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0
.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":0.9999},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":0.9999},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0001,"16":0.0,"17":0.9999},{"0":0.0,"1":0.0,"2":0.0,"3":0.0,"4":0.0,"5":0.0,"6":0.0,"7":0.0,"8":0.0,"9":0.0,"10":0.0,"11":0.0,"12":0.0,"13":0.0,"14":0.0,"15":0.0,"16":0.0,"17":1.0},{"0":0.0003,"1":0.0007,"2":0.0001,"3":0.0002,"4":0.0001,"5":0.0001,"6":0.0005,"7":0.0005,"8":0.0001,"9":0.0001,"10":0.0004,"11":0.0001,"12":0.0004,"13":0.0001,"14":0.0004,"15":0.0056,"16":0.0002,"17":0.9902}]]}$     

Collaborator

the same concern might be the case in the context of sequence classification

```bash
curl -X POST http://localhost:8080/v1/models/jailbreak-detector:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": ["How do I bypass security controls?"]}'
```

which returns:

```
{"predictions":[{"0":0.994,"1":0.006}]}
```

Author

Added full probability distribution support for both token and sequence classification. The parser now handles per-token probability dicts (finds max class per token) and sequence probability dicts (checks each class against safe_labels with threshold). Also fixed nested list unwrapping for token classification responses.
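The two accepted shapes can be sketched roughly like this. The function names, threshold default, and boolean return convention are illustrative assumptions, not the PR's actual code; the input formats follow the KServe examples above:

```python
def sequence_is_safe(dist, safe_labels, threshold=0.5):
    """Sequence classification: one class->probability dict per input,
    e.g. {"0": 0.994, "1": 0.006}. Flag if any non-safe class meets
    the threshold."""
    return not any(
        prob >= threshold
        for class_id, prob in ((int(k), v) for k, v in dist.items())
        if class_id not in safe_labels
    )

def tokens_are_safe(token_dists, safe_labels):
    """Token classification: one probability dict per token. Flag if any
    token's argmax class is not in safe_labels."""
    for dist in token_dists:
        top_class = int(max(dist, key=dist.get))
        if top_class not in safe_labels:
            return False
    return True
```

With the jailbreak example above (`{"0": 0.994, "1": 0.006}` and `safe_labels: [0]`), the unsafe class 1 stays well below the threshold, so the input would pass.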

Collaborator

@m-misiura m-misiura left a comment

Before merging:

  1. consider tidying up openshift manifests based on my comments (there are missing resource definitions)
  2. actions.py would benefit from being more aligned with the output of KServe predictions and from being more flexible about what is a safe label and what is an unsafe label

…ssues

- Implemented configurable safe_labels for flexible detection logic

- Removed hardcoded assumptions (detector_type, invert_logic, background labels)

- Distinguished system errors from content violations (risk_type: system_error)

- Updated config schema and removed redundant fields

- Added Service definition to NeMo deployment

- Migrated to Quay images to avoid Docker Hub rate limiting

- Updated documentation with complete deployment guide

**Components:**
- **NeMo Guardrails** (CPU) - Orchestration and flow control
- **KServe Detectors** (CPU) - Toxicity, jailbreak, PII, HAP detection
Collaborator

@m-misiura m-misiura Oct 14, 2025

based on this, a user might get the impression that this can only work with toxicity, jailbreak, PII, and HAP detectors; it would be prudent to say that these are for demo purposes only

Author

Fixed - clarified that the listed detectors are examples only. Updated text to indicate the integration works with any HuggingFace sequence or token classification model, not just the specific detectors shown in the guide.

Collaborator

@m-misiura m-misiura left a comment

Thanks for addressing some of the earlier feedback. I think it would be a good idea to:

  1. consider the KServe API and decide what exactly should be supported (in my view, if a KServe predictor is not set up to return labels and their scores, it should probably not be supported, as it won't be possible to readily support e.g. custom thresholds)

My recommended action would be to review the API contracts and change the code in accordance with this. Useful questions to consider here would be:

  • do I want to support both v1 and v2 inference protocols?
  • if not, why not?
  • what is the actual request payload and response payload that needs to be supported?

At some stage you might want to consider other APIs, such as the Detectors API, but at first focus on only one of them. If you would like to focus on the Detectors API instead of the KServe API, that's fine, but it would require a greater overhaul

  2. based on step 1, tidy up actions.py as it seems to be overgeneralised in some cases and potentially contains redundant code; making it simpler might also make it more maintainable in the future
  3. address typos in the docs and make them more generalisable (i.e. remove namespace references from the manifests, fix typo(s) in the ServingRuntime configuration, and remove unnecessary image references from the ISVCs)

… API contract

- Require --return_probabilities flag for all detectors (enable threshold-based filtering)
- Add softmax transformation to handle both logits and probabilities
- Restrict to KServe V1 protocol only (remove V2 support)
- Support only 2 response formats: sequence classification and token classification dicts
- Remove 5 unsupported format handlers (integers, entities, single values, booleans, named labels)
- Change safe_labels type from List[Union[int, str]] to List[int]
- Add --backend=huggingface flag to ServingRuntime
- Standardize all detector YAMLs to use args format
- Remove unnecessary image references from InferenceServices
- Add API contract documentation explaining V1-only decision
- Update detection flow to clarify HTTP/HTTPS support
- Remove namespace hardcoding from all examples
```python
if isinstance(prediction, dict) and all(
    str(k).isdigit() or isinstance(k, int) for k in prediction.keys()
):
    # Convert logits to probabilities if needed
```
Collaborator

@m-misiura m-misiura Oct 16, 2025

I am not sure if this is really needed since when --return_probabilities is set, KServe already applies softmax before returning the response; see e.g. this

Subsequently, it should be fine to remove the softmax function entirely (L21-26) and the check on (L72-75)

Author

Re: softmax - Keeping it for now. KServe v0.14/v0.15 return logits despite --return_probabilities flag (bug fixed in unreleased v0.16). The softmax detection logic handles both logits and probabilities defensively, ensuring compatibility across KServe versions. Will remove once v0.16.0 stable is released.
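A defensive conversion along those lines might look like this (a sketch only, under the assumption that any distribution containing negatives or not summing to roughly 1 is logits; the function name is illustrative):

```python
import math

def ensure_probabilities(dist):
    """Defensively convert a class->score dict to probabilities.
    If the values already look like a probability distribution
    (non-negative, summing to ~1), pass them through unchanged;
    otherwise treat them as logits and apply softmax."""
    values = list(dist.values())
    if all(v >= 0.0 for v in values) and abs(sum(values) - 1.0) < 1e-3:
        return dict(dist)
    m = max(values)  # subtract the max for numerical stability
    exps = {k: math.exp(v - m) for k, v in dist.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}
```

This makes the parser agnostic to whether the KServe version honoured `--return_probabilities`, at the cost of a heuristic that could misread an unusual logit vector that happens to look like probabilities.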


```python
# Sequence classification - probability/logit distributions
# Format: {"0": 0.994, "1": 0.006} or {"0": 1.12, "1": -1.53}
if isinstance(prediction, dict) and all(
```
Collaborator

is this check actually useful? after JSON deserialisation, keys will always be strings?

Author

Fixed - removed the redundant isinstance(k, int) check. JSON deserialization always produces string keys.

```python
detected_classes = []

for class_id_key, prob in prediction.items():
    class_id = int(class_id_key) if isinstance(class_id_key, str) else class_id_key
```
Collaborator

@m-misiura m-misiura Oct 16, 2025

if keys are always strings, then the else is redundant

Author

Fixed - simplified to class_id = int(class_id_key) since JSON keys are always strings. Removed the redundant else clause.

```python
if isinstance(user_message, dict):
    user_message = user_message.get("content", "")

if not user_message.strip():
```
Collaborator

does this mean empty messages are fast-forwarded to the LLM? if so, what is the justification for not passing such messages through the detectors?

Author

Fixed - removed empty message bypass. Empty messages now go through the full detector pipeline before reaching the LLM.

```python
config = context.get("config")

if not config:
    return {"allowed": False, "reason": "No configuration"}
```
Collaborator

this uses reason, but e.g. on L359, it uses "error" -- I think there could be an API inconsistency, which should be standardised

Author

Fixed - standardized all error returns to use "reason" field for API consistency.

```python
content_blocks = []
allowing = []

for i, result in enumerate(results):
```
Collaborator

what would be the behaviour of this function in case the ordering assumption breaks (i.e. dict keys iteration no longer matches execution order)?

Author

Fixed - replaced index-based matching with explicit tuples (detector_name, task). Detector names are now retrieved from the tuple rather than assuming dict key order matches result order, eliminating the fragile ordering dependency.
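The tuple-based pairing can be sketched like this (`fake_detector_call` and `run_detectors` are illustrative stand-ins for the real KServe HTTP call and the PR's action, not its actual code):

```python
import asyncio

async def fake_detector_call(name, text):
    # stand-in for the real KServe HTTP call
    return {"detector": name, "blocked": "attack" in text}

async def run_detectors(detector_configs, text):
    """Pair each detector name with its coroutine explicitly, so results
    can never be mismatched even if dict iteration order and result order
    were to diverge somewhere downstream."""
    tasks = [(name, fake_detector_call(name, text)) for name in detector_configs]
    # asyncio.gather preserves argument order, and each result is re-joined
    # to its name through the tuple rather than by positional assumption
    results = await asyncio.gather(*(coro for _, coro in tasks))
    return {name: result for (name, _), result in zip(tasks, results)}
```

Usage: `asyncio.run(run_detectors({"toxicity": {}, "jailbreak": {}}, "some text"))` returns a name-keyed dict regardless of which detector finishes first.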

@@ -0,0 +1,432 @@
"""
Collaborator

@m-misiura m-misiura Oct 16, 2025

Generally, I think this would benefit from adding Pydantic response models to outline the API contract; this is not currently the case as the file uses untyped dicts. Using Pydantic response models would make the API easier to discover, e.g. which fields are required and which are optional

additionally, you may want to consider adding some kind of Pydantic validation to better handle e.g. empty messages like on L239

Author

Fixed - added Pydantic models (DetectorResult and AggregatedDetectorResult) with explicit Field definitions to make the API contract discoverable. All fields are clearly marked as required or optional with defaults.
Re: Pydantic validation for empty messages - didn't add this as we're now passing empty messages through the detector pipeline (as discussed), rather than validating/blocking them at the input level.
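The shape of that contract can be sketched as follows. The model names (`DetectorResult`, `AggregatedDetectorResult`) and some field names come from this thread, but the exact fields and defaults are assumptions; stdlib dataclasses stand in here for the Pydantic models the PR actually uses, which add validation on top of this structure:

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional

@dataclass
class DetectorResult:
    """Illustrative per-detector result shape (the PR uses Pydantic Fields)."""
    allowed: bool
    detector: str
    score: float = 0.0
    risk_type: Optional[str] = None
    reason: Optional[str] = None

@dataclass
class AggregatedDetectorResult:
    """Illustrative aggregate across all configured detectors."""
    allowed: bool
    blocking: List[dict] = field(default_factory=list)
    unavailable_detectors: List[str] = field(default_factory=list)
```

As the author notes below, the validated objects are converted back to plain dicts (Pydantic's `.dict()`, or `asdict` here) before being handed to the Colang runtime.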

```python
}


async def _call_kserve_endpoint(endpoint: str, text: str, timeout: int) -> Dict[str, Any]:
```
Collaborator

does this create async session for every request? is this correct?

Author

Fixed - now using a shared aiohttp.ClientSession created at module load and reused across all detector calls. This eliminates per-request session overhead while maintaining per-request timeout flexibility.

```python
if not kserve_detectors:
    return {"allowed": True, "reason": "No detectors configured"}

log.info(f"Running {len(kserve_detectors)} detectors: {list(kserve_detectors.keys())}")
```
Collaborator

how does synchronous logging work in the async context?

Author

Re: sync logging - Since Python's logging module uses buffered I/O and given our logging frequency is low (per user message), the event loop impact is negligible. I have used standard logging to stay consistent with NeMo Guardrails' own async action implementations.

Core Improvements:
- Add Pydantic response models for clear API contracts and type validation
- Implement shared aiohttp session to eliminate per-request overhead
- Remove empty message bypass to ensure all input goes through detector pipeline

Code Quality Fixes:
- Standardize all error responses to use 'reason' field
- Fix fragile dict ordering with explicit task-name tupling
- Remove redundant JSON key type checks (keys are always strings post-deserialization)
- Simplify class ID conversions by removing unreachable branches

Note: Softmax logic retained for KServe v0.14/v0.15 compatibility (bug fixed in v0.16)
```python
    return f"Input blocked by {det['detector']} detector (risk: {det['risk_type']}, score: {det['score']:.2f})"

# Multiple detectors blocked
detector_names = [d['detector'] for d in blocking]
```
Collaborator

dict access might be unsafe as you might lose Pydantic validation?

Author

Fixed. Dict access is intentional and safe. We use Pydantic models internally for validation during object creation (DetectorResult(...)), then convert to plain dicts via .dict() because NeMo's Colang runtime requires plain data structures and cannot access Pydantic object attributes. Validation occurs before conversion, so no safety is lost. This pattern matches other NeMo actions.

```python
    config: Optional[Any] = None,
    detector_type: str = "toxicity",
    **kwargs
) -> Dict[str, Any]:
```
Collaborator

are these type hints reflective of the Pydantic data models?

Author

Fixed. Type hints are accurate. Functions return Dict[str, Any] which correctly reflects the return value after .dict() conversion. We create Pydantic objects internally for validation, then convert to plain dicts for Colang compatibility. The type hints truthfully describe what gets returned (dict), not the internal Pydantic models.

```python
DEFAULT_TIMEOUT = 30

# Shared HTTP session for all detector calls
_http_session = aiohttp.ClientSession()
```
Collaborator

wouldn't this be considered a resource leak as the session would be created at module import time? additionally, this connection seems to remain open indefinitely?

Author

Fixed. Changed from module-level _http_session = aiohttp.ClientSession() to lazy initialization with asyncio lock inside _call_kserve_endpoint() to avoid import-time side effects and resource leaks.

@@ -0,0 +1,444 @@
"""
Generic KServe Detector Integration for NeMo Guardrails
Collaborator:

this name is not reflective of the fact that this is only compatible with a specific type of huggingface server -- this should be reflected throughout the code

Author:

Fixed. Updated file docstring and function comments from "Generic KServe" to "KServe HuggingFace Detector Integration" to reflect that this integration requires KServe HuggingFace runtime with --return_probabilities flag.

kind: PersistentVolumeClaim
metadata:
name: phi3-model-pvc
namespace: <your-namespace>
Collaborator:

I would remove these references altogether

Author:

Fixed. Removed all namespace: references from YAML examples in documentation.

value: production
- name: OPENAI_API_KEY
value: sk-dummy-key
- name: SAFE_LABELS
Collaborator:

why are safe labels specified in both configmap and deployment? it would be much better to have it only in the configmap

Author:

Fixed. Removed SAFE_LABELS environment variable from Deployment YAML. Safe labels are now configured only in ConfigMap per-detector settings.
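As a hedged illustration of the ConfigMap-only approach (the detector name, endpoint, and label values below are hypothetical), a per-detector entry carrying its own safe labels might look like:

```yaml
kserve_detectors:
  hap-detector:
    endpoint: http://hap-detector.example.svc/v1/models/hap-detector:predict
    safe_labels: ["LABEL_0", "safe"]
    timeout: 30
```

Keeping the labels here, rather than in a Deployment env var, means a label change is a ConfigMap update with no pod redeploy.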

default=30,
description="HTTP request timeout in seconds"
)
risk_type: Optional[str] = Field(
Collaborator:

I am not sure how valuable this field is; perhaps, to simplify the config, it could be removed

Author:

Fixed. Removed risk_type field from both KServeDetectorConfig and DetectorResult. The field was redundant since detector name already identifies the risk category, and system errors are distinguished by label="ERROR".

claimName: phi3-model-pvc
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
Collaborator:

I think it would be useful to show how to configure this with authenticated services too, here is a potential example

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: hap-detector
  annotations:
    serving.knative.openshift.io/enablePassthrough: "true"
    sidecar.istio.io/inject: "true"
    sidecar.istio.io/rewriteAppHTTPProbers: "true"
    serving.kserve.io/deploymentMode: RawDeployment
    security.opendatahub.io/enable-auth: "true"
spec:
  predictor:
    minReplicas: 1
    maxReplicas: 2
    model:
      modelFormat:
        name: huggingface
      args:
        - --model_name=hap-detector
        - --model_id=ibm-granite/granite-guardian-hap-38m
        - --task=sequence_classification
      resources:
        requests:
          cpu: "1"
          memory: "2Gi"
        limits:
          cpu: "2"
          memory: "4Gi"

Author:

Fixed. Added Authentication section to documentation with InferenceService auth annotations example and NeMo token configuration. Also added auth annotations to all detector deployment YAMLs.

persistentVolumeClaim:
claimName: phi3-model-pvc
nodeSelector:
node.kubernetes.io/instance-type: g4dn.2xlarge
Collaborator:

I would remove any instances to node instance-type in these manifests

Author:

Fixed. Removed all nodeSelector blocks referencing node instance types from documentation YAML examples.

- name: nemo-guardrails
image: quay.io/rh-ee-stondapu/trustyai-nemo:latest
imagePullPolicy: Always
env:
Collaborator (@m-misiura, Oct 21, 2025):

it would be useful to also show how to add authentication to e.g. detectors in here

Author:

Fixed. Added KSERVE_API_KEY environment variable example to NeMo deployment documentation and showed per-detector api_key configuration in ConfigMap examples.


api_key = os.getenv("KSERVE_API_KEY")
if api_key:
headers["Authorization"] = f"Bearer {api_key}"
Collaborator:

should detector endpoints only accept the same bearer token for authentication?

Author:

Fixed. Added api_key: Optional[str] field to KServeDetectorConfig for per-detector authentication. Updated _call_kserve_endpoint() to use detector-specific token with fallback to global KSERVE_API_KEY env var. Supports per-detector, global, and unauthenticated patterns.
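The precedence described in this reply (per-detector key, then global `KSERVE_API_KEY`, else unauthenticated) can be sketched as below. `build_auth_headers` is an illustrative helper name, not the PR's actual function; the env var name matches the thread.

```python
import os
from typing import Dict, Optional


def build_auth_headers(detector_api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers, resolving the bearer token by precedence:
    per-detector key -> global KSERVE_API_KEY env var -> no auth."""
    headers = {"Content-Type": "application/json"}
    token = detector_api_key or os.getenv("KSERVE_API_KEY")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

A mix of authenticated and unauthenticated detectors then works without any per-call branching in the action code.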

Collaborator (@m-misiura) left a review comment:

This is getting closer!

Next, if you can, please focus on:

  • updating the user guide to demonstrate how to set this up with authenticated services if possible, as well as removing references to namespaces and node instance-types in the manifests
  • addressing comments pertaining to actions.py (e.g. inconsistent type hints, shared HTTP session resource leak) and config.py (e.g. the necessity of certain fields)
  • adding some unit and perhaps integration tests

- Removed risk_type field (redundant with detector name)
- Implemented per-detector authentication with global fallback
- Added lazy session initialization with lock (fix resource leak)
- Fixed naming to reflect HuggingFace-only compatibility
- Added authentication documentation with ODH example
- Removed namespace and node-type references from docs
- Added 19 unit/integration tests (all passing)
- Added auth annotations to detector deployment YAMLs
@m-misiura force-pushed the develop branch 2 times, most recently from b7d6750 to 603bf72 on November 10, 2025 16:54