35 changes: 18 additions & 17 deletions opendatahub/docs/get-started-odh-modelserving.md
# Getting Started with ODH ModelMesh Serving

The provided deploy script allows you to quickly run OpenDataHub ModelMesh Serving with a provisioned `etcd` server. This deploy script is intended for experimentation or development purposes only, not for production use.

## Prerequisites

- You have `cluster-admin` access to an OpenShift cluster.

- You have installed the **OpenShift CLI** as described in [Installing the OpenShift CLI by downloading the binary](https://docs.openshift.com/container-platform/4.11/cli_reference/openshift_cli/getting-started-cli.html#cli-installing-cli_cli-developer-commands).

- Your model files are stored in a compatible form of remote storage or on a Kubernetes persistent volume.
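
Before you continue, you can sanity-check the first two prerequisites from a terminal. The following is a minimal sketch, assuming you are already logged in to the cluster:

~~~
# Confirm the OpenShift CLI is installed and which cluster you are logged in to
oc version
oc whoami --show-server

# Confirm you have cluster-admin-level access (prints "yes" if allowed)
oc auth can-i '*' '*' --all-namespaces
~~~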

## Namespace Scope

While ModelMesh Serving is available in either cluster scope or namespace scope mode, OpenDataHub ModelServing only supports namespace scope mode.

With namespace scope, you can configure more than one ModelMesh Serving instance on a cluster. However, you must configure each instance in a separate namespace, and all of an instance's components must exist within that single namespace.
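
For example, two teams could each run an independent instance in their own namespace. The namespace names below are hypothetical and for illustration only:

~~~
# Hypothetical: each namespace holds a complete, independent instance
oc get pods -n modelmesh-team-a
oc get pods -n modelmesh-team-b
~~~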

## Deployed Components

The following table describes the components deployed for each ModelMesh Serving instance.

| Component Type       | Pod Name                 | Number of Pods | Default CPU Request/Limit per Pod | Default Memory Request/Limit per Pod |
| -------------------- | ------------------------ | -------------- | --------------------------------- | ------------------------------------ |
| Controller           | ModelMesh Controller pod | 3              | 50m / 1                           | 96Mi / 2Gi                            |
| Object Storage       | MinIO pod (optional)     | 1              | 0m / 0m                           | 0Mi / 0Mi                             |
| Metastore            | etcd pod                 | 1              | 200m / 300m                       | 100Mi / 200Mi                         |
| Built-in Runtime     | OVMS runtime pods        | 0 (\*)         | 500m / 5                          | 1Gi / 1Gi                             |
| ODH Model Controller | ODH Model Controller pod | 3              | 10m / 500m                        | 64Mi / 2Gi                            |
| **Totals**           |                          | 3              | 880m / 9.4                        | 1.58Gi / 13.2Gi                       |
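
After a deployment, you can compare the running pods against this table. The following sketch assumes the default `modelmesh-serving` namespace used by the deploy script:

~~~
# List the deployed pods and print each pod's resource requests and limits
oc get pods -n modelmesh-serving
oc get pods -n modelmesh-serving \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].resources}{"\n"}{end}'
~~~
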
34 changes: 18 additions & 16 deletions opendatahub/quickstart/README.md
# Overview of the OpenDataHub ModelServing Quick Starts

The purpose of these quick starts is to help you learn how to use OpenDataHub ModelServing. They describe features in the OpenDataHub ModelMesh or odh-model-controller and provide relevant examples.

**Note:** These quick starts have been verified for accuracy at the time of their creation, but the manifests or scripts might become outdated and not function as originally intended.

## List of quick starts

- [Sample Model Deployment](./basic/README.md)
- [Sample Model Deployment and Autoscaler](./hpa/README.md)
- [Sample Model Deployment by using a Persistent Volume Claim](./pvc/README.md)

## Quick start files

Each quick start folder contains the following files:
~~~
|-- basic
|-- deploy.sh # Script to deploy the OpenDataHub ModelMesh and all quick start objects
|-- clean.sh # Script to delete all quick start objects
|-- README.md # Documentation that describes how to run the quick start
~~~

## Requirements for running the quick starts

- OpenShift Cluster 4.11+
- Default StorageClass
- OpenShift CLI 4.11+
- At least 8 vCPU and 16 GB memory. For more details, see [Getting Started with ODH ModelMesh Serving](../docs/get-started-odh-modelserving.md).
- You must have `cluster-admin` access to the OpenShift cluster.
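
You can verify the version and storage requirements from a terminal; a quick sketch:

~~~
# Check the client and server versions, and that a default StorageClass exists
oc version
oc get storageclass   # look for an entry marked "(default)"
~~~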

82 changes: 51 additions & 31 deletions opendatahub/quickstart/basic/README.md
# Quick Start - Sample Model Deployment

Welcome to the quick start for deploying a sample model and testing OpenDataHub ModelServing by using its provided inference service.

## Description of the inference service manifest YAML files

There are two inference service manifest YAML files that this quick start uses to specify a model path: `storageUri` or `storagePath`.

- **storagePath**
~~~
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
spec:
  # ... (intermediate fields collapsed in the diff view)
path: onnx/mnist.onnx
~~~

- **storageUri**
~~~
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
  # ... (intermediate fields collapsed in the diff view)
storageUri: s3://modelmesh-example-models/onnx/mnist.onnx
~~~
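
If you want to apply one of these manifests yourself instead of relying on `deploy.sh`, the usual pattern is the following (the file name here is hypothetical):

~~~
oc apply -n modelmesh-serving -f isvc-storage-path.yaml
~~~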

## Prerequisites

- Verify that you meet the requirements for running the quick starts listed in [Overview of the OpenDataHub ModelServing Quick Starts](../README.md).
- Install OpenDataHub ModelServing as described in [Installing OpenDataHub ModelServing](../common_docs/modelmesh-install.md).

## Deploy a Sample Model

Deploy the sample model:
~~~
./deploy.sh
~~~
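
The commands in the following sections reference the `TEST_MM_NS` and `COMMON_MANIFESTS_DIR` environment variables. If your shell does not already have them set by the deploy script, export them first; the values below are assumptions based on the namespace and manifests used elsewhere in this quick start:

~~~
export TEST_MM_NS=modelmesh-serving              # namespace used by this quick start
export COMMON_MANIFESTS_DIR=../common_manifests  # assumed path to the shared manifests
~~~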

## Check the model deployment status

1. Check whether your model is ready by getting the OpenDataHub ModelServing's inference service:
~~~
$ oc get isvc -n modelmesh-serving
~~~
You should see a result similar to the following:
~~~
NAME URL READY PREV LATEST PREVROLLEDOUTREVISION LATESTREADYREVISION AGE
example-onnx-mnist grpc://modelmesh-serving.modelmesh-serving:8033 True 4m
~~~
Note that this result includes the `gRPC` URL that you can use to access the model.

For the `HTTP` URL, you can check routes.
2. To obtain the `HTTP` URL for the model, run the following command to get the routes:
~~~
$ oc get routes
~~~
You should see a result similar to the following:
~~~
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
example-onnx-mnist example-onnx-mnist-modelmesh-serving.apps.jlee-test.l9ew.p1.openshiftapps.com /v2/models/example-onnx-mnist modelmesh-serving 8008 edge/Redirect None
~~~
## Perform inference requests

After the model is deployed, you can perform inference requests. OpenDataHub ModelServing includes an additional controller, `odh-model-controller`, that is responsible for creating an OpenShift Route for the model and for managing authentication. These features are controlled by the `enable-route` and `enable-auth` ServingRuntime annotations. By default, both are disabled (set to `false`), but for this quick start, `enable-route` is set to `true`.

The following `curl` examples demonstrate how to perform inference requests.

**Curl test without authentication enabled**
~~~
export HOST_URL=$(oc get route example-onnx-mnist -ojsonpath='{.spec.host}' -n ${TEST_MM_NS})
export HOST_PATH=$(oc get route example-onnx-mnist -ojsonpath='{.spec.path}' -n ${TEST_MM_NS})

curl --silent --location --fail --show-error --insecure https://${HOST_URL}${HOST_PATH}/infer -d @${COMMON_MANIFESTS_DIR}/input-onnx.json

{"model_name":"example-onnx-mnist__isvc-b29c3d91f3","model_version":"1","outputs":[{"name":"Plus214_Output_0","datatype":"FP32","shape":[1,10],"data":[-8.233053,-7.7497034,-3.4236815,12.3630295,-12.079103,17.266596,-10.570976,0.7130762,3.321715,1.3621228]}]}
~~~

**Note**: If authentication is enabled, the route port should be `8080`.

~~~
$ oc get route
~~~
You should see a result similar to the following:
~~~
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
example-onnx-mnist example-onnx-mnist-modelmesh-serving.apps.jlee-test.l9ew.p1.openshiftapps.com /v2/models/example-onnx-mnist modelmesh-serving 8008 edge/Redirect None
~~~

**gRPC Curl test by using port-forward**

~~~
oc port-forward --address 0.0.0.0 service/modelmesh-serving 8033 -n ${TEST_MM_NS}

# ... (grpcurl invocation collapsed in the diff view)
cd -
~~~
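
The collapsed portion of the block above is the `grpcurl` invocation itself. A typical call, modeled on the upstream ModelMesh Serving documentation (the proto file and payload file names are assumptions), looks like this:

~~~
grpcurl \
  -plaintext \
  -proto kfs_inference_v2.proto \
  -d @ \
  localhost:8033 \
  inference.GRPCInferenceService.ModelInfer < input-grpc.json
~~~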

**Curl test with authentication enabled**

You can enable authentication for testing purposes by setting the `enable-auth` annotation in the ServingRuntime to `true`. When you enable authentication, you should also send the token of the user with access to the route.
~~~
# Enable Auth for OVMS ServingRuntime
oc apply -f ${COMMON_MANIFESTS_DIR}/sa_user.yaml -n ${TEST_MM_NS}
# ... (enable-auth annotation and token setup collapsed in the diff view)
curl -H "Authorization: Bearer ${Token}" --silent --location --fail --show-error --insecure https://${HOST_URL}${HOST_PATH}/infer -d @${COMMON_MANIFESTS_DIR}/input-onnx.json
{"model_name":"example-onnx-mnist__isvc-b29c3d91f3","model_version":"1","outputs":[{"name":"Plus214_Output_0","datatype":"FP32","shape":[1,10],"data":[-8.233053,-7.7497034,-3.4236815,12.3630295,-12.079103,17.266596,-10.570976,0.7130762,3.321715,1.3621228]}]}
~~~
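
The collapsed setup in the block above annotates the ServingRuntime and obtains a user token. One way to mint a token for the test ServiceAccount is shown below; the account name `user-one` is an assumption about what `sa_user.yaml` creates:

~~~
# Mint a short-lived token for the test ServiceAccount (name is assumed)
Token=$(oc create token user-one -n ${TEST_MM_NS})
~~~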

**Note**: If authentication is enabled, the route port should be `8443`.
~~~
$ oc get route
~~~
You should see a result similar to the following:
~~~
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
example-onnx-mnist example-onnx-mnist-modelmesh-serving.apps.jlee-test.l9ew.p1.openshiftapps.com /v2/models/example-onnx-mnist modelmesh-serving 8443 reencrypt/Redirect None
~~~

## Cleanup

Follow the steps in [Cleaning up an OpenDataHub ModelServing installation](../common_docs/modelmesh-cleanup.md).
10 changes: 4 additions & 6 deletions opendatahub/quickstart/common_docs/modelmesh-cleanup.md
# Cleaning up an OpenDataHub ModelServing installation

Run one of the following commands to clean up the quick start resources:

- If you want to try another quick start, run this command to delete the `modelmesh` and modelmesh test namespaces (`minio` and `pvc`):
~~~
./cleanup.sh
~~~

- If you are done with all of the quick starts, run this command to delete the `modelmesh` and modelmesh test namespaces (`minio` and `pvc`) and the NFS provisioner:
~~~
C_FULL=true ./cleanup.sh
~~~