KFServing
kfserving-system/inferenceservice-config
- Most of the settings for KFServing are stored in this ConfigMap.
- For each predictor (tensorflow, onnx, sklearn, xgboost, pytorch, triton, pmml, lightgbm), the serving image and image version are configured here.
- AWS and GCP credentials, if any, are stored here.
- Details of which ingress gateway and ingress service are used are stored here, for example:
```yaml
ingress: |-
  {
    "ingressGateway" : "$(ingressGateway)",
    "ingressService" : "istio-ingressgateway.istio-system.svc.cluster.local",
    "localGateway" : "cluster-local-gateway.knative-serving",
    "localGatewayService" : "cluster-local-gateway.istio-system.svc.cluster.local"
  }
```
- The KFServing agent, batcher, logger, storageInitializer, and explainer details are also stored here.
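To inspect the live configuration, you can dump the ConfigMap directly; this is a minimal sketch using the ConfigMap and namespace names described above:

```bash
# Print the full inferenceservice-config ConfigMap, including the
# predictors, ingress, logger, batcher, and storageInitializer entries
kubectl get configmap inferenceservice-config -n kfserving-system -o yaml
```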
To list the Knative services backing the InferenceServices:
```
$ kubectl get ksvc -n serving
NAME                           URL                                                                         LATESTCREATED                        LATESTREADY                          READY   REASON
mnist-isvc-predictor-default   http://mnist-isvc-predictor-default-serving.kf.sb.us.aa.apollo.roche.com   mnist-isvc-predictor-default-r6pbq   mnist-isvc-predictor-default-r6pbq   True
s3svc-predictor-default        http://s3svc-predictor-default-serving.kf.sb.us.aa.apollo.roche.com        s3svc-predictor-default-rb875        s3svc-predictor-default-rb875        True
```
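The InferenceService objects themselves can also be listed via the `inferenceservice` resource (short name `isvc`); the namespace below matches the example above:

```bash
# List InferenceServices with their URL and readiness status
kubectl get inferenceservices -n serving
```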
A sample InferenceService YAML file:
```yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: test-mn
  namespace: serving
spec:
  predictor:
    canaryTrafficPercent: 100
    serviceAccountName: kfs-sa
    tensorflow:
      name: kfserving-container
      resources:
        limits:
          cpu: "1"
          memory: 2Gi
        requests:
          cpu: "1"
          memory: 2Gi
      runtimeVersion: 1.14.0
      storageUri: s3://kubeflow-usw2-s3bucket-user-datasets-dev/project1/mnist
```
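A sketch of applying the manifest and sending a prediction request; the file name, input payload, and external hostname are assumptions (the hostname is inferred from the `<name>-predictor-default-<namespace>.<domain>` pattern visible in the ksvc listing above):

```bash
# Create the InferenceService from the manifest above
kubectl apply -f test-mn.yaml

# Once READY, query the TensorFlow Serving v1 predict endpoint
curl -v \
  -H "Content-Type: application/json" \
  -d @./mnist-input.json \
  http://test-mn-predictor-default-serving.kf.sb.us.aa.apollo.roche.com/v1/models/test-mn:predict
```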
When the above YAML is applied, it creates the following resources in the backend:
- An InferenceService
- An Istio VirtualService
- A Knative Service
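These can be verified in a single call (a sketch; `virtualservice` is the Istio CRD, which is present in any cluster where KFServing's Istio-based ingress is installed):

```bash
# Show the InferenceService plus the VirtualService and Knative Service
# that were generated for it
kubectl get inferenceservice,virtualservice,ksvc -n serving
```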