MLflow with Helm and Serving a Trained Model on Kubernetes

Dounpct · 5 min read · Aug 6, 2023


Part 7: Serve Model API on Cluster

In this part, I will show how to serve the trained model behind an API server on the cluster.

Build the Model Image

  • Create Dockerfile
FROM python:3.9

ENV HOME="/root"
WORKDIR ${HOME}

# RUN pip install mlflow==2.1.1 google-cloud-storage pathlib==1.0.1 lz4==3.1.3 psutil==5.9.0 typing-extensions==4.3.0 cloudpickle==2.2.1
RUN pip install google-cloud-storage

COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

# MODEL_URI and SERVING_PORT are injected at deploy time (see the Kubernetes manifests below)
ENV MODEL_URI ${MODEL_URI}
ENV SERVING_PORT ${SERVING_PORT}

# RUN apt-get install -y git
# RUN git clone --depth=1 https://github.com/pyenv/pyenv.git .pyenv
# ENV PYENV_ROOT="${HOME}/.pyenv"
# ENV PATH="${PYENV_ROOT}/shims:${PYENV_ROOT}/bin:${PATH}"

COPY serving.sh /serving.sh

CMD [ "/bin/bash", "/serving.sh" ]
# CMD [ "sh", "-c", "mlflow models serve --model-uri $MODEL_URI -h 0.0.0.0 -p $SERVING_PORT --no-conda"]
  • Create requirements.txt
mlflow==2.1.1
pathlib==1.0.1
lz4==3.1.3
psutil==5.9.0
typing-extensions==4.3.0
cloudpickle==2.2.1
  • Create serving.sh
#!/bin/sh

mlflow models serve --model-uri "$MODEL_URI" -h 0.0.0.0 -p "$SERVING_PORT" --no-conda
  • Build the image
docker build -t mlflow .
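Before pushing, you can smoke-test the image locally. A minimal sketch, assuming a hypothetical bucket path and a service-account key file in the current directory:

docker run --rm -p 8082:8082 \
  -e MODEL_URI="gs://<your-bucket>/artifacts/model" \
  -e SERVING_PORT=8082 \
  -e GOOGLE_APPLICATION_CREDENTIALS=/etc/secrets/keyfile.json \
  -v "$(pwd)/keyfile.json:/etc/secrets/keyfile.json:ro" \
  mlflow:latest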
  • Push the image to GCR
docker tag mlflow:latest asia.gcr.io/gcp-devops/devops/mlflow_serving:latest
docker push asia.gcr.io/gcp-devops/devops/mlflow_serving:latest

Note: automate these build-and-push steps with the CI tool of your choice, such as Jenkins, Cloud Build, or GitHub Actions.

Deploy the Image with ArgoCD

  • Create application mlflow-app
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mlflow-app
spec:
  project: mlflow
  source:
    repoURL: 'https://github.com/dounpct/argocd-deployment.git'
    path: mlflow-app
    targetRevision: master
    plugin: {}
  destination:
    server: 'https://kubernetes.default.svc'
    namespace: mlflow-app
  syncPolicy:
    syncOptions:
    - CreateNamespace=true
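To register the application, apply this manifest to the namespace where ArgoCD itself runs (argocd in a default install), assuming you saved it as mlflow-app.yaml:

kubectl apply -n argocd -f mlflow-app.yaml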
  • Create the folder mlflow-app in the deployment repository.
  • Create a plugin folder containing an empty file cmp-plugin-yaml.yaml. This is needed because mlflow-app holds only plain YAML manifests, and the cmp-plugin-yaml.yaml marker lets ArgoCD know which config-management plugin should render the folder. A sketch of the resulting repository layout is shown below.
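Assuming the manifests from the following steps, the deployment repository ends up roughly like this:

argocd-deployment/
└── mlflow-app/
    ├── plugin/
    │   └── cmp-plugin-yaml.yaml   # empty marker file for the ArgoCD plugin
    ├── ingress.yaml
    ├── secret.yaml
    └── application.yaml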
  • Create ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-with-auth-mlflow-app
  annotations:
    # type of authentication
    nginx.ingress.kubernetes.io/auth-type: basic
    # name of the secret that contains the user/password definitions
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    # message to display with an appropriate context why the authentication is required
    nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
spec:
  ingressClassName: nginx
  rules:
  - host: nginx-auth-test.local.net
    http:
      paths:
      - path: /invocations
        pathType: Prefix
        backend:
          service:
            name: mlflow-serving
            port:
              number: 8082
  • Create secret.yaml
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: gcsfs-creds
data:
  keyfile.json: <path:projects/362159383816/secrets/google_credentials_json#google_credentials_json | base64encode>
---
apiVersion: v1
kind: Secret
metadata:
  name: gcr-authen
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: <path:projects/362159383816/secrets/GCR_DOCKERCONFIGJSON#GCR_DOCKERCONFIGJSON | base64encode>
---
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: basic-auth
data:
  auth: <path:projects/362159383816/secrets/tdg_ingress_basic_authen#tdg_ingress_basic_authen | base64encode>
  user: <path:projects/362159383816/secrets/tdg_ingress_basic_authen_user#tdg_ingress_basic_authen_user | base64encode>
  password: <path:projects/362159383816/secrets/tdg_ingress_basic_authen_password#tdg_ingress_basic_authen_password | base64encode>
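The auth key of the basic-auth secret must contain an htpasswd-formatted entry, since that is the format the NGINX ingress controller reads. A minimal sketch of generating such an entry for the user ingress_user (store the output in your secret manager; the | base64encode in the placeholder handles the encoding):

htpasswd -nb ingress_user 'your-password'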
  • Go to GCS and copy the gsutil URI of the model artifacts (this becomes MODEL_URI below).
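If you prefer the CLI, you can also find the model path with gsutil. A sketch with a hypothetical bucket, experiment ID, and run ID:

gsutil ls gs://<your-bucket>/<experiment-id>/<run-id>/artifacts/model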
  • Create application.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-serving
  labels:
    app: serve-ML-model-mlflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mlflow-serving
  template:
    metadata:
      labels:
        app: mlflow-serving
    spec:
      containers:
      - name: mlflow-serving
        image: asia.gcr.io/xxxxxxxx/mlflow_serving:latest #change here
        env:
        - name: MODEL_URI
          value: "gs://xxxxxxxxxx/artifacts/model" #change here
        - name: SERVING_PORT
          value: "8082"
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: "/etc/secrets/keyfile.json"
        volumeMounts:
        - name: gcsfs-creds
          mountPath: "/etc/secrets"
          readOnly: true
        resources:
          limits:
            cpu: 1000m
            memory: 600Mi
          requests:
            cpu: 500m
            memory: 300Mi
      imagePullSecrets:
      - name: gcr-authen
      volumes:
      - name: gcsfs-creds
        secret:
          secretName: gcsfs-creds
          items:
          - key: keyfile.json
            path: keyfile.json
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: mlflow-serving
  name: mlflow-serving
spec:
  ports:
  - name: http-web
    port: 8082
    protocol: TCP
    targetPort: 8082
  selector:
    app: mlflow-serving
  type: ClusterIP
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: mlflow-serving
spec:
  minReplicas: 1
  maxReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mlflow-serving
  targetCPUUtilizationPercentage: 60
  • Check application on GKE
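For example, to confirm that the Deployment, its pods, and the HPA are up (resource names as defined in application.yaml):

kubectl get pods -n mlflow-app
kubectl logs deploy/mlflow-serving -n mlflow-app
kubectl get hpa mlflow-serving -n mlflow-app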
  • Check ingress
kubectl get ing -n mlflow-app
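Since nginx-auth-test.local.net is not a public DNS name, you may need to point it at the ingress controller's address yourself. A sketch that appends an /etc/hosts entry from the Ingress status (the field path is standard, but verify it on your cluster):

INGRESS_IP=$(kubectl get ing ingress-with-auth-mlflow-app -n mlflow-app \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "$INGRESS_IP nginx-auth-test.local.net" | sudo tee -a /etc/hosts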
  • Test API
curl -X POST -H "Content-Type: application/json" \
  --data '{"dataframe_split": {"data":[[
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 1.0, 0.0]]}}' \
  http://nginx-auth-test.local.net/invocations -u 'ingress_user:xxxxxxxxxxxxxxxxxxxxxx' | jq
### RESULT ###
{
  "predictions": [
    "yes"
  ]
}
  • Great! As you can see, we now have an API, backed by an HPA, that scales to serve prediction traffic. A quick way to watch the autoscaling in action is shown below.
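To see the HPA scale out, generate sustained load and watch it from another terminal. A rough sketch, assuming PAYLOAD holds the JSON body from the test above:

# terminal 1: watch the autoscaler
kubectl get hpa mlflow-serving -n mlflow-app -w

# terminal 2: replay the prediction request in a loop
while true; do
  curl -s -o /dev/null -X POST -H "Content-Type: application/json" \
    --data "$PAYLOAD" -u 'ingress_user:xxxxxxxxxxxxxxxxxxxxxx' \
    http://nginx-auth-test.local.net/invocations
done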
  • Note: I run MLflow on GKE with both GCS and MinIO (S3) artifact stores, so here is the MinIO variant of application.yaml.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mlflow-serving
  labels:
    app: serve-ML-model-mlflow
spec:
  selector:
    matchLabels:
      app: mlflow-serving
  template:
    metadata:
      labels:
        app: mlflow-serving
    spec:
      containers:
      - name: mlflow-serving
        image: gcr.io/xxxxxxxxxxx/ml-training-api:v0.0.5 #change here
        env:
        - name: MODEL_URI
          value: "s3://mlflow/1/xxxxxxxxxxxxxxxxxxx/artifacts/model" #change here
        - name: SERVING_PORT
          value: "8082"
        - name: MLFLOW_S3_ENDPOINT_URL
          value: "https://minio-ml-hl.minio-ml.svc.cluster.local:9000"
        - name: MLFLOW_S3_IGNORE_TLS
          value: "true"
        resources:
          limits:
            cpu: 1000m
            memory: 600Mi
          requests:
            cpu: 500m
            memory: 300Mi
        envFrom:
        - secretRef:
            name: mlflow-app-secrets
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: mlflow-serving
  name: mlflow-serving
spec:
  ports:
  - name: http-web
    port: 8082
    protocol: TCP
    targetPort: 8082
  selector:
    app: mlflow-serving
  type: ClusterIP
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: mlflow-serving
spec:
  minReplicas: 1
  maxReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mlflow-serving
  targetCPUUtilizationPercentage: 60
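The envFrom above assumes a Secret named mlflow-app-secrets carrying the MinIO credentials; MLflow's S3 client reads the standard AWS variables. A sketch of creating it by hand:

kubectl create secret generic mlflow-app-secrets -n mlflow-app \
  --from-literal=AWS_ACCESS_KEY_ID=<minio-access-key> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<minio-secret-key>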
  • Have fun !!!

This is the end of “MLflow with Helm and Serving a Trained Model on Kubernetes”.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Credit: TrueDigitalGroup

— — — — — — — — — — — — — — — — — — — — — — — — — — — — —
