Authenticated gRPC service on Kubernetes

May 1, 2023
Share this post
https://www.truefoundry.com/blog/authenticated-grpc-service-on-eks-with-istio
URL

In our blog series on Kubernetes, we talked about buildings scalable MLOps on Kubernetes, architecture for MLOps, and solving application development. In this blog, we will talk about hosting a GRPC service on AWS EKS cluster. The process is roughly going to be the same for every Kubernetes cluster — however, we had to do some specific settings on the AWS Load balancer for this to work.

What is gRPC & our usecase

gRPC is an open-source RPC framework that can run in any environment. It is capable of efficiently connecting services within and between data centers, with the ability to plug in support for load balancing, tracing, health checking, and authentication.

Our usecase:  Hosting Tensorflow models as APIs that accepted a payload of around 100MB in size. GRPC performs much better for larger payloads — so we exposed the GRPC port on port 5000.

Hosting the service

We hosted the service on Kubernetes using the Deployment YAML below:

apiVersion: apps/v1
kind: Deployment
metadata:
 name: ml-api
 namespace: ml-services
spec:
 replicas: 1
 selector:
   matchLabels:
     truefoundry.com/component: ml-api
 template:
   metadata:
     labels:
       truefoundry.com/application: ml-api
   spec:
     containers:
       - name: ml-api
         image: >-
           XXXX.dkr.ecr.us-east-1.amazonaws.com/ml-services-ml-api:latest
         ports:
           - name: port-8500
             containerPort: 8500
             protocol: TCP
         resources:
           limits:
             cpu: '4'
             ephemeral-storage: 2G
             memory: 4G
           requests:
             cpu: '1'
             ephemeral-storage: 1G
             memory: 500M
         imagePullPolicy: IfNotPresent
     restartPolicy: Always
     terminationGracePeriodSeconds: 30
     dnsPolicy: ClusterFirst
     securityContext: {}
     imagePullSecrets:
       - name: ml-api-image-pull-secret
     schedulerName: default-scheduler
 strategy:
   type: RollingUpdate
   rollingUpdate:
     maxUnavailable: 25%
     maxSurge: 0

This will bring up the pod. We need to create the Service object using the YAML below:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
 labels:
   argocd.argoproj.io/instance: tfy-istio-ingress
 name: tfy-wildcard
 namespace: istio-system
spec:
 selector:
   istio: tfy-istio-ingress
 servers:
   - hosts:
       - 'ml.example.com'
     port:
       name: http-tfy-wildcard
       number: 80
       protocol: HTTP
     tls:
       httpsRedirect: true
   - hosts:
       - 'ml.example.com'
     port:
       name: https-tfy-wildcard
       number: 443
       protocol: HTTP

Exposing the service

We are using Istio as the ingress layer in Kubernetes. Istio provisions a Load Balancer when the istio-ingress is installed. The load balancer configuration can be customized using annotations on the istio gateway. The spec for creating the Istio Gateway is as follows:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
 labels:
   argocd.argoproj.io/instance: tfy-istio-ingress
 name: tfy-wildcard
 namespace: istio-system
spec:
 selector:
   istio: tfy-istio-ingress
 servers:
   - hosts:
       - 'ml.example.com'
     port:
       name: http-tfy-wildcard
       number: 80
       protocol: HTTP
     tls:
       httpsRedirect: true
   - hosts:
       - 'ml.example.com'
     port:
       name: https-tfy-wildcard
       number: 443
       protocol: HTTP

We are doing the SSL termination on the AWS Load Balancer. For this we have to attach the certificate to the Load Balancer. This can be achieved using the annotations below to the istio gateway chart (https://istio-release.storage.googleapis.com/charts).

"service.beta.kubernetes.io/aws-load-balancer-type": "nlb"
"service.beta.kubernetes.io/aws-load-balancer-backend-protocol": "tcp"
"service.beta.kubernetes.io/aws-load-balancer-ssl-cert": "<certificate-arn>"
"service.beta.kubernetes.io/aws-load-balancer-ssl-ports": "https"
"service.beta.kubernetes.io/aws-load-balancer-alpn-policy": "HTTP2Preferred"

The alpn-policy is important to specify to allow GRPC traffic. Our service ml-api can be exposed by creating a VirtualService pointing to the Kubernetes service. The YAML for the Virtual Service is as follows:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
 labels:
   argocd.argoproj.io/instance: ml-services_ml-api
 name: ml-apiport-8500-vs
 namespace: ml-services
spec:
 gateways:
   - istio-system/tfy-wildcard
 hosts:
   - ml.example.com
 http:
   - route:
       - destination:
           host: ml-api
           port:
             number: 8500

Once the virtual service is exposed, we can make requests to our service at ml.example.com. We then wanted to add an authentication to the API so that everybody cannot call the API. We could have added the authentication in the code, but we decided to add it at the istio layer so that it can be a unified layer across all services.

To add authentication at the istio-ingress layer, we decided to go ahead with a IstioWasm Plugin. The yaml for the plugin looks something like:

apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
 name: ml-services-ml-api-0
 namespace: istio-system
spec:
 phase: AUTHN
 pluginConfig:
   basic_auth_rules:
     - credentials:
         - username:password
       hosts:
         - ml.example.com
       prefix: /
       request_methods:
         - GET
         - PUT
         - POST
         - PATCH
         - DELETE
 selector:
   matchLabels:
     istio: tfy-istio-ingress
 url: oci://ghcr.io/istio-ecosystem/wasm-extensions/basic_auth:1.12.0

Once you have applied the above spec to the cluster, the app will ask for the username and password once you open it in the browser.

We have made it super easy on TrueFoundry

To make the above process much easier, we decide to make it really easy on the Truefoundry platform.

GRPC service on Kubernetes
Build, Train, and Deploy LLM/ML Faster
Start Your Free 7-Day Trial Now!

Discover More

May 16, 2024

Volumes on Kubernetes

Kubernetes
May 3, 2024

Top Tools for Fine-tuning

Engineering and Product
February 29, 2024

SSH Server Containers For Development on Kubernetes

Authentication
Engineering and Product
March 6, 2024

Prompting, RAG or Fine-tuning - the right choice?

Engineering and Product

Related Blogs

No items found.

Blazingly fast way to build, track and deploy your models!

pipeline