Argo CD: The GitOps Way to Deploy ML Applications
Use Argo CD to deploy your ML applications to a Kubernetes cluster
Introduction
Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes applications. It automates the deployment of applications to Kubernetes clusters by monitoring changes to a Git repository and continuously applying the desired state to the cluster.
Argo CD allows you to define the desired state of your application as plain Kubernetes manifests, Helm charts, Kustomize overlays, or other supported configuration formats. You can also define application dependencies and, in combination with Argo Rollouts, custom rollout strategies like blue-green and canary deployments.
Argo CD provides a web UI, CLI, and API to manage and monitor the application deployments. It also integrates with other Kubernetes tools like Prometheus for monitoring, Istio for traffic management, and Vault for secrets management.
Argo CD is an open-source project, and you can use it to deploy and manage your applications on any Kubernetes cluster.
Argo CD for ML Applications
Argo CD addresses the problem of deploying ML applications by providing a declarative, GitOps-based approach to managing application deployments in Kubernetes clusters. Here are some key ways Argo CD addresses this problem:
- Declarative approach: Argo CD uses a declarative approach to manage application deployments, which means you define the desired state of the application in a Git repository and let Argo CD continuously apply the changes to the cluster. This makes it easier to manage and version control deployments, and reduces the risk of configuration drift or human error.
- GitOps methodology: Argo CD follows the GitOps methodology, which means that all changes to the application deployments are made through Git commits and pull requests. This ensures that all changes are tracked, reviewed, and audited, and reduces the risk of unauthorized changes or misconfigurations.
- Integration with ML platforms and tools: Argo CD can deploy the Kubernetes manifests produced by ML platforms and tools like Kubeflow, MLflow, and TensorFlow or PyTorch model servers, which makes it easier to build, train, and deploy ML models and applications in a scalable and efficient way. Combined with Argo Rollouts, it also supports common deployment patterns such as canary releases and blue-green deployments.
- Scalability and customization: Argo CD is designed for managing large-scale Kubernetes deployments and can handle complex application dependencies, custom rollout strategies, and multiple environments. It also provides a web UI, CLI, and API for managing and monitoring deployments, which makes it easier to customize and automate the deployment process.
- Security and compliance: Argo CD supports best practices for security and compliance, such as RBAC (role-based access control) and integration with external secrets management tools like HashiCorp Vault. This ensures that the deployment process is secure and compliant with organizational policies and regulations.
In summary, Argo CD provides a declarative and GitOps-based approach to continuous deployment of ML applications, which makes it easier to manage the complexity of deploying ML models and applications to Kubernetes clusters.
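To make this concrete, the desired state of an application is itself described declaratively in an Argo CD Application resource stored in Git. A minimal sketch is shown below; the repository URL, path, and names are placeholders for illustration, not taken from the guide's repository.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: recommendation-service
  namespace: argocd
spec:
  project: default
  source:
    # placeholder repository and path
    repoURL: https://github.com/<your-org>/<your-repo>.git
    targetRevision: main
    path: k8s/movies
  destination:
    server: https://kubernetes.default.svc
    namespace: movies
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
With automated sync enabled as in this sketch, Argo CD prunes removed resources and reverts manual changes to keep the cluster in line with Git.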
Given below is a step-by-step guide on how to install and use Argo CD to manage applications in a Kubernetes cluster. This guide uses a kind cluster, but other clusters, including minikube and Amazon EKS, can be used as well.
Scenario
Consider a streaming service with about 1 million customers and 27,000 movies, for which you are required to build a recommendation service. One of the requirements is to take care of the operations of this recommendation service in production, which entails many concerns, an important one being deployment.
Deployment is an important concern in ML applications because it determines how the ML model or algorithm is integrated into a production environment and how it will be used to make predictions or decisions. Here are some key reasons why deployment is critical for ML applications:
- Real-world impact: ML models are designed to solve real-world problems, and their effectiveness depends on how well they are deployed and integrated into production systems. Deploying an ML model that doesn’t perform well or doesn’t integrate with existing systems can have a negative impact on the business or organization.
- Scalability: As ML models become more complex and data volumes grow, deployment can become a bottleneck for scaling up the application. Deploying ML models in a scalable and efficient way is critical for handling large volumes of data and maintaining fast response times.
- Maintenance and updates: ML models require ongoing maintenance and updates to continue performing well over time. Deployment practices should facilitate easy maintenance and updates of the ML application, without disrupting the production environment.
All application code and kubernetes yaml definitions can be found at: https://github.com/cmu-seai/group-project-s23-the-incredi-codes/tree/cdgamaro/argocd
Step 1: Creating the Kubernetes cluster using kind
Make sure Docker is running, and use the following configuration to bring up a Kubernetes cluster.
kind create cluster --config k8s/kind/config.yaml
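The exact contents of k8s/kind/config.yaml are in the repository linked above; if you are building a similar setup from scratch, a minimal kind configuration with one control-plane node and one worker node (the node layout here is an assumption, not the repository's actual file) looks like this:
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker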

Once the cluster comes up successfully, we can move on to installing Argo CD on this cluster.
Step 2: Install Argo CD
Start by creating a namespace and then deploying Argo CD.
kubectl create ns argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
This will create a number of resources in the argocd namespace.
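Before moving on, you can check that the Argo CD components are up and running:
kubectl get pods -n argocd
Wait until all the pods are in the Running state.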

Step 3: Make the UI accessible outside the cluster
By default, the argocd-server Kubernetes service is not accessible outside the cluster. One of the methods that can be used to make it accessible is port-forwarding.
kubectl port-forward -n argocd service/argocd-server 8443:443
Note: You can also use other methods like changing the service to a NodePort service or using an Ingress resource.
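For example, to expose the service as a NodePort instead of port-forwarding, you could patch it as follows:
kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "NodePort"}}'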
The Argo CD UI should now be accessible at https://localhost:8443/
In order to get the password to log in to the dashboard, run the following command. The default username is admin.
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d && echo

And voila! You have successfully accessed the Argo CD UI and are ready to deploy your applications.
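If you prefer working from the terminal, you can also log in with the argocd CLI against the same port-forwarded endpoint (the --insecure flag is needed here because the server presents a self-signed certificate in this setup):
argocd login localhost:8443 --username admin --password <password-from-previous-step> --insecure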

Step 4: Add the GitHub repository in the Argo CD UI
Open “Settings” from the left-hand side menu, and click on “Repositories”.

In order to provide access, create a personal access token for your GitHub repositories following the instructions at this link.
Enter the HTTPS URL of the repository, along with a username and the newly created personal access token.

This will add the repository and provide access to the Kubernetes manifests present in this repository. You should see it listed as shown below.

Step 5: Create a new application for the recommendation service
Click on “New App” on the “Applications” screen. Give the application name and choose the “default” project.

Since the repository has been added, you should see it listed under URLs. You should also be able to see all the branches for this repository. Choose the revision you want and provide the exact path to the YAML definitions for the Kubernetes resources you want to create.
The YAML definitions for the Kubernetes resources are also shown below.
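Since the exact manifests live in the linked repository, the version below is an illustrative sketch rather than a copy of the repository's files; the image name and container port are assumptions, and the namespace is supplied by the application's destination setting.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: flask
  template:
    metadata:
      labels:
        app: flask
    spec:
      containers:
      - name: flask
        # placeholder image for the recommendation service
        image: <your-registry>/recommendation-service:latest
        ports:
        - containerPort: 8082
---
apiVersion: v1
kind: Service
metadata:
  name: flask-service
spec:
  selector:
    app: flask
  ports:
  - port: 80
    targetPort: 8082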

Choose the cluster URL as “https://kubernetes.default.svc” and give the name of the namespace you want this application created in.

Finally, click on “Create”. You should see the application on your applications screen.

Step 6: Synchronizing the application
You should see the status of the newly created application as “Missing” and “OutOfSync”. This is expected; based on the configured sync settings, the application needs to be synchronized manually. Click on “Sync” and then click on “Synchronize”.

The application status will change to “Progressing” and finally to “Healthy”.

You can use “kubectl” to verify that all the defined resources have been created in the “movies” namespace.
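For example, assuming the deployment, service, and pods described above were created in the “movies” namespace, the following should list them all:
kubectl get deployments,services,pods -n movies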

Step 7: Testing the service
In order for the flask-service to be accessible, you need to set up a similar port-forward, as shown below.
kubectl -n movies port-forward service/flask-service 2211:80

This makes the prediction service available, and it can be tested from the browser or using cURL. An example of the results is shown below.
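The exact route depends on the Flask application; assuming it exposes a recommendation endpoint keyed by user ID, a request through the port-forward could look like this (the path and user ID are placeholders):
curl http://localhost:2211/recommend/123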

Step 8: Monitoring
Instead of using the “kubectl” CLI tool, you can monitor your pods directly from the Argo CD UI.
Click on the app.

Click on the flask-deployment-pod.

You will be able to see an “Events” tab and a “Logs” tab.


Step 9: Updating the application
If you make any change in the GitHub repository and click on “Refresh” on the applications screen, you will see that the application goes “OutOfSync”.

Click on “Synchronize” to make sure that the application is up to date with what is in Git.
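If you would rather not click “Synchronize” after every commit, you can enable automated sync for the application, either from the UI or with the argocd CLI (the application name below is a placeholder):
argocd app set <app-name> --sync-policy automated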

Step 10: Deleting the application
To delete the application, simply click on “Delete” and type in the name to confirm the deletion.

Step 11: Delete the kind cluster
Once done, you can delete the entire cluster by running:
kind delete cluster
Strengths and Weaknesses
Argo CD is a powerful tool for automating the deployment of applications to Kubernetes clusters. Here are some strengths and limitations of Argo CD:
Strengths:
- Declarative and GitOps-based approach: Argo CD uses a declarative approach and follows the GitOps methodology, which makes it easier to manage and version control application deployments. You can define the desired state of the application in a Git repository and let Argo CD continuously apply the changes to the cluster.
- Scalability and customization: Argo CD is designed for managing large-scale Kubernetes deployments and can handle complex application dependencies, custom rollout strategies, and multiple environments. It also provides a web UI, CLI, and API for managing and monitoring deployments.
- Integration with other tools: Argo CD integrates with other Kubernetes tools like Prometheus for monitoring, Istio for traffic management, and Vault for secrets management. It also supports integrations with ML platforms and tools like Kubeflow, TensorFlow, PyTorch, and MLflow.
- Open-source and community-driven: Argo CD is an open-source project with a vibrant community of contributors and users. It has an active development roadmap and provides support for community-driven plugins and extensions.
Limitations:
- Complexity: Argo CD can be complex to set up and configure, especially for users who are not familiar with Kubernetes and GitOps concepts. It requires a certain level of expertise in Kubernetes and DevOps practices.
- Learning curve: Argo CD has a learning curve, and users need to understand the underlying concepts and configuration options to use it effectively. The documentation and community resources can help users get up to speed, but it can take some time to become proficient.
- Resource usage: Argo CD requires additional resources to run and manage the deployment process. Its components run inside a Kubernetes cluster and need enough CPU, memory, and storage to handle the workload.
- Limited support for non-Kubernetes environments: Argo CD is designed specifically for managing Kubernetes deployments and doesn’t provide out-of-the-box support for non-Kubernetes environments. This can limit its usefulness in organizations that use other container orchestration platforms or deployment environments.
Conclusion
Deployment is a critical concern in ML applications because it affects the real-world impact, scalability, security, compliance, and maintenance of the application. A well-designed deployment strategy can ensure that the ML application performs well, is secure and compliant, and can be maintained and updated easily over time.
Argo CD can also be integrated into a Jenkins pipeline for continuous deployment of ML applications.
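For example, a Jenkins deployment stage could simply drive the argocd CLI after the image has been built and the manifests updated in Git; the server address, credentials, and application name below are placeholders:
argocd login argocd.example.com --username admin --password "$ARGOCD_PASSWORD" --insecure
argocd app sync recommendation-service
argocd app wait recommendation-service --health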