Argo Workflows Architecture

Yashod Perera
4 min read · Aug 13, 2024

A workflow is any set of tasks that run sequentially, in parallel, or as a combination of both.

This article assumes that you have a basic understanding of what a workflow is and of its primary constructs.

What is Argo Workflows?

Argo Workflows is a Kubernetes-native workflow engine that supports step-based and Directed Acyclic Graph (DAG) workflows. It can be used for general-purpose tasks such as:

  • Machine Learning
  • Data Processing
  • CI/CD
  • Infrastructure automation
  • Anything else that can be expressed as a set of tasks

Step-Based Workflow vs DAG Workflow

Step-based workflows run as a sequence of steps, although each step in the sequence can contain several tasks that run in parallel. In a DAG workflow you declare the dependencies between tasks, and the execution order follows from those dependencies.
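To make the difference concrete, here is a minimal sketch of a DAG workflow. The task names (fetch, lint, test, package), the echo template, and the alpine image tag are assumptions chosen for illustration, not something taken from a real project.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-sketch-          # hypothetical name prefix
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: fetch              # no dependencies, so it runs first
            template: echo
            arguments:
              parameters: [{name: message, value: fetch}]
          - name: lint               # waits for fetch
            dependencies: [fetch]
            template: echo
            arguments:
              parameters: [{name: message, value: lint}]
          - name: test               # also waits for fetch, so it can run alongside lint
            dependencies: [fetch]
            template: echo
            arguments:
              parameters: [{name: message, value: test}]
          - name: package            # waits for both lint and test
            dependencies: [lint, test]
            template: echo
            arguments:
              parameters: [{name: message, value: package}]

    - name: echo                     # illustrative template that prints its input
      inputs:
        parameters:
          - name: message
      container:
        image: alpine:3.19
        command: [echo]
        args: ["{{inputs.parameters.message}}"]
```

Here the order is never written down as a sequence. It is derived from the dependencies field of each task.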

Components of Argo Workflows

Argo Workflows consists of 8 CRDs and 2 Deployments (the Workflow Controller and the Argo Server), which are described below.

CRDs

  • Workflow: defines the structure, configuration, and behavior of a workflow.
  • CronWorkflow: runs workflows on a recurring schedule, which automates their execution at regular intervals.
  • WorkflowTemplate: a reusable and shareable template that defines a workflow structure.
  • ClusterWorkflowTemplate: similar to WorkflowTemplate, but scoped to the whole cluster instead of a single namespace.
  • WorkflowTaskSet: exchanges data between the workflow controller and the Argo executor agent, which is responsible for running steps.
  • WorkflowTaskResult: an internal resource that represents the outcome or result of a task within a workflow.
  • WorkflowEventBinding: lets you trigger workflows based on external events.
  • WorkflowArtifactGCTask: manages the garbage collection (GC) of artifacts produced by workflows.
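As a rough illustration of how two of these CRDs fit together, the sketch below defines a WorkflowTemplate and a CronWorkflow that runs it every hour. The names say-hello and say-hello-hourly, the schedule, and the image are assumptions made up for this example. Newer Argo releases also accept a schedules list, so check the field name against the version you run.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: say-hello                    # hypothetical template name
spec:
  entrypoint: main
  templates:
    - name: main
      container:
        image: alpine:3.19
        command: [echo, "hello"]
---
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: say-hello-hourly             # hypothetical name
spec:
  schedule: "0 * * * *"              # standard cron syntax: at minute 0 of every hour
  concurrencyPolicy: Forbid          # skip a run if the previous one is still going
  workflowSpec:
    workflowTemplateRef:
      name: say-hello                # reuse the template defined above
```

The CronWorkflow does not repeat the template body. It only points at the WorkflowTemplate, which is what makes the template reusable.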

Workflow Controller

  • The workflow controller does the heavy lifting: it watches the relevant custom resources and interacts with the Kubernetes API to create the underlying resources.
  • The workflow controller is built using the operator pattern.
  • It can run in HA mode or in a namespaced deployment mode.

How Argo Workflows works
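For the namespaced deployment mode mentioned above, the controller is started with the --namespaced argument so that it only watches its own namespace. The fragment below is a sketch of the relevant part of the controller Deployment, not a complete manifest. The surrounding fields are assumed to match a standard install, so compare it with the manifests of your release before using it.

```yaml
# Fragment of the workflow-controller Deployment (incomplete on purpose)
spec:
  template:
    spec:
      containers:
        - name: workflow-controller
          args:
            - --namespaced           # watch only the namespace the controller runs in
```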

Argo Server

  • Argo Server is the interface to the outside world. It provides the backend for the Argo CLI and the UI, and it is not a mission-critical component.
  • Argo Server provides features such as workflow archiving, log access, authentication, and authorization.

Workflow Creation Flow

Please note that the following focuses on how a simple workflow is created and executed. The example workflow simply checks out code and builds an image.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: checkout-and-build-
spec:
  entrypoint: build-workflow
  # One PVC is created per workflow run and shared by all steps
  volumeClaimTemplates:
    - metadata:
        name: workspace-pvc
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
  templates:
    # Entry point: two steps that run one after the other
    - name: build-workflow
      steps:
        - - name: checkout-code
            template: checkout-code
        - - name: build-image
            template: build-image

    - name: checkout-code
      script:
        image: alpine/git
        command: [sh]
        source: |
          set -e
          # Clone into a subdirectory so the path matches the build step
          git clone https://github.com/your-repo/your-project.git /workspace/your-project
        volumeMounts:
          - name: workspace-pvc
            mountPath: /workspace

    - name: build-image
      script:
        image: docker:20.10.7
        command: [sh]
        # docker build needs a reachable Docker daemon; this example assumes one is available
        source: |
          set -e
          cd /workspace/your-project
          docker build -t your-image:latest .
        volumeMounts:
          - name: workspace-pvc
            mountPath: /workspace

Execution starts at the template named in the entrypoint field of the YAML. The workflow controller handles the execution: it first looks up that template in the templates section and then executes it.

In the above example, build-workflow is the entry point, and it has two steps, as follows.

- name: build-workflow
  steps:
    - - name: checkout-code
        template: checkout-code
    - - name: build-image
        template: build-image
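The double dash in the steps list is worth a closer look: the outer list is executed in sequence, and the entries of each inner list run in parallel. As a sketch, the variant below adds a second entry to the second inner list. The run-tests step and its template are hypothetical and are not part of the example workflow.

```yaml
- name: build-workflow
  steps:
    - - name: checkout-code          # first outer item: runs alone
        template: checkout-code
    - - name: build-image            # second outer item: starts after checkout-code finishes
        template: build-image
      - name: run-tests              # same inner list, so it runs in parallel with build-image
        template: run-tests          # hypothetical template, would need its own definition
```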

Each step refers to a template that is defined in the templates section. For the above workflow, Argo creates the following resources.

  1. A PVC for the volumeClaimTemplates entry
  2. Two Pods, one for each step

Each pod has three containers:

  • Init container: fetches artifacts and parameters and makes them available to the main container
  • Main container: runs the image that the user specified and executes the relevant commands
  • Wait container: performs the clean-up tasks, including saving parameters and artifacts
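To picture this layout, here is a heavily simplified, hand-written sketch of what the pod for the checkout-code step might look like. It is not actual controller output: the pod name and the argoexec image tag are assumptions, and the real pod carries many more fields (commands, environment variables, extra volumes, labels).

```yaml
# Simplified sketch of a step pod; most generated fields are omitted
apiVersion: v1
kind: Pod
metadata:
  name: checkout-and-build-abcde-checkout-code   # hypothetical generated name
spec:
  initContainers:
    - name: init                                 # prepares artifacts and parameters
      image: quay.io/argoproj/argoexec:latest    # tag is an assumption
  containers:
    - name: wait                                 # saves parameters and artifacts, cleans up
      image: quay.io/argoproj/argoexec:latest    # tag is an assumption
    - name: main                                 # the image from the template
      image: alpine/git
      volumeMounts:
        - name: workspace-pvc
          mountPath: /workspace
```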

The workflow is executed as follows.

  1. Create the PVC to store temporary data
  2. Create the checkout-code pod and check out the code to the PVC
  3. Create the build-image pod, which reads the code checked out in the previous step from the PVC and builds the image

I hope this gave you a basic understanding of the Argo Workflows architecture and of how a workflow is managed.
