Argo Workflows Architecture
A workflow is any set of tasks that run sequentially, in parallel, or in a combination of both.
I assume you already have a basic understanding of what a workflow is and of its primary constructs.
What is Argo Workflows?
Argo Workflows is a Kubernetes-native workflow engine that supports step-based and Directed Acyclic Graph (DAG) workflows. It can be used for general-purpose tasks such as:
- Machine Learning
- Data Processing
- CI/CD
- Infrastructure automation
- Anything else that can be expressed as a set of tasks
Step-based workflows run as a sequence of steps, with some flexibility to run tasks in parallel within a step. In a DAG, you define the execution order yourself by declaring dependencies between tasks.
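To make the difference concrete, here is a minimal sketch of a DAG workflow in which B and C both wait for A, and D waits for both of them. The template name echo, the task names, and the alpine image tag are placeholders chosen for illustration.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-diamond-
spec:
  entrypoint: diamond
  templates:
    - name: diamond
      dag:
        tasks:
          - name: A
            template: echo
            arguments:
              parameters: [{name: message, value: A}]
          - name: B
            dependencies: [A]          # B starts only after A succeeds
            template: echo
            arguments:
              parameters: [{name: message, value: B}]
          - name: C
            dependencies: [A]          # C runs in parallel with B
            template: echo
            arguments:
              parameters: [{name: message, value: C}]
          - name: D
            dependencies: [B, C]       # D waits for both B and C
            template: echo
            arguments:
              parameters: [{name: message, value: D}]
    - name: echo
      inputs:
        parameters:
          - name: message
      container:
        image: alpine:3.19             # placeholder image
        command: [echo, "{{inputs.parameters.message}}"]
```

In a step-based workflow the same shape would be written as three sequential steps, with B and C placed together in the middle step.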
Components of Argo Workflows
Argo Workflows consists of 8 CRDs and 2 Deployments, as follows.
CRDs
- Workflow: defines the structure, configuration, and behavior of a workflow.
- CronWorkflow: runs workflows on a recurring schedule, automating their execution at regular intervals.
- WorkflowTemplate: a reusable, shareable template that defines a workflow structure.
- ClusterWorkflowTemplate: similar to WorkflowTemplate, but scoped to the whole cluster.
- WorkflowTaskSet: exchanges data between the workflow controller and the Argo executor agent, which is responsible for running steps.
- WorkflowTaskResult: an internal resource used to represent the outcome or result of a task within a workflow.
- WorkflowEventBinding: lets you trigger workflows based on external events.
- WorkflowArtifactGCTask: manages the garbage collection (GC) of artifacts produced by workflows.
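As a rough sketch of how two of these resources fit together, the manifest below defines a WorkflowTemplate and a CronWorkflow that runs it every hour. The names hello-template and hello-hourly, the image, and the schedule are made up for illustration. Note that some newer releases prefer a schedules list over the single schedule field, so check the version you run.

```yaml
# Reusable definition (all names here are hypothetical)
apiVersion: argoproj.io/v1alpha1
kind: WorkflowTemplate
metadata:
  name: hello-template
spec:
  entrypoint: hello
  templates:
    - name: hello
      container:
        image: alpine:3.19
        command: [echo, "hello from a WorkflowTemplate"]
---
# Scheduled execution of the template above
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: hello-hourly
spec:
  schedule: "0 * * * *"        # every hour, on the hour
  concurrencyPolicy: Replace   # a new run replaces one that is still in progress
  workflowSpec:
    workflowTemplateRef:
      name: hello-template     # points at the WorkflowTemplate defined above
```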
Workflow Controller
- The workflow controller does the heavy lifting: it watches the relevant custom resources and interacts with the Kubernetes API to create the underlying resources.
- The workflow controller is built using the operator pattern.
- It can run in HA mode or in a namespaced deployment mode.
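The following is only a sketch of what the two modes might look like on the controller Deployment, not a complete manifest. The namespace, service account, labels, and image tag are assumptions. In practice the official install manifests or the Helm chart listed under Resources set these for you.

```yaml
# Sketch only: fields unrelated to HA or namespaced mode are kept to a minimum.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: workflow-controller
  namespace: argo                      # assumed install namespace
spec:
  replicas: 2                          # HA: extra replicas stand by, leader election picks the active one
  selector:
    matchLabels:
      app: workflow-controller
  template:
    metadata:
      labels:
        app: workflow-controller
    spec:
      serviceAccountName: argo         # assumed service account name
      containers:
        - name: workflow-controller
          image: quay.io/argoproj/workflow-controller:v3.5.5   # example tag, pin your own
          command: [workflow-controller]
          args:
            - --namespaced             # manage only workflows in the controller's own namespace
          env:
            - name: LEADER_ELECTION_IDENTITY   # identity used for leader election
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
```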
Argo Server
- Argo Server exposes Argo Workflows to the outside world and acts as the backend for the Argo CLI and UI. It is not a mission-critical component.
- Argo Server provides features such as workflow archiving, log access, authentication, and authorization.
Workflow Creation Flow
Please note that the following focuses on how a simple workflow is created and executed. The example simply checks out code and builds an image.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: checkout-and-build-
spec:
  entrypoint: build-workflow
  # One PVC is created per workflow run and shared by all steps
  volumeClaimTemplates:
    - metadata:
        name: workspace-pvc
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
  templates:
    # Entry point: two steps that run one after the other
    - name: build-workflow
      steps:
        - - name: checkout-code
            template: checkout-code
        - - name: build-image
            template: build-image
    - name: checkout-code
      script:
        image: alpine/git
        command: [sh]
        source: |
          set -e
          # Clone into a subdirectory so the build step can cd into it
          git clone https://github.com/your-repo/your-project.git /workspace/your-project
        volumeMounts:
          - name: workspace-pvc
            mountPath: /workspace
    - name: build-image
      script:
        image: docker:20.10.7
        command: [sh]
        source: |
          set -e
          cd /workspace/your-project
          # Note: docker build needs a reachable Docker daemon (for example a DinD sidecar)
          docker build -t your-image:latest .
        volumeMounts:
          - name: workspace-pvc
            mountPath: /workspace
Execution starts at the entry point, which is defined in the entrypoint field of the YAML. The workflow controller handles the execution: it first looks up the named template in the templates section and runs it.
In the example above, build-workflow is the entry point, and it has two steps, as follows.
name: build-workflow
steps:
  - - name: checkout-code
      template: checkout-code
  - - name: build-image
      template: build-image
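As a hedged illustration of how the double-dash syntax controls ordering: each outer list item is a step group that runs after the previous group finishes, and entries inside the same group run in parallel. The run-lint step and its template are hypothetical and not part of the example workflow.

```yaml
steps:
  - - name: checkout-code
      template: checkout-code
  - - name: build-image        # starts after checkout-code finishes
      template: build-image
    - name: run-lint           # hypothetical: same group as build-image, so both run in parallel
      template: run-lint
```

Keep in mind that with a ReadWriteOnce volume, parallel pods that share the PVC may need to land on the same node.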
Each step refers to a template defined in the templates section. For the workflow above, Argo creates the following resources.
- A PVC for the volumeClaimTemplates entry
- 2 Pods, one for each step
Each pod has three containers:
- Init container: fetches artifacts and parameters and makes them available to the main container
- Main container: runs the image that the user specified and executes the relevant commands
- Wait container: performs the tasks needed for cleanup, including saving off parameters and artifacts
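To visualise the three containers listed above, here is a heavily simplified sketch of the pod for the checkout-code step. It is not the actual manifest the controller generates and is not meant to be applied. The pod name and the argoexec image tag are assumptions, and the real pod carries many more fields (volumes, environment variables, commands).

```yaml
# Illustration only: the real generated pod has many more fields.
apiVersion: v1
kind: Pod
metadata:
  name: checkout-and-build-abc12-checkout-code   # hypothetical generated name
spec:
  initContainers:
    - name: init                                 # loads input artifacts and parameters
      image: quay.io/argoproj/argoexec:v3.5.5    # example tag
  containers:
    - name: wait                                 # saves output parameters and artifacts
      image: quay.io/argoproj/argoexec:v3.5.5    # example tag
    - name: main                                 # runs the image from the template
      image: alpine/git
```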
It will execute the workflow as follows.
- Create the PVC to store temporary data
- Create the checkout-code pod and check out the code to the PVC
- Create the build-image pod, which takes the code checked out in the previous step from the PVC and builds the image
I hope this gave you a basic understanding of the Argo Workflows architecture and how a workflow is managed.
Resources
- Documentation — https://argo-workflows.readthedocs.io/en/latest/
- Helm Chart — https://github.com/argoproj/argo-helm