Problem Statement
Implement a GitOps workflow using ArgoCD or Flux, including repository structure, sync policies, progressive delivery, and disaster recovery.
Explanation
GitOps uses Git as the single source of truth for declarative infrastructure and applications. An operator running in the cluster continuously reconciles the desired state (Git) with the actual state (cluster). Changes are made through pull requests, synced to the cluster automatically, and rolled back with git revert.
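The reconciliation idea can be illustrated with a toy sketch. The function name `plan_sync` and the resource names are hypothetical, and real operators such as ArgoCD or Flux also compare the content of each resource, not only whether it exists. The sketch only shows how a desired set and an actual set turn into apply and prune actions.

```bash
# plan_sync DESIRED ACTUAL
# Each argument is a space-separated list of resource names.
# Prints one action per line: "apply X" for resources missing from the
# cluster, "prune X" for resources in the cluster that Git no longer declares.
plan_sync() (
  desired=$1
  actual=$2
  for r in $desired; do
    case " $actual " in
      *" $r "*) ;;                 # already present, nothing to do
      *) echo "apply $r" ;;
    esac
  done
  for r in $actual; do
    case " $desired " in
      *" $r "*) ;;                 # still declared in Git, keep it
      *) echo "prune $r" ;;
    esac
  done
)

plan_sync "deployment/myapp service/myapp" "deployment/myapp configmap/old"
# prints:
# apply service/myapp
# prune configmap/old
```

Running the same comparison on a schedule, and acting on its output, is in essence what "continuous reconciliation" means.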
Repository structure:
```
gitops-repo/
├── apps/
│   └── myapp/
│       ├── base/
│       │   ├── deployment.yaml
│       │   ├── service.yaml
│       │   └── kustomization.yaml
│       └── overlays/
│           ├── dev/
│           │   ├── kustomization.yaml
│           │   └── patch.yaml
│           ├── staging/
│           │   └── kustomization.yaml
│           └── production/
│               └── kustomization.yaml
├── infrastructure/
│   ├── ingress-nginx/
│   ├── cert-manager/
│   └── monitoring/
└── clusters/
    ├── dev/
    ├── staging/
    └── production/
```
ArgoCD Application definition:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/gitops-repo
    targetRevision: main
    path: apps/myapp/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true       # Delete resources not in Git
      selfHeal: true    # Sync if cluster state drifts
      allowEmpty: false # Refuse to sync if the source renders no resources
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
```
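To get a feel for the retry settings above, the waits can be tabulated. This is a minimal sketch that assumes each wait is `duration * factor^n`, capped at `maxDuration`; the function name `backoff_schedule` is hypothetical and the exact timing inside ArgoCD may differ.

```bash
# backoff_schedule DURATION_S FACTOR MAX_S LIMIT
# Prints the assumed wait in seconds before each retry, space-separated.
backoff_schedule() (
  wait=$1
  factor=$2
  max=$3
  limit=$4
  out=""
  i=0
  while [ "$i" -lt "$limit" ]; do
    if [ "$wait" -gt "$max" ]; then
      wait=$max                    # never wait longer than maxDuration
    fi
    out="$out${out:+ }$wait"
    wait=$((wait * factor))
    i=$((i + 1))
  done
  echo "$out"
)

# duration: 5s, factor: 2, maxDuration: 3m (180s), limit: 5
backoff_schedule 5 2 180 5
# prints: 5 10 20 40 80
```

Under that assumption the five retries span roughly two and a half minutes, so the 3m cap only starts to matter with a higher limit or a longer base duration.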
Kustomize overlays for environments:
```yaml
# apps/myapp/base/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 1
  selector:                 # required; must match the pod template labels
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:latest   # tag is overridden per overlay
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
---
# apps/myapp/base/kustomization.yaml
resources:
  - deployment.yaml
  - service.yaml
---
# apps/myapp/overlays/production/kustomization.yaml
resources:                  # "bases" is deprecated; list the base here
  - ../../base
namespace: production
images:
  - name: myapp
    newTag: v1.2.3
replicas:
  - name: myapp
    count: 5
patches:                    # replaces the deprecated patchesStrategicMerge
  - patch: |-
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: myapp
      spec:
        template:
          spec:
            containers:
              - name: myapp
                resources:
                  requests:
                    memory: "512Mi"
                    cpu: "500m"
                  limits:
                    memory: "1Gi"
                    cpu: "1000m"
```
Flux GitRepository and Kustomization:
```yaml
# infrastructure/sources/gitops-repo.yaml
apiVersion: source.toolkit.fluxcd.io/v1   # v1 is the stable API since Flux 2.0
kind: GitRepository
metadata:
  name: gitops-repo
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/org/gitops-repo
  ref:
    branch: main
  secretRef:
    name: git-credentials
---
# clusters/production/myapp.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 10m
  path: ./apps/myapp/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: gitops-repo
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: myapp
      namespace: production
  timeout: 5m
```
Progressive delivery with Flagger:
```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: myapp
  namespace: production
spec:
  provider: istio
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  progressDeadlineSeconds: 600
  service:
    port: 80
  analysis:
    interval: 1m      # how often metrics are checked
    threshold: 5      # failed checks before rollback
    maxWeight: 50     # maximum canary traffic percentage
    stepWeight: 10    # traffic increment per successful check
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99     # percent
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500    # milliseconds
        interval: 1m
```
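The analysis settings above imply a traffic schedule that is easy to work out by hand. The sketch below does the arithmetic, assuming the canary weight rises by `stepWeight` per successful interval until it reaches `maxWeight`; the helper name `canary_weights` is hypothetical.

```bash
# canary_weights STEP MAX
# Prints the successive canary traffic percentages, space-separated.
canary_weights() (
  step=$1
  max=$2
  if [ "$step" -le 0 ]; then
    echo "step must be positive" >&2
    exit 1
  fi
  w=$step
  out=""
  while [ "$w" -le "$max" ]; do
    out="$out${out:+ }$w"
    w=$((w + step))
  done
  echo "$out"
)

# Values copied from the Canary spec: interval 1m, stepWeight 10,
# maxWeight 50, threshold 5.
interval_min=1
step_weight=10
max_weight=50
threshold=5

echo "weights: $(canary_weights "$step_weight" "$max_weight")"
echo "minimum minutes to promote: $((interval_min * max_weight / step_weight))"
echo "minutes of failed checks before rollback: $((interval_min * threshold))"
# prints:
# weights: 10 20 30 40 50
# minimum minutes to promote: 5
# minutes of failed checks before rollback: 5
```

With these numbers a healthy release needs at least five minutes of analysis before promotion, and a failing one is rolled back after about five minutes of failed checks.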
GitOps workflow:
1. Developer commits changes:
```bash
# Update image tag
cd apps/myapp/overlays/production
kustomize edit set image myapp=myapp:v1.2.4
git add .
git commit -m "Update myapp to v1.2.4"
git push origin main
```
2. ArgoCD/Flux detects change:
```bash
# ArgoCD syncs automatically or manually
argocd app sync myapp-production
# Flux reconciles
flux reconcile kustomization myapp
```
3. Monitor deployment:
```bash
# Wait until the ArgoCD application is synced and healthy
argocd app wait myapp-production --sync --health
# Watch Flux
flux get kustomizations --watch
# Check pods
kubectl get pods -n production -w
```
4. Rollback if needed:
```bash
# Git revert
git revert HEAD
git push origin main
# ArgoCD/Flux automatically syncs to previous state
```
Disaster recovery:
```bash
# Backup ArgoCD applications
kubectl get applications -n argocd -o yaml > argocd-backup.yaml
# Backup Flux resources
flux export source git gitops-repo > flux-sources.yaml
flux export kustomization myapp > flux-kustomizations.yaml
# Restore cluster from Git
# 1. Install ArgoCD into its own namespace (for Flux, re-run flux bootstrap against the same repository)
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# 2. Apply backed up applications
kubectl apply -f argocd-backup.yaml
# 3. Applications sync from Git automatically
# Cluster is restored to the last state committed to Git
```
PR-based workflow:
```yaml
# .github/workflows/validate-manifests.yml
name: Validate Manifests
on:
  pull_request:
    paths:
      - 'apps/**'
      - 'infrastructure/**'
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Kustomize build
        run: |
          kustomize build apps/myapp/overlays/production > rendered.yaml
      - name: Validate Kubernetes manifests
        # Validate the rendered output rather than the raw files:
        # kustomization.yaml and patch files are not applyable resources.
        # Client-side dry-run still needs a reachable API server for
        # resource discovery.
        run: |
          kubectl apply --dry-run=client -f rendered.yaml
      - name: Run conftest policies
        uses: instrumenta/conftest-action@master
        with:
          files: apps/
          policy: policy/
```
Multi-cluster management:
```yaml
# ArgoCD ApplicationSet for multiple clusters
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: myapp
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - cluster: dev
            url: https://dev.k8s.example.com
          - cluster: staging
            url: https://staging.k8s.example.com
          - cluster: production
            url: https://prod.k8s.example.com
  template:
    metadata:
      name: 'myapp-{{cluster}}'
    spec:
      project: default
      source:
        repoURL: https://github.com/org/gitops-repo
        targetRevision: main
        path: 'apps/myapp/overlays/{{cluster}}'
      destination:
        server: '{{url}}'
        namespace: default
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```
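To see what the list generator produces, the template substitution can be mimicked in a few lines of shell. This is only an illustration of how `{{cluster}}` and `{{url}}` expand into one Application per element; the function name `render_apps` is hypothetical and nothing here talks to ArgoCD.

```bash
# render_apps: reads "cluster url" pairs on stdin and prints the name,
# path, and destination server each generated Application would get.
render_apps() {
  while read -r cluster url; do
    printf 'name=myapp-%s path=apps/myapp/overlays/%s server=%s\n' \
      "$cluster" "$cluster" "$url"
  done
}

render_apps <<'EOF'
dev https://dev.k8s.example.com
staging https://staging.k8s.example.com
production https://prod.k8s.example.com
EOF
# prints:
# name=myapp-dev path=apps/myapp/overlays/dev server=https://dev.k8s.example.com
# name=myapp-staging path=apps/myapp/overlays/staging server=https://staging.k8s.example.com
# name=myapp-production path=apps/myapp/overlays/production server=https://prod.k8s.example.com
```

Because each element maps to an overlay directory of the same name, adding a cluster means adding one list element and one overlay.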
Best practices:
- Keep a single source of truth: all configuration lives in Git.
- Separate application and configuration repositories (or use branching).
- Use declarative configuration; avoid imperative commands against the cluster.
- Require PR reviews for changes.
- Run automated validation and testing on every change.
- Implement RBAC for ArgoCD/Flux.
- Use Sealed Secrets for sensitive data; never commit plaintext secrets.
- Implement drift detection and alerts.
- Document GitOps workflows.
- Regularly sync and prune unused resources.
- Implement and rehearse disaster recovery procedures.

Understanding GitOps enables reliable, auditable, declarative deployments.