Problem Statement
Explain Blue-Green deployment implementation including infrastructure setup, traffic switching, database considerations, rollback procedures, and CI/CD integration.
Explanation
Blue-Green deployment maintains two identical production environments, enabling instant traffic switching. Infrastructure setup requires:
- two complete environments (compute, networking, databases)
- a load balancer or DNS layer for traffic routing
- monitoring for both environments
- automation for deployment and switching
Infrastructure patterns:
1. Load balancer switching (AWS ALB):
```hcl
# Terraform example: two target groups, with the listener forwarding
# to whichever environment is currently active.
resource "aws_lb_target_group" "blue" {
  name     = "app-blue"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 5
  }
}

resource "aws_lb_target_group" "green" {
  name     = "app-green"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 5
  }
}

resource "aws_lb_listener" "main" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = var.active_environment == "blue" ? aws_lb_target_group.blue.arn : aws_lb_target_group.green.arn
  }
}
```
2. Kubernetes Service switching:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
    version: blue  # Switch to 'green' to cut traffic over
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: blue
  template:
    metadata:
      labels:
        app: myapp
        version: blue
    spec:
      containers:
        - name: myapp
          image: myapp:v1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
        - name: myapp
          image: myapp:v2
```
Deployment process:
1. Deploy to inactive environment (Green)
2. Run smoke tests on Green
3. Run full test suite on Green
4. Switch small percentage for canary testing (optional)
5. Monitor Green environment metrics
6. Switch all traffic to Green
7. Monitor for issues
8. Keep Blue running as backup
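The smoke-test gate in step 2 can be sketched in Python. This is a minimal sketch, not a definitive implementation: the `/health` path and the idea that a 200 response means "healthy" are assumptions about your service.

```python
# Hedged sketch of the smoke-test gate: only proceed with the switch if
# every checked endpoint on the Green environment returns HTTP 200.
import urllib.request


def smoke_test(base_url: str, paths: tuple = ("/health",)) -> bool:
    """Return True only if every endpoint responds with HTTP 200."""
    for path in paths:
        try:
            with urllib.request.urlopen(base_url + path, timeout=5) as resp:
                if resp.status != 200:
                    return False
        except OSError:  # connection refused, timeout, DNS failure, ...
            return False
    return True
```

In a pipeline you would call `smoke_test("https://green.internal.example.com")` and abort the switch on `False`.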
Database considerations (biggest challenge):
Backward compatible migrations:
```sql
-- Phase 1: Add new column (nullable)
ALTER TABLE users ADD COLUMN email_verified BOOLEAN DEFAULT NULL;
-- Deploy Green with code using new column
-- Both Blue and Green work
-- Phase 2: Backfill data
UPDATE users SET email_verified = FALSE WHERE email_verified IS NULL;
-- Phase 3: Make non-nullable (after Blue retired)
ALTER TABLE users ALTER COLUMN email_verified SET NOT NULL;
```
Database patterns:
1. Shared database (both Blue and Green use same DB):
- Simple but requires backward compatible migrations
- Blue and Green must handle old and new schema
2. Database per environment:
- Complete isolation
- Complex data synchronization
- Expensive (duplicate data)
3. Read replicas:
- Blue uses the primary; Green uses a read replica for testing
- Promote the replica during the switch (the replica is read-only until promoted)
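With the shared-database pattern, application code must tolerate both schemas during phase 1 of the migration above, when `email_verified` is still nullable. A minimal Python sketch, where the dict stands in for a fetched database row:

```python
# Hedged sketch: Green's code treats a missing or NULL email_verified
# column as "not verified", so it works against rows Blue wrote before
# the backfill ran.

def is_email_verified(row: dict) -> bool:
    value = row.get("email_verified")
    return bool(value) if value is not None else False
```

Blue, which predates the column, simply never reads it; once Blue is retired and the column is made `NOT NULL`, the fallback can be dropped.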
Traffic switching automation:
```bash
#!/bin/bash
# blue-green-switch.sh -- deploy to the new environment, verify, then switch
set -u

CURRENT_ENV=$1
NEW_ENV=$2
echo "Current: $CURRENT_ENV, switching to: $NEW_ENV"

# Deploy to the new environment
kubectl apply -f "k8s/$NEW_ENV/"

# Wait for pods to become ready
kubectl wait --for=condition=ready pod -l "version=$NEW_ENV" --timeout=300s

# Run smoke tests
if ! ./smoke-tests.sh "https://$NEW_ENV.internal.example.com"; then
  echo "Smoke tests failed, aborting"
  exit 1
fi

# Switch traffic by repointing the Service selector
kubectl patch service myapp -p "{\"spec\":{\"selector\":{\"version\":\"$NEW_ENV\"}}}"
echo "Switched to $NEW_ENV"

# Monitor for 5 minutes
sleep 300

# Check error rate (expects an integer percentage from the metrics endpoint)
ERROR_RATE=$(curl -s "https://metrics.example.com/error-rate?env=$NEW_ENV")
if [ "$ERROR_RATE" -gt 5 ]; then
  echo "High error rate: $ERROR_RATE%, rolling back"
  kubectl patch service myapp -p "{\"spec\":{\"selector\":{\"version\":\"$CURRENT_ENV\"}}}"
  exit 1
fi
echo "Deployment successful"
```
CI/CD integration (GitLab CI example):
```yaml
stages:
  - build
  - deploy-green
  - test-green
  - switch-traffic
  - cleanup

build:
  stage: build
  script:
    - docker build -t myapp:$CI_COMMIT_SHA .
    - docker push myapp:$CI_COMMIT_SHA

deploy-green:
  stage: deploy-green
  script:
    - export NEW_ENV="green"
    - export OLD_ENV="blue"
    - ./deploy.sh $NEW_ENV $CI_COMMIT_SHA
  environment:
    name: green
    url: https://green.internal.example.com

test-green:
  stage: test-green
  script:
    - ./smoke-tests.sh https://green.internal.example.com
    - ./integration-tests.sh https://green.internal.example.com

switch-traffic:
  stage: switch-traffic
  script:
    - ./blue-green-switch.sh blue green
  when: manual
  environment:
    name: production
  only:
    - main

cleanup-blue:
  stage: cleanup
  script:
    - kubectl scale deployment myapp-blue --replicas=1
  when: manual
```
Rollback procedure:
```bash
#!/bin/bash
# Quick rollback
kubectl patch service myapp -p '{"spec":{"selector":{"version":"blue"}}}'
echo "Rolled back to blue"
```
Monitoring during switch:
```python
import time

# get_metric(), is_anomaly(), and rollback() are placeholders for your
# metrics client, anomaly detection, and switch-back automation.

def monitor_switch(environment, duration=300):
    """Poll key metrics after the switch; roll back on any anomaly."""
    metrics = ['error_rate', 'latency_p95', 'throughput']
    start = time.time()
    while time.time() - start < duration:
        for metric in metrics:
            value = get_metric(environment, metric)
            if is_anomaly(metric, value):
                print(f"Anomaly detected: {metric}={value}")
                rollback()
                return False
        time.sleep(10)
    return True
```
Best practices:
- Test rollback procedures regularly
- Automate everything, including the switch itself
- Implement health checks in both environments
- Monitor business metrics, not just technical ones
- Use a gradual switch (weighted routing) for additional safety
- Keep database migrations backward compatible, and document the migration strategy
- Implement automated rollback on anomalies
- Keep Blue running for a configured time after the switch
- Use feature flags as an additional safety layer

Understanding Blue-Green deployment enables zero-downtime releases with instant rollback capability.
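The gradual-switch practice (weighted routing) can be sketched as a stepped ramp with a metric gate at each step. In this sketch, `set_weight` and `error_rate` are hypothetical stand-ins for your load balancer's weight API and your metrics store; the step sizes and the 5% threshold are assumptions to tune for your service.

```python
# Hedged sketch of a gradual (weighted) Blue-to-Green switch: shift
# traffic in steps and roll back instantly if the error rate spikes.

STEPS = [10, 25, 50, 100]   # percent of traffic sent to Green at each step
ERROR_THRESHOLD = 5.0       # roll back if error rate exceeds this (%)


def gradual_switch(set_weight, error_rate) -> bool:
    """Return True if Green reached 100%, False if we rolled back."""
    for pct in STEPS:
        set_weight(pct)                  # e.g. update ALB target group weights
        if error_rate() > ERROR_THRESHOLD:
            set_weight(0)                # instant rollback: all traffic to Blue
            return False
    return True
```

Because each step is reversible, a bad release is caught while it still serves only a small slice of traffic, which is the main safety gain over an all-at-once switch.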