GitOps in Practice: Building a Declarative Deployment System with ArgoCD and Flux


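As of 2026, GitOps has become the standard deployment methodology for Kubernetes environments. Defined as "an operating paradigm that treats Git as the single source of truth," GitOps addresses the problems of traditional push-based deployment and provides a safer, more traceable deployment environment.

According to the CNCF 2025 survey, 73% of companies using Kubernetes have adopted GitOps, with ArgoCD and Flux leading the field at 42% and 35% share respectively. This article looks at how they became central to modern deployment infrastructure and how to implement them in practice.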

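Core Concepts and Principles of GitOps

Problems with Traditional Deployment

# Traditional push-based deployment (a problematic approach)
# The CI/CD pipeline deploys directly to the cluster

# In the CI pipeline
kubectl apply -f deployment.yaml
kubectl set image deployment/myapp myapp=myregistry/myapp:v1.2.3
kubectl rollout status deployment/myapp

# Problems:
# 1. Requires direct access to the cluster
# 2. Deployment permissions are spread too widely
# 3. The deployed state can diverge from Git
# 4. Hard to know the exact previous state when rolling back
# 5. Multi-cluster management becomes complex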

The Pull-Based Approach of GitOps

# GitOps approach: example Git repository structure
# repo: myapp-config
├── environments/
│   ├── dev/
│   │   ├── kustomization.yaml
│   │   └── values.yaml
│   ├── staging/
│   │   ├── kustomization.yaml
│   │   └── values.yaml
│   └── production/
│       ├── kustomization.yaml
│       └── values.yaml
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   └── kustomization.yaml
└── .argocd/
    └── application.yaml

# base/deployment.yaml - base application definition
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myregistry/myapp:latest
        ports:
        - containerPort: 8080
        env:
        - name: ENV
          value: "production"
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5

---
# base/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
  labels:
    app: myapp
spec:
  selector:
    app: myapp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP

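The Four Principles of GitOps

  1. Declarative: the desired state of the system is declared in Git
  2. Versioned and Immutable: every change is tracked in Git history
  3. Pulled Automatically: agents in the cluster pull changes from Git and apply them
  4. Continuously Reconciled: the actual state is continuously compared with and synced to the desired state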
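
The four principles can be sketched as a small reconciliation loop. This is an illustrative model only, not how ArgoCD or Flux are implemented: the `Cluster` class, the `reconcile` function, and the resource names are hypothetical, and the desired state is a plain dictionary standing in for manifests stored in Git.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for the live cluster state."""
    resources: dict = field(default_factory=dict)


def reconcile(desired: dict, cluster: Cluster, prune: bool = True) -> list:
    """Bring the cluster in line with the desired state; return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in cluster.resources:
            cluster.resources[name] = dict(spec)
            actions.append(f"create {name}")
        elif cluster.resources[name] != spec:
            cluster.resources[name] = dict(spec)  # correct the drift
            actions.append(f"update {name}")
    if prune:
        # Remove resources that are no longer declared.
        for name in sorted(set(cluster.resources) - set(desired)):
            del cluster.resources[name]
            actions.append(f"prune {name}")
    return actions


# Desired state as it would be declared and versioned in Git.
desired = {"deployment/myapp": {"replicas": 3, "image": "myregistry/myapp:v1.2.3"}}

# The live state has drifted: replicas were edited by hand and a stale resource remains.
cluster = Cluster({
    "deployment/myapp": {"replicas": 5, "image": "myregistry/myapp:v1.2.3"},
    "configmap/old": {"data": "stale"},
})

first_pass = reconcile(desired, cluster)   # an agent would run this on every interval
second_pass = reconcile(desired, cluster)  # once converged, nothing is left to do
print(first_pass)
print(second_pass)
```

The second pass returning no actions is the property that matters: reconciliation is idempotent, so running it on a timer is safe.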

ArgoCD: Kubernetes Native GitOps

Installing and Initially Configuring ArgoCD

# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Install the ArgoCD CLI (macOS)
brew install argocd

# Port-forward to reach the ArgoCD server
kubectl port-forward svc/argocd-server -n argocd 8080:443

# Retrieve the initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d

# Log in with the CLI
argocd login localhost:8080 --username admin --password <password> --insecure

Defining an ArgoCD Application

# .argocd/application.yaml - ArgoCD Application resource
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd
  labels:
    app.kubernetes.io/name: myapp
    app.kubernetes.io/env: production
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  # Project
  project: default

  # Source Git repository
  source:
    repoURL: https://github.com/company/myapp-config
    targetRevision: HEAD
    path: environments/production

    # When using Kustomize
    kustomize:
      images:
        - myregistry/myapp:v1.2.3

    # When using Helm (optional)
    # helm:
    #   valueFiles:
    #   - values.yaml
    #   parameters:
    #   - name: image.tag
    #     value: v1.2.3

  # Destination cluster
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp-production

  # Sync policy
  syncPolicy:
    automated:
      prune: true      # automatically remove deleted resources
      selfHeal: true   # automatically correct drift
      allowEmpty: false
    syncOptions:
    - CreateNamespace=true
    - PrunePropagationPolicy=foreground
    - PruneLast=true

    # Sync retry policy
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m

  # Fields to ignore when computing the diff
  ignoreDifferences:
  - group: apps
    kind: Deployment
    jsonPointers:
    - /spec/replicas  # ignore replica changes made by the HPA

  # Informational metadata shown in the UI
  info:
  - name: 'Environment'
    value: 'Production'
  - name: 'Team'
    value: 'Backend'
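
The retry settings above are easier to reason about with the arithmetic written out. The sketch below assumes each retry waits `duration * factor**attempt`, capped at `maxDuration`; the controller's exact timing may differ, and `backoff_schedule` is a hypothetical helper, not part of ArgoCD.

```python
def backoff_schedule(limit: int, duration_s: float, factor: float,
                     max_duration_s: float) -> list:
    """Return the assumed wait (in seconds) before each retry attempt."""
    delays = []
    for attempt in range(limit):
        # Exponential growth, capped at the configured maximum.
        delays.append(min(duration_s * factor ** attempt, max_duration_s))
    return delays


# Values from the Application above: limit 5, duration 5s, factor 2, maxDuration 3m.
schedule = backoff_schedule(limit=5, duration_s=5, factor=2, max_duration_s=180)
print(schedule)       # wait before each of the five retries
print(sum(schedule))  # total time spent waiting across all retries
```

Under this assumption the cap of 3 minutes is never reached with only five retries; it would start to matter from the seventh attempt onward.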

Advanced ArgoCD Configuration

# config/argocd-cm.yaml - ArgoCD ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
data:
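  # Git repository settings (legacy format; recent ArgoCD versions declare
  # repositories as Secrets labeled argocd.argoproj.io/secret-type: repository)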
  repositories: |
    - type: git
      url: https://github.com/company/myapp-config
      name: myapp-config
    - type: helm
      url: https://charts.bitnami.com/bitnami
      name: bitnami

  # SSO configuration (OIDC)
  oidc.config: |
    name: Google
    issuer: https://accounts.google.com
    clientId: $oidc.google.clientId
    clientSecret: $oidc.google.clientSecret
    requestedScopes: ["openid", "profile", "email"]
    requestedIDTokenClaims: {"groups": {"essential": true}}

  # Custom resource health check
  resource.customizations.health.argoproj.io_Rollout: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.replicas ~= nil and obj.status.updatedReplicas ~= nil and obj.status.readyReplicas ~= nil and obj.status.availableReplicas ~= nil then
        if obj.status.replicas == obj.status.updatedReplicas and obj.status.replicas == obj.status.readyReplicas and obj.status.replicas == obj.status.availableReplicas then
          hs.status = "Healthy"
          hs.message = "Rollout is healthy"
          return hs
        end
      end
    end
    hs.status = "Progressing"
    hs.message = "Waiting for rollout to finish"
    return hs

  # Label key used to track which Application owns a resource
  application.instanceLabelKey: argocd.argoproj.io/instance

---
# config/argocd-rbac-cm.yaml - RBAC configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-rbac-cm
    app.kubernetes.io/part-of: argocd
data:
  policy.default: role:readonly
  policy.csv: |
    # Development team permissions
    p, role:developer, applications, get, */dev-*, allow
    p, role:developer, applications, sync, */dev-*, allow
    p, role:developer, repositories, get, *, allow
    p, role:developer, logs, get, */dev-*, allow

    # Operations team permissions
    p, role:ops, applications, *, *, allow
    p, role:ops, clusters, *, *, allow
    p, role:ops, repositories, *, *, allow

    # Group mappings
    g, company:developers, role:developer
    g, company:ops, role:ops
    g, admin@company.com, role:admin
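
To see how the policy lines above combine, here is a simplified evaluator. It is a sketch under stated assumptions, not ArgoCD's actual engine: it handles only `allow` rules, uses `fnmatch` glob matching for actions and objects, and ignores `policy.default` and deny rules. The `parse` and `allowed` helpers are hypothetical.

```python
from fnmatch import fnmatchcase

# A subset of the policy.csv shown above.
POLICY = """
p, role:developer, applications, get, */dev-*, allow
p, role:developer, applications, sync, */dev-*, allow
p, role:ops, applications, *, *, allow
g, company:developers, role:developer
g, company:ops, role:ops
"""


def parse(policy: str):
    """Split policy text into permission rules and group-to-role mappings."""
    rules, groups = [], {}
    for line in policy.strip().splitlines():
        parts = [part.strip() for part in line.split(",")]
        if parts[0] == "p":
            rules.append(tuple(parts[1:]))
        elif parts[0] == "g":
            groups.setdefault(parts[1], set()).add(parts[2])
    return rules, groups


def allowed(group: str, resource: str, action: str, obj: str,
            policy: str = POLICY) -> bool:
    """Return True if any allow rule for the group's roles matches the request."""
    rules, groups = parse(policy)
    roles = groups.get(group, set())
    for role, res, act, pattern, effect in rules:
        if (role in roles and res == resource
                and fnmatchcase(action, act) and fnmatchcase(obj, pattern)
                and effect == "allow"):
            return True
    return False


print(allowed("company:developers", "applications", "sync", "default/dev-api"))
print(allowed("company:developers", "applications", "sync", "default/prod-api"))
print(allowed("company:ops", "applications", "delete", "default/prod-api"))
```

The object pattern `*/dev-*` reads as "any project, any application whose name starts with `dev-`", which is why developers can sync `default/dev-api` but not `default/prod-api`.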

Multi-Cluster Management

# Add external clusters (the first argument is a kubeconfig context name)
argocd cluster add my-staging-cluster --name staging
argocd cluster add my-production-cluster --name production

# Per-cluster Application definitions
# staging-cluster-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-staging
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/company/myapp-config
    targetRevision: HEAD
    path: environments/staging
  destination:
    name: staging  # reference the cluster by name
    namespace: myapp-staging
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

---
# production-cluster-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/company/myapp-config
    targetRevision: main  # production tracks the main branch only
    path: environments/production
  destination:
    name: production
    namespace: myapp-production
  syncPolicy:
    # Production is synced manually (no automated policy)
    syncOptions:
    - CreateNamespace=true
    retry:
      limit: 2
      backoff:
        duration: 10s
        factor: 2
        maxDuration: 5m

Flux: The New Standard of the GitOps Toolkit

Installing and Configuring Flux v2

# Install the Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash

# Pre-flight check of the cluster
flux check --pre

# GitHub token (a Personal Access Token is required)
export GITHUB_TOKEN=<your-token>
export GITHUB_USER=<your-username>
export GITHUB_REPO=fleet-infra

# Bootstrap Flux
flux bootstrap github \
  --owner=$GITHUB_USER \
  --repository=$GITHUB_REPO \
  --branch=main \
  --path=./clusters/production \
  --personal

Flux GitRepository and Kustomization

# clusters/production/myapp-source.yaml - GitRepository resource
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: myapp-config
  namespace: flux-system
spec:
  interval: 5m
  url: https://github.com/company/myapp-config
  ref:
    branch: main
  secretRef:
    name: myapp-git-credentials

---
# clusters/production/myapp-kustomization.yaml - Kustomization resource
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: myapp-production
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: myapp-config
  path: "./environments/production"
  prune: true
  wait: true
  timeout: 5m

  # Health checks
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: myapp
      namespace: myapp-production

  # Dependencies that must be ready before this Kustomization is applied
  dependsOn:
    - name: myapp-secrets
    - name: myapp-configmaps

  # Post-build variable substitution
  postBuild:
    substitute:
      cluster_name: "production"
      cluster_region: "us-west-2"

Using the Flux Helm Controller

# clusters/production/helm-repos.yaml - Helm Repository
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: bitnami
  namespace: flux-system
spec:
  interval: 24h
  url: https://charts.bitnami.com/bitnami

---
# clusters/production/redis-release.yaml - Helm Release
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis
  namespace: myapp-production
spec:
  interval: 15m
  chart:
    spec:
      chart: redis
      version: "17.3.7"
      sourceRef:
        kind: HelmRepository
        name: bitnami
        namespace: flux-system

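  # Value overrides
  values:
    auth:
      enabled: true
      # Placeholder: only filled in if the parent Flux Kustomization applies
      # postBuild substitution; otherwise supply the password via valuesFrom and a Secret
      password: "${redis_password}"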
    master:
      persistence:
        enabled: true
        size: 8Gi
    replica:
      replicaCount: 2
      persistence:
        enabled: true
        size: 8Gi

  # Upgrade policy
  upgrade:
    remediation:
      retries: 3
  rollback:
    cleanupOnFail: true
    force: true

  # Helm test settings
  test:
    enable: true
    timeout: 2m

  # Dependencies
  dependsOn:
    - name: cert-manager
      namespace: cert-manager

Flux Notifications

# clusters/production/notifications.yaml - notification settings
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: "#deployments"
  secretRef:
    name: slack-webhook-secret

---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: production-alerts
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: info
  eventSources:
    - kind: Kustomization
      name: myapp-production
    - kind: HelmRelease
      name: redis
      namespace: myapp-production  # the HelmRelease lives outside flux-system
  # The summary is a plain string; templates are not expanded here
  summary: "Production deployment status"

---
# Secret holding the webhook URL
apiVersion: v1
kind: Secret
metadata:
  name: slack-webhook-secret
  namespace: flux-system
data:
  address: <base64-encoded-webhook-url>
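
Values under a Secret's `data` field must be base64-encoded. A minimal sketch of producing the `address` value follows; the webhook URL is a made-up placeholder and `encode_secret_value` is a hypothetical helper.

```python
import base64


def encode_secret_value(value: str) -> str:
    """Base64-encode a string for use under a Kubernetes Secret's `data` field."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


# Hypothetical webhook URL; substitute the real one from the Slack app settings.
webhook_url = "https://hooks.example.com/services/T000/B000/XXXX"
encoded = encode_secret_value(webhook_url)
print(encoded)
```

Base64 is an encoding, not encryption, so a Secret written this way should still be kept out of Git in plain form.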

A Practical GitOps Workflow

The Full Flow from Development to Deployment

#!/bin/bash
# scripts/deploy-pipeline.sh - a complete GitOps pipeline

set -euo pipefail

# Environment variables
APP_NAME="myapp"
IMAGE_TAG="${GITHUB_SHA:0:7}"
CONFIG_REPO="https://github.com/company/myapp-config"
ENVIRONMENTS=("dev" "staging" "production")

# 1. Build the application and push the image
build_and_push() {
    echo "Building and pushing image..."

    docker build -t "myregistry/${APP_NAME}:${IMAGE_TAG}" .
    docker push "myregistry/${APP_NAME}:${IMAGE_TAG}"

    # Security scan
    docker scout cves "myregistry/${APP_NAME}:${IMAGE_TAG}"

    # Update the latest tag
    docker tag "myregistry/${APP_NAME}:${IMAGE_TAG}" "myregistry/${APP_NAME}:latest"
    docker push "myregistry/${APP_NAME}:latest"
}

# 2. Update the Kustomize image
update_kustomization() {
    local env=$1
    local image_tag=$2  # full image reference, e.g. myregistry/myapp:abc1234

    echo "Updating ${env} environment with image tag: ${image_tag}"

    # Run in a subshell so the caller's working directory is unchanged.
    # Assumes the config repo is already cloned into ./config-repo.
    (
        # `kustomize edit` works on the kustomization.yaml in the current directory
        cd "config-repo/environments/${env}"
        kustomize edit set image "myregistry/${APP_NAME}=${image_tag}"

        # Commit the change
        git add kustomization.yaml
        git commit -m "Update ${env} image to ${image_tag}

    - Image: ${image_tag}
    - Commit: ${GITHUB_SHA}
    - Author: ${GITHUB_ACTOR}
    - Ref: ${GITHUB_REF}"

        git push origin main
    )
}

# 3. Check deployment status
wait_for_deployment() {
    local env=$1
    local timeout=600  # 10 minutes

    echo "Waiting for ${env} deployment to complete..."

    # When using ArgoCD
    if command -v argocd &> /dev/null; then
        argocd app wait "${APP_NAME}-${env}" --timeout ${timeout}
        argocd app get "${APP_NAME}-${env}" --show-params
    fi

    # When using Flux: wait for the Kustomization to report Ready
    if command -v flux &> /dev/null; then
        kubectl wait "kustomization.kustomize.toolkit.fluxcd.io/${APP_NAME}-${env}" \
            -n flux-system --for=condition=Ready --timeout=${timeout}s
    fi

    # Health check
    kubectl rollout status "deployment/${APP_NAME}" -n "${APP_NAME}-${env}" --timeout=${timeout}s
}

# 4. Automated tests
run_post_deployment_tests() {
    local env=$1

    echo "Running post-deployment tests for ${env}..."

    # Service health check
    kubectl wait --for=condition=Available deployment/${APP_NAME} -n "${APP_NAME}-${env}" --timeout=300s

    # Application-level checks
    if [[ "${env}" == "staging" || "${env}" == "production" ]]; then
        # Run E2E tests
        npm run test:e2e -- --env="${env}"

        # Performance tests
        npm run test:performance -- --env="${env}"
    fi
}

# 5. Promotion process
promote_to_next_env() {
    local current_env=$1
    local next_env=$2
    local image_tag=$3

    echo "Promoting from ${current_env} to ${next_env}..."

    # Promotion to production requires approval
    if [[ "${next_env}" == "production" ]]; then
        echo "Production promotion requires manual approval"
        # Create a GitHub PR or integrate with an approval system
        create_promotion_pr "${current_env}" "${next_env}" "${image_tag}"
    else
        # Automatic promotion
        update_kustomization "${next_env}" "${image_tag}"
        wait_for_deployment "${next_env}"
        run_post_deployment_tests "${next_env}"
    fi
}

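# 6. PR-based production deployment
create_promotion_pr() {
    local source_env=$1
    local target_env=$2
    local image_tag=$3

    # Work inside the config repo clone (assumed to be ./config-repo).
    # This is the last step of every flow, so the directory change is not undone.
    cd config-repo

    # Create a new branch
    git checkout -b "promote-${target_env}-${IMAGE_TAG}"

    # Update the image tag (`kustomize edit` works on the current directory)
    (cd "environments/${target_env}" && \
        kustomize edit set image "myregistry/${APP_NAME}=${image_tag}")

    git add .
    git commit -m "Promote to ${target_env}: ${image_tag}"
    git push origin "promote-${target_env}-${IMAGE_TAG}"

    # Create the GitHub PR (using the gh CLI)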
    gh pr create \
        --title "🚀 Promote to ${target_env}: ${image_tag}" \
        --body "Promoting image \`${image_tag}\` from ${source_env} to ${target_env}

## Changes
- Image: \`myregistry/${APP_NAME}:${image_tag}\`
- Source: ${source_env}
- Target: ${target_env}

## Validation
- [ ] ${source_env} tests passed
- [ ] Security scan completed
- [ ] Performance benchmarks met

/cc @ops-team" \
        --assignee "${GITHUB_ACTOR}" \
        --label "deployment,${target_env}"
}

# Main execution flow
main() {
    case "${1:-}" in
        "build")
            build_and_push
            ;;
        "deploy")
            local env="${2:-dev}"
            update_kustomization "${env}" "myregistry/${APP_NAME}:${IMAGE_TAG}"
            wait_for_deployment "${env}"
            run_post_deployment_tests "${env}"
            ;;
        "promote")
            local from_env="${2:-staging}"
            local to_env="${3:-production}"
            promote_to_next_env "${from_env}" "${to_env}" "myregistry/${APP_NAME}:${IMAGE_TAG}"
            ;;
        "full-pipeline")
            build_and_push

            # Deploy to dev
            update_kustomization "dev" "myregistry/${APP_NAME}:${IMAGE_TAG}"
            wait_for_deployment "dev"
            run_post_deployment_tests "dev"

            # Promote to staging
            promote_to_next_env "dev" "staging" "myregistry/${APP_NAME}:${IMAGE_TAG}"

            # Promote to production (manual approval required)
            promote_to_next_env "staging" "production" "myregistry/${APP_NAME}:${IMAGE_TAG}"
            ;;
        *)
            echo "Usage: $0 {build|deploy|promote|full-pipeline}"
            exit 1
            ;;
    esac
}

# Run the script
main "$@"

Integrating GitHub Actions with GitOps

# .github/workflows/gitops-pipeline.yml
name: GitOps Pipeline

on:
  push:
    branches: [main]
    paths-ignore:
      - 'README.md'
      - 'docs/**'
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      security-events: write  # required to upload SARIF results
    outputs:
      # A single full-SHA image reference (steps.meta.outputs.tags is a multi-line list)
      image-tag: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
      image-digest: ${{ steps.build.outputs.digest }}

    steps:
    - name: Checkout repository
      uses: actions/checkout@v4

    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3

    - name: Log in to Container Registry
      uses: docker/login-action@v3
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    - name: Extract metadata
      id: meta
      uses: docker/metadata-action@v5
      with:
        images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
        tags: |
          type=ref,event=branch
          type=ref,event=pr
          type=sha,prefix={{branch}}-
          type=sha,prefix={{branch}}-{{date 'YYYYMMDD'}}-
          type=raw,value=${{ github.sha }}

    - name: Build and push Docker image
      id: build
      uses: docker/build-push-action@v5
      with:
        context: .
        push: true
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}
        cache-from: type=gha
        cache-to: type=gha,mode=max

    - name: Run Trivy vulnerability scanner
      uses: aquasecurity/trivy-action@master
      with:
        image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
        format: 'sarif'
        output: 'trivy-results.sarif'

    - name: Upload Trivy scan results
      uses: github/codeql-action/upload-sarif@v3
      if: always()
      with:
        sarif_file: 'trivy-results.sarif'

  deploy-dev:
    needs: build-and-test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    permissions:
      contents: write

    steps:
    - name: Checkout config repository
      uses: actions/checkout@v4
      with:
        repository: company/myapp-config
        token: ${{ secrets.GITOPS_TOKEN }}
        path: config-repo

    - name: Update dev environment
      run: |
        # Install Kustomize outside the repo so the binary is not committed
        curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
        KUSTOMIZE="$PWD/kustomize"

        # Update the image tag (`kustomize edit` works on the current directory)
        cd config-repo/environments/dev
        "$KUSTOMIZE" edit set image \
          myregistry/myapp=${{ needs.build-and-test.outputs.image-tag }}

        # Git configuration
        git config user.name "GitOps Bot"
        git config user.email "gitops@company.com"

        # Commit the change
        git add kustomization.yaml
        git commit -m "🤖 Update dev image to ${{ github.sha }}

        - Image: ${{ needs.build-and-test.outputs.image-tag }}
        - Commit: ${{ github.sha }}
        - Author: ${{ github.actor }}
        - Workflow: ${{ github.run_id }}"

        git push origin main

  deploy-staging:
    needs: [build-and-test, deploy-dev]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: staging

    steps:
    - name: Wait for dev deployment
      run: |
        # Wait for the rollout using the ArgoCD CLI or kubectl
        echo "Waiting for dev deployment to stabilize..."
        sleep 60  # in practice, poll the ArgoCD API instead

    - name: Run integration tests
      run: |
        # Integration tests against the dev environment
        curl -f http://myapp-dev.company.com/health || exit 1

    - name: Update staging environment
      # Update staging using the same pattern as dev
      run: echo "Updating staging..."

  promote-production:
    needs: [build-and-test, deploy-staging]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production

    steps:
    - name: Create production promotion PR
      uses: actions/github-script@v7
      with:
        github-token: ${{ secrets.GITOPS_TOKEN }}
        script: |
          const { owner, repo } = context.repo;

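          // Create the PR. The head branch must already exist in myapp-config
          // with the production image update committed; pushing it is not shown here.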
          const pr = await github.rest.pulls.create({
            owner: 'company',
            repo: 'myapp-config',
            title: `🚀 Promote to production: ${context.sha.substring(0, 7)}`,
            head: `promote-prod-${context.sha.substring(0, 7)}`,
            base: 'main',
            body: `
            ## Production Promotion

            Promoting image from staging to production

            **Image:** \`${{ needs.build-and-test.outputs.image-tag }}\`
            **Commit:** ${context.sha}
            **Author:** ${context.actor}

            ## Pre-deployment Checklist
            - [x] All tests passed
            - [x] Security scan completed
            - [x] Staging validation successful
            - [ ] Load test completed
            - [ ] Security team approval
            - [ ] SRE team approval

            ## Rollback Plan
            Revert the merge commit of this PR to restore the previous production image.

            cc: @sre-team @security-team
            `
          });

          // Assign reviewers
          await github.rest.pulls.requestReviewers({
            owner: 'company',
            repo: 'myapp-config',
            pull_number: pr.data.number,
            team_reviewers: ['sre-team', 'security-team']
          });

Advanced GitOps Patterns and Best Practices

Progressive Delivery with Argo Rollouts

# environments/production/rollout.yaml - canary deployment settings
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
  namespace: myapp-production
spec:
  replicas: 10
  strategy:
    canary:
      # Increase traffic step by step
      steps:
      - setWeight: 20   # 20% of traffic
      - pause: {}       # wait for manual approval
      - setWeight: 40   # 40% of traffic
      - pause: {duration: 10s}
      - setWeight: 60   # 60% of traffic
      - pause: {duration: 10s}
      - setWeight: 80   # 80% of traffic
      - pause: {duration: 10s}

      # Traffic routing (Istio)
      trafficRouting:
        istio:
          virtualService:
            name: myapp-vs
          destinationRule:
            name: myapp-dr
            canarySubsetName: canary
            stableSubsetName: stable

      # Automated analysis (based on Prometheus metrics)
      analysis:
        templates:
        - templateName: error-rate-analysis
        - templateName: response-time-analysis
        args:
        - name: service-name
          value: myapp
        - name: namespace
          value: myapp-production

      # Delay before the previous ReplicaSet is scaled down
      scaleDownDelaySeconds: 30

  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myregistry/myapp:latest
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
          initialDelaySeconds: 10
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30

---
# environments/production/analysis-template.yaml - analysis templates
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate-analysis
  namespace: myapp-production
spec:
  args:
  - name: service-name
  - name: namespace
  metrics:
  - name: error-rate
    interval: 1m
    count: 5
    successCondition: result[0] < 0.05  # error rate below 5%
    failureLimit: 3
    provider:
      prometheus:
        address: http://prometheus.monitoring.svc.cluster.local:9090
        query: |
          sum(rate(http_requests_total{job="{{args.service-name}}",namespace="{{args.namespace}}",code!~"2.."}[1m])) /
          sum(rate(http_requests_total{job="{{args.service-name}}",namespace="{{args.namespace}}"}[1m]))

---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: response-time-analysis
  namespace: myapp-production
spec:
  args:
  - name: service-name
  - name: namespace
  metrics:
  - name: p95-response-time
    interval: 1m
    count: 5
    successCondition: result[0] < 0.5  # p95 response time below 500ms
    failureLimit: 3
    provider:
      prometheus:
        address: http://prometheus.monitoring.svc.cluster.local:9090
        query: |
          histogram_quantile(0.95,
            sum(rate(http_request_duration_seconds_bucket{job="{{args.service-name}}",namespace="{{args.namespace}}"}[1m]))
            by (le)
          )
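
The interplay of `count`, `successCondition`, and `failureLimit` in the templates above can be sketched as follows. This assumes a measurement counts as failed whenever the success condition is false, and that the analysis fails once the number of failed measurements exceeds `failureLimit`; inconclusive and errored measurements are ignored. `analysis_verdict` and the sample values are hypothetical.

```python
def analysis_verdict(measurements: list, threshold: float,
                     count: int = 5, failure_limit: int = 3) -> str:
    """Evaluate up to `count` measurements against `value < threshold`."""
    failures = 0
    for value in measurements[:count]:
        if not value < threshold:
            failures += 1
            if failures > failure_limit:
                # Too many failed measurements: the rollout would be aborted.
                return "Failed"
    return "Successful"


# Error-rate samples, one per minute, checked against the 5% threshold.
mostly_healthy = [0.01, 0.02, 0.08, 0.01, 0.03]   # one bad sample
degraded = [0.09, 0.08, 0.07, 0.06, 0.01]         # four bad samples

print(analysis_verdict(mostly_healthy, threshold=0.05))
print(analysis_verdict(degraded, threshold=0.05))
```

With `failureLimit: 3`, up to three bad samples out of five are tolerated, which is fairly lenient for a production error-rate gate; a lower limit makes the rollback trigger sooner.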

Managing Multiple Environments and Tenants

# apps/app-of-apps.yaml - ArgoCD App of Apps pattern
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-of-apps
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/company/platform-config
    targetRevision: HEAD
    path: applications
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

---
# applications/tenants/tenant-a/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- ../../../base/tenant

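# Tenant-specific configuration
# (patchesStrategicMerge is deprecated in recent Kustomize releases)
patches:
- path: tenant-config.yaml

# Target namespace
namespace: tenant-a

# Resource name prefix
namePrefix: tenant-a-

# Additional labels
commonLabels:
  tenant: tenant-a
  environment: production

# Value replacement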
replacements:
- source:
    kind: ConfigMap
    name: tenant-config
    fieldPath: data.database_url
  targets:
  - select:
      kind: Deployment
    fieldPaths:
    - spec.template.spec.containers.[name=app].env.[name=DATABASE_URL].value

---
# applications/tenants/tenant-a/tenant-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tenant-config
data:
  tenant_id: "tenant-a"
  database_url: "postgres://tenant-a-db:5432/app"
  redis_url: "redis://tenant-a-redis:6379"
  storage_bucket: "tenant-a-storage"
  max_users: "1000"
  feature_flags: |
    advanced_analytics: true
    beta_features: false
    custom_branding: true

Secret Management and Security

# External Secrets Operator configuration
# external-secrets/secret-store.yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-secret-store
  namespace: myapp-production
spec:
  provider:
    vault:
      server: "https://vault.company.com"
      path: "kv"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "myapp-production"
          secretRef:
            name: vault-auth-secret
            key: token

---
# external-secrets/external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-secrets
  namespace: myapp-production
spec:
  refreshInterval: 15m
  secretStoreRef:
    name: vault-secret-store
    kind: SecretStore
  target:
    name: myapp-secrets
    creationPolicy: Owner
    template:
      type: Opaque
      data:
        database-password: "{{ .database_password | toString }}"
        api-key: "{{ .api_key | toString }}"
        jwt-secret: "{{ .jwt_secret | toString }}"
  data:
  - secretKey: database_password
    remoteRef:
      key: myapp/production
      property: database_password
  - secretKey: api_key
    remoteRef:
      key: myapp/production
      property: api_key
  - secretKey: jwt_secret
    remoteRef:
      key: myapp/production
      property: jwt_secret

---
# When using Sealed Secrets
# sealed-secrets/sealed-secret.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: myapp-secrets
  namespace: myapp-production
spec:
  encryptedData:
    database-password: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEQAx...
    api-key: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEQAx...
  template:
    metadata:
      name: myapp-secrets
      namespace: myapp-production
    type: Opaque

Monitoring and Observability

Collecting GitOps Metrics

# monitoring/gitops-metrics.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-metrics
  namespace: argocd
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-metrics  # application controller metrics (argocd_app_*)
  endpoints:
  - port: metrics

---
# PrometheusRule for GitOps alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: gitops-alerts
  namespace: argocd
spec:
  groups:
  - name: gitops.rules
    rules:
    - alert: ArgocdAppUnhealthy
      expr: argocd_app_info{health_status!="Healthy"} > 0
      for: 5m
      labels:
        severity: warning
        team: platform
      annotations:
        summary: "ArgoCD application {{ $labels.name }} is unhealthy"
        description: "ArgoCD application {{ $labels.name }} in namespace {{ $labels.namespace }} has been in unhealthy state for more than 5 minutes."

    - alert: ArgocdAppOutOfSync
      expr: argocd_app_info{sync_status!="Synced"} > 0
      for: 10m
      labels:
        severity: info
        team: platform
      annotations:
        summary: "ArgoCD application {{ $labels.name }} is out of sync"
        description: "ArgoCD application {{ $labels.name }} has been out of sync for more than 10 minutes."

    # Note: recent Flux releases no longer export gotk_reconcile_condition from the
    # controllers; there the equivalent signal comes from kube-state-metrics.
    - alert: FluxKustomizationFailed
      expr: gotk_reconcile_condition{type="Ready",status="False",kind="Kustomization"} > 0
      for: 5m
      labels:
        severity: critical
        team: platform
      annotations:
        summary: "Flux Kustomization {{ $labels.name }} failed"
        description: "Flux Kustomization {{ $labels.name }} in namespace {{ $labels.namespace }} has failed to reconcile."

---
# Grafana Dashboard ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: gitops-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  gitops-dashboard.json: |
    {
      "dashboard": {
        "title": "GitOps Overview",
        "panels": [
          {
            "title": "Application Health Status",
            "type": "stat",
            "targets": [
              {
                "expr": "count(argocd_app_info{health_status=\"Healthy\"})",
                "legendFormat": "Healthy Apps"
              },
              {
                "expr": "count(argocd_app_info{health_status!=\"Healthy\"})",
                "legendFormat": "Unhealthy Apps"
              }
            ]
          },
          {
            "title": "Sync Status",
            "type": "piechart",
            "targets": [
              {
                "expr": "count by (sync_status) (argocd_app_info)",
                "legendFormat": "{{ sync_status }}"
              }
            ]
          },
          {
            "title": "Deployment Frequency",
            "type": "timeseries",
            "targets": [
              {
                "expr": "sum(increase(argocd_app_sync_total[1h]))",
                "legendFormat": "Syncs per hour"
              }
            ]
          }
        ]
      }
    }

Troubleshooting and Debugging

Common Problems and Solutions

#!/bin/bash
# scripts/gitops-troubleshoot.sh

# GitOps troubleshooting tool

check_argocd_health() {
    echo "=== ArgoCD Health Check ==="

    # ArgoCD server status
    kubectl get pods -n argocd | grep argocd-server

    # Application status
    argocd app list

    # Detailed status of a specific app
    local app_name=${1:-myapp-production}
    argocd app get "$app_name" --show-params

    # Event log
    kubectl get events -n argocd --sort-by='.firstTimestamp'
}

check_flux_health() {
    echo "=== Flux Health Check ==="

    # Flux controller status
    flux get all

    # Kustomization status
    flux get kustomizations

    # GitRepository status
    flux get sources git

    # Recent reconciliation logs (no --follow, so the script does not block)
    flux logs --tail=50
}

debug_sync_issues() {
    local app_name=${1:-myapp-production}

    echo "=== Debugging Sync Issues for $app_name ==="

    # Test access to the Git repository
    argocd app get "$app_name" --refresh

    # Compare manifests
    argocd app diff "$app_name"

    # Force a re-sync
    read -p "Force sync? (y/N): " -n 1 -r
    echo
    if [[ $REPLY =~ ^[Yy]$ ]]; then
        argocd app sync "$app_name" --force
    fi
}

check_rbac_permissions() {
    echo "=== RBAC Permission Check ==="

    # ArgoCD service account permissions
    # (workloads are applied by the application controller, not the API server)
    kubectl auth can-i get applications.argoproj.io -n argocd --as=system:serviceaccount:argocd:argocd-server
    kubectl auth can-i create deployments --all-namespaces --as=system:serviceaccount:argocd:argocd-application-controller

    # Flux service account permissions
    kubectl auth can-i get kustomizations.kustomize.toolkit.fluxcd.io --all-namespaces --as=system:serviceaccount:flux-system:kustomize-controller
    kubectl auth can-i patch deployments --all-namespaces --as=system:serviceaccount:flux-system:kustomize-controller
}

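analyze_git_issues() {
    echo "=== Git Access Analysis ==="

    # Check that Git credentials exist
    # (ArgoCD marks repository secrets with the secret-type label)
    kubectl get secrets -n argocd -l argocd.argoproj.io/secret-type=repository
    kubectl get secrets -n flux-system | grep git

    # Repository connectivity test from the machine running this script
    local repo_url=${1:-"https://github.com/company/myapp-config"}

    echo "Testing Git repository access: $repo_url"
    git ls-remote "$repo_url" | head -5
}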

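performance_analysis() {
    echo "=== GitOps Performance Analysis ==="

    # The URLs below only resolve from inside the cluster
    # (or through kubectl port-forward).

    # ArgoCD metrics: argocd_app_sync_total is exposed by the application
    # controller (argocd-metrics:8082), not by argocd-server-metrics
    curl -s "http://argocd-metrics.argocd.svc.cluster.local:8082/metrics" | grep argocd_app_sync_total

    # Flux metrics: assumes a Service in front of kustomize-controller,
    # which the default Flux install does not create
    curl -s "http://kustomize-controller.flux-system.svc.cluster.local:8080/metrics" | grep gotk_reconcile_duration

    # Resource usage (requires metrics-server)
    kubectl top pods -n argocd
    kubectl top pods -n flux-system
}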

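network_connectivity_check() {
    echo "=== Network Connectivity Check ==="

    # DNS resolution (from the machine running this script)
    nslookup github.com
    nslookup api.github.com

    # Outbound access test from inside the cluster.
    # Runs curl non-interactively so the check also works unattended.
    kubectl run gitops-net-test --image=curlimages/curl:latest --rm -i --restart=Never \
        --command -- curl -sSI https://github.com
}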

generate_report() {
    local output_file="gitops-health-report-$(date +%Y%m%d-%H%M%S).txt"

    echo "Generating GitOps health report: $output_file"

    {
        echo "GitOps Health Report - $(date)"
        echo "=================================="
        echo

        check_argocd_health
        echo

        check_flux_health
        echo

        check_rbac_permissions
        echo

        performance_analysis

    } > "$output_file"

    echo "Report saved to: $output_file"
}

# Main entry point
main() {
    case "${1:-}" in
        "argocd")
            check_argocd_health "$2"
            ;;
        "flux")
            check_flux_health
            ;;
        "debug")
            debug_sync_issues "$2"
            ;;
        "rbac")
            check_rbac_permissions
            ;;
        "git")
            analyze_git_issues "$2"
            ;;
        "perf")
            performance_analysis
            ;;
        "network")
            network_connectivity_check
            ;;
        "report")
            generate_report
            ;;
        *)
            echo "Usage: $0 {argocd|flux|debug|rbac|git|perf|network|report} [app-name|repo-url]"
            echo
            echo "Commands:"
            echo "  argocd [app]     - Check ArgoCD health for specific app"
            echo "  flux             - Check Flux health"
            echo "  debug [app]      - Debug sync issues"
            echo "  rbac             - Check RBAC permissions"
            echo "  git [repo]       - Analyze Git connectivity"
            echo "  perf             - Performance analysis"
            echo "  network          - Network connectivity check"
            echo "  report           - Generate comprehensive report"
            exit 1
            ;;
    esac
}

main "$@"
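
The checks in the script above print raw command output, so a failing check is easy to miss in a long report. One possible refinement is a small wrapper that records which checks failed and returns a non-zero exit code at the end. This is a minimal sketch: `run_check` and `summarize_checks` are hypothetical helper names, not part of the original script, and check names are assumed to contain no spaces.

```shell
#!/bin/bash
# Sketch only: run_check and summarize_checks are hypothetical helpers,
# not functions from the troubleshooting script above.

# Space-separated list of the checks that failed so far.
failed_checks=""

# Run a command quietly and print PASS or FAIL next to a label.
# Call it directly, not inside $(...), so failed_checks is updated
# in the current shell.
run_check() {
    check_name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $check_name"
    else
        echo "FAIL  $check_name"
        failed_checks="$failed_checks $check_name"
    fi
}

# Print a one-line summary; return non-zero if any check failed.
summarize_checks() {
    if [ -z "$failed_checks" ]; then
        echo "All checks passed"
        return 0
    fi
    echo "Failed checks:$failed_checks"
    return 1
}

# Example usage (needs cluster access, so it is left commented out):
#   run_check "argocd-pods" kubectl get pods -n argocd
#   run_check "flux-prereqs" flux check
#   summarize_checks || exit 1
```

Keeping the failure list in a plain string rather than a Bash array lets the helpers run under a POSIX shell as well, at the cost of not supporting check names with spaces.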

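Conclusion: Building a Safe and Efficient Deployment Culture with GitOps

As of 2026, GitOps has become the de facto standard deployment methodology for Kubernetes environments. As tools such as ArgoCD and Flux have matured, companies can build deployment environments that are safer, more traceable, and more automated.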

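Key benefits of adopting GitOps:

  1. Declarative deployment: every change is managed explicitly through Git
  2. Automation and consistency: the deployment process is standardized across environments
  3. Stronger security: the pull-based approach minimizes cluster access privileges
  4. Better collaboration: deployment quality is managed through the code review process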
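
Because Git holds the declared state, drift between Git and the cluster can be checked mechanically. The sketch below relies on the documented exit codes of `kubectl diff` (0 for no differences, 1 for differences, greater than 1 for an error) and assumes the `environments/production` Kustomize overlay from the repository layout shown earlier. `classify_drift` and `check_drift` are hypothetical helper names.

```shell
#!/bin/bash
# Sketch only: classify_drift and check_drift are hypothetical helpers.

# Map a kubectl diff exit code to a human-readable drift status.
classify_drift() {
    case "$1" in
        0) echo "in-sync" ;;
        1) echo "drifted" ;;
        *) echo "error" ;;
    esac
}

# Compare a Kustomize overlay from the Git checkout with the live cluster.
# The default overlay path is an assumption based on the example repo layout.
check_drift() {
    overlay_dir=${1:-environments/production}
    rc=0
    kubectl diff -k "$overlay_dir" >/dev/null 2>&1 || rc=$?
    classify_drift "$rc"
}

# Example usage (needs cluster access, so it is left commented out):
#   check_drift environments/staging
```

In practice ArgoCD and Flux run this comparison continuously; a manual check like this is mainly useful when debugging why a controller reports an application as out of sync.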

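Key elements of a successful GitOps rollout:

  • Incremental adoption: start with small applications and expand step by step
  • Team training: a solid understanding of GitOps concepts and tooling
  • Monitoring: continuous observation of deployment status and performance
  • Security policy: systematic secret management and access control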

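GitOps goes beyond simple deployment automation; it is a new culture of collaboration that development and operations teams build together. In a properly implemented GitOps environment, deployment is no longer something to fear but a routine, safe part of everyday work.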
