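GitOps in Practice: Building a Declarative Deployment System with ArgoCD and Flux
As of 2026, GitOps has become a standard deployment methodology for Kubernetes environments. Defined as "an operating model that treats Git as the single source of truth," GitOps addresses the problems of traditional push-based deployment and provides a safer, more traceable deployment environment.
According to the CNCF 2025 survey, 73% of organizations using Kubernetes have adopted GitOps, with ArgoCD and Flux leading the field at 42% and 35% share respectively. This article looks at how these two tools became central to modern deployment infrastructure and how to implement them in practice.
Core Concepts and Principles of GitOps
Problems with Traditional Deployment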
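# Traditional push-based deployment (a problematic approach):
# the CI/CD pipeline deploys directly to the cluster.
# In the CI pipeline:
kubectl apply -f deployment.yaml
kubectl set image deployment/myapp myapp=myregistry/myapp:v1.2.3
kubectl rollout status deployment/myapp
# Problems:
# 1. The pipeline needs direct access to the cluster
# 2. Deployment permissions are spread too widely
# 3. The deployed state can diverge from what is in Git
# 4. On rollback, the exact previous state is hard to determine
# 5. Managing multiple clusters becomes complex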
The Pull-Based Approach of GitOps
# GitOps approach: example Git repository structure
# repo: myapp-config
├── environments/
│   ├── dev/
│   │   ├── kustomization.yaml
│   │   └── values.yaml
│   ├── staging/
│   │   ├── kustomization.yaml
│   │   └── values.yaml
│   └── production/
│       ├── kustomization.yaml
│       └── values.yaml
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── ingress.yaml
│   └── kustomization.yaml
└── .argocd/
    └── application.yaml
# base/deployment.yaml - base application definition
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:latest   # the tag is overridden per environment by Kustomize
          ports:
            - containerPort: 8080
          env:
            - name: ENV
              value: "production"
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "200m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
---
# base/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
  labels:
    app: myapp
spec:
  selector:
    app: myapp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP
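A layout like the one above is easy to break by accident, for example by adding an environment directory without its kustomization.yaml. The following is a minimal sketch of a pre-merge check, assuming the three environment names from the example tree; the function name validate_layout is hypothetical and not part of any tool.

```shell
# Check that every environment overlay and the base have a kustomization.yaml.
# The directory names mirror the example tree; adjust them for a real repository.
validate_layout() {
  local repo="$1" missing=0 env
  for env in dev staging production; do
    if [ ! -f "$repo/environments/$env/kustomization.yaml" ]; then
      echo "missing: environments/$env/kustomization.yaml"
      missing=1
    fi
  done
  if [ ! -f "$repo/base/kustomization.yaml" ]; then
    echo "missing: base/kustomization.yaml"
    missing=1
  fi
  return "$missing"
}
```

Running validate_layout . from the root of the config repository prints one line per missing file and returns a non-zero status if anything is missing, so it can gate a CI job.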
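The Four Principles of GitOps
- Declarative: the desired state of the system is declared in Git
- Versioned and immutable: every change is tracked in Git history
- Pulled automatically: an agent in the cluster pulls changes from Git and applies them
- Continuously reconciled: the actual state is continuously compared with, and converged to, the desired state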
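The reconciliation idea behind these principles can be shown with a toy loop. This is only a sketch and not how ArgoCD or Flux are implemented: a directory stands in for the Git checkout ("desired") and another for the cluster ("actual"), and both names are assumptions made for illustration.

```shell
# Toy reconciler: "desired" stands in for the Git checkout, "actual" for the cluster.
# Real controllers compare manifests against the Kubernetes API instead of directories.
reconcile_once() {
  local desired="$1" actual="$2"
  mkdir -p "$actual"
  if diff -r "$desired" "$actual" >/dev/null 2>&1; then
    echo "in-sync"
    return 0
  fi
  # Drift detected: replace the actual state with the desired state (prune + apply).
  rm -rf "$actual"
  cp -R "$desired" "$actual"
  echo "reconciled"
}
```

Calling reconcile_once on a timer gives the "continuously reconciled" behavior: a manual edit to the actual directory is overwritten on the next pass, which is the same effect selfHeal has in ArgoCD.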
ArgoCD: Kubernetes-Native GitOps
Installing and Bootstrapping ArgoCD
# Install ArgoCD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Install the ArgoCD CLI (macOS)
brew install argocd
# Port-forward to reach the ArgoCD server
kubectl port-forward svc/argocd-server -n argocd 8080:443
# Read the initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d
# Log in with the CLI
argocd login localhost:8080 --username admin --password <password> --insecure
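The install command above tracks the moving stable reference. For reproducible installs it can help to pin a specific release tag instead. The helper below is a small sketch of that idea; the function name is hypothetical, and v2.13.0 is used only as an example value, not as a recommendation.

```shell
# Build the install-manifest URL for a specific Argo CD tag instead of "stable".
# The version is the caller's choice; "v2.13.0" below is only an example value.
argocd_manifest_url() {
  local version="$1"
  case "$version" in
    v[0-9]*.[0-9]*.[0-9]*) ;;
    *) echo "expected a tag like v2.13.0, got: $version" >&2; return 1 ;;
  esac
  echo "https://raw.githubusercontent.com/argoproj/argo-cd/${version}/manifests/install.yaml"
}

# Usage (commented out so the sketch has no side effects):
# kubectl apply -n argocd -f "$(argocd_manifest_url v2.13.0)"
```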
Defining an ArgoCD Application
# .argocd/application.yaml - ArgoCD Application resource
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd
  labels:
    app.kubernetes.io/name: myapp
    app.kubernetes.io/env: production
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  # Project
  project: default
  # Source Git repository
  source:
    repoURL: https://github.com/company/myapp-config
    targetRevision: HEAD
    path: environments/production
    # When using Kustomize
    kustomize:
      images:
        - myregistry/myapp:v1.2.3
    # When using Helm (optional)
    # helm:
    #   valueFiles:
    #     - values.yaml
    #   parameters:
    #     - name: image.tag
    #       value: v1.2.3
  # Destination cluster
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp-production
  # Sync policy
  syncPolicy:
    automated:
      prune: true      # remove resources that were deleted from Git
      selfHeal: true   # revert drift automatically
      allowEmpty: false
    syncOptions:
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
    # Sync retry policy
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
  # Differences to ignore when comparing live and desired state
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas   # ignore replica changes made by an HPA
  # Informational metadata shown alongside the application
  info:
    - name: 'Environment'
      value: 'Production'
    - name: 'Team'
      value: 'Backend'
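The retry block is easier to reason about with the numbers written out. The sketch below computes the wait before each retry, assuming the delay starts at duration, is multiplied by factor after every attempt, and is capped at maxDuration. The function name is hypothetical and all values are in seconds.

```shell
# Print the wait (in seconds) before each retry: duration * factor^n, capped at max.
backoff_schedule() {
  local duration="$1" factor="$2" max="$3" limit="$4"
  local i=0 delay="$duration" out=""
  while [ "$i" -lt "$limit" ]; do
    if [ "$delay" -gt "$max" ]; then
      delay="$max"
    fi
    out="${out:+$out }$delay"
    delay=$((delay * factor))
    i=$((i + 1))
  done
  echo "$out"
}

# Values from the Application above: duration 5s, factor 2, maxDuration 3m, limit 5
backoff_schedule 5 2 180 5   # prints: 5 10 20 40 80
```

Under these assumptions the five retries span roughly two and a half minutes in total, so the 3m cap never takes effect with limit: 5. It would only matter with a higher limit.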
Advanced ArgoCD Configuration
# config/argocd-cm.yaml - ArgoCD ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
data:
  # Git repository settings
  # Note: declaring repositories in argocd-cm is the legacy approach. Recent ArgoCD
  # versions expect a Secret labeled argocd.argoproj.io/secret-type: repository.
  repositories: |
    - type: git
      url: https://github.com/company/myapp-config
      name: myapp-config
    - type: helm
      url: https://charts.bitnami.com/bitnami
      name: bitnami
  # SSO settings (OIDC)
  oidc.config: |
    name: Google
    issuer: https://accounts.google.com
    clientID: $oidc.google.clientId
    clientSecret: $oidc.google.clientSecret
    requestedScopes: ["openid", "profile", "email"]
    requestedIDTokenClaims: {"groups": {"essential": true}}
  # Custom resource health check (Lua)
  resource.customizations.health.argoproj.io_Rollout: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.replicas ~= nil and obj.status.updatedReplicas ~= nil and obj.status.readyReplicas ~= nil and obj.status.availableReplicas ~= nil then
        if obj.status.replicas == obj.status.updatedReplicas and obj.status.replicas == obj.status.readyReplicas and obj.status.replicas == obj.status.availableReplicas then
          hs.status = "Healthy"
          hs.message = "Rollout is healthy"
          return hs
        end
      end
    end
    hs.status = "Progressing"
    hs.message = "Waiting for rollout to finish"
    return hs
  # Label key ArgoCD uses to track which Application owns a resource
  application.instanceLabelKey: argocd.argoproj.io/instance
---
# config/argocd-rbac-cm.yaml - RBAC settings
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-rbac-cm
    app.kubernetes.io/part-of: argocd
data:
  policy.default: role:readonly
  policy.csv: |
    # Developer permissions (object format is <project>/<application>)
    p, role:developer, applications, get, */dev-*, allow
    p, role:developer, applications, sync, */dev-*, allow
    p, role:developer, repositories, get, *, allow
    p, role:developer, logs, get, */dev-*, allow
    # Operations team permissions
    p, role:ops, applications, *, *, allow
    p, role:ops, clusters, *, *, allow
    p, role:ops, repositories, *, *, allow
    # Group mappings
    g, company:developers, role:developer
    g, company:ops, role:ops
    g, admin@company.com, role:admin
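To see which applications a pattern such as */dev-* covers, a rough check with shell globbing can help before committing a policy change. This is only an approximation: ArgoCD evaluates policies with its own glob matcher, and edge cases may differ. The function name is hypothetical.

```shell
# Return 0 if an object such as "default/dev-api" matches a policy pattern such as "*/dev-*".
# Shell globbing is a stand-in here for Argo CD's matcher; treat results as a rough guide.
rbac_object_matches() {
  local pattern="$1" object="$2"
  case "$object" in
    $pattern) return 0 ;;
    *) return 1 ;;
  esac
}

if rbac_object_matches '*/dev-*' 'default/dev-api'; then
  echo "developer role applies to default/dev-api"
fi
```

For an authoritative answer, the argocd admin settings rbac can command evaluates a policy file with the real matcher.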
Managing Multiple Clusters
# Register external clusters (the first argument is a kubeconfig context name)
argocd cluster add my-staging-cluster --name staging
argocd cluster add my-production-cluster --name production
# Per-cluster Application definitions
# staging-cluster-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-staging
  namespace: argocd
spec:
  project: default   # spec.project is required
  source:
    repoURL: https://github.com/company/myapp-config
    targetRevision: HEAD
    path: environments/staging
  destination:
    name: staging   # refer to the cluster by its registered name
    namespace: myapp-staging
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
---
# production-cluster-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/company/myapp-config
    targetRevision: main   # production tracks the main branch only
    path: environments/production
  destination:
    name: production
    namespace: myapp-production
  syncPolicy:
    # No "automated" block: production is synced manually
    syncOptions:
      - CreateNamespace=true
    retry:
      limit: 2
      backoff:
        duration: 10s
        factor: 2
        maxDuration: 5m
Flux: A New Standard Built on the GitOps Toolkit
Installing and Configuring Flux v2
# Install the Flux CLI
curl -s https://fluxcd.io/install.sh | sudo bash
# Run cluster pre-flight checks
flux check --pre
# GitHub credentials (a Personal Access Token is required)
export GITHUB_TOKEN=<your-token>
export GITHUB_USER=<your-username>
export GITHUB_REPO=fleet-infra
# Bootstrap Flux
flux bootstrap github \
  --owner=$GITHUB_USER \
  --repository=$GITHUB_REPO \
  --branch=main \
  --path=./clusters/production \
  --personal
Flux GitRepository and Kustomization
# clusters/production/myapp-source.yaml - GitRepository resource
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: myapp-config
  namespace: flux-system
spec:
  interval: 5m
  url: https://github.com/company/myapp-config
  ref:
    branch: main
  secretRef:
    name: myapp-git-credentials
---
# clusters/production/myapp-kustomization.yaml - Kustomization resource
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: myapp-production
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: myapp-config
  path: "./environments/production"
  prune: true
  wait: true
  timeout: 5m
  # Health checks
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: myapp
      namespace: myapp-production
  # Dependencies: these Kustomizations must be ready first
  dependsOn:
    - name: myapp-secrets
    - name: myapp-configmaps
  # Post-build variable substitution
  postBuild:
    substitute:
      cluster_name: "production"
      cluster_region: "us-west-2"
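The postBuild.substitute block replaces ${var} placeholders in the rendered manifests before they are applied. The following sketch imitates that behavior with sed for the two variables in the example. It is an illustration of the idea, not Flux's implementation, and the function name is hypothetical.

```shell
# Replace ${var} placeholders on stdin with values, similar in spirit to Flux's
# postBuild.substitute. Only the two variables from the example are handled here.
substitute_vars() {
  local cluster_name="$1" cluster_region="$2"
  sed -e "s|\${cluster_name}|${cluster_name}|g" \
      -e "s|\${cluster_region}|${cluster_region}|g"
}

# The single quotes keep the placeholders literal until substitute_vars sees them.
printf 'name: app-${cluster_name}\nregion: ${cluster_region}\n' \
  | substitute_vars production us-west-2
```

One practical consequence is that any literal ${...} text in a manifest is a substitution candidate, so scripts embedded in ConfigMaps may need escaping when substitution is enabled.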
Using the Flux Helm Controller
# clusters/production/helm-repos.yaml - Helm repository
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: bitnami
  namespace: flux-system
spec:
  interval: 24h
  url: https://charts.bitnami.com/bitnami
---
# clusters/production/redis-release.yaml - Helm release
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: redis
  namespace: myapp-production
spec:
  interval: 15m
  chart:
    spec:
      chart: redis
      version: "17.3.7"
      sourceRef:
        kind: HelmRepository
        name: bitnami
        namespace: flux-system
  # Value overrides
  values:
    auth:
      enabled: true
      # Replaced only if this file is applied through a Flux Kustomization
      # that defines redis_password under postBuild substitution
      password: "${redis_password}"
    master:
      persistence:
        enabled: true
        size: 8Gi
    replica:
      replicaCount: 2
      persistence:
        enabled: true
        size: 8Gi
  # Upgrade policy
  upgrade:
    remediation:
      retries: 3
  # Rollback policy
  rollback:
    cleanupOnFail: true
    force: true
  # Helm test settings
  test:
    enable: true
    timeout: 2m
  # Dependencies
  dependsOn:
    - name: cert-manager
      namespace: cert-manager
Flux Notifications
# clusters/production/notifications.yaml - notification settings
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: "#deployments"
  secretRef:
    name: slack-webhook-secret
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: production-alerts
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: info
  eventSources:
    - kind: Kustomization
      name: myapp-production
    - kind: HelmRelease
      name: redis
      namespace: myapp-production
  # Plain text, not a template: the revision and reason are carried by the event itself
  summary: "Production deployment status"
---
# Secret holding the webhook URL
apiVersion: v1
kind: Secret
metadata:
  name: slack-webhook-secret
  namespace: flux-system
data:
  address: <base64-encoded-webhook-url>
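Values under data: in a Secret must be base64-encoded, and a stray trailing newline in the encoded URL is a common cause of webhook failures. The helper below is a small sketch of encoding a value safely; the function name is hypothetical, and the real webhook URL should come from a secret store rather than shell history.

```shell
# Encode a value for a Secret's data field.
# printf avoids the trailing newline that echo would add to the encoded value,
# and tr removes line wrapping that some base64 implementations insert.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

encode_secret_value 'hello'   # prints: aGVsbG8=
```

An alternative is to use stringData instead of data in the manifest, which accepts the plain value and lets the API server do the encoding. Either way, a plain Secret like this one should not be committed to Git unencrypted.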
GitOps Workflows in Practice
The Full Flow from Development to Deployment
#!/bin/bash
# scripts/deploy-pipeline.sh - a complete GitOps pipeline
set -euo pipefail

# Environment variables
APP_NAME="myapp"
IMAGE_TAG="${GITHUB_SHA:0:7}"
CONFIG_REPO="https://github.com/company/myapp-config"
ENVIRONMENTS=("dev" "staging" "production")

# 1. Build the application and push the image
build_and_push() {
  echo "Building and pushing image..."
  docker build -t "myregistry/${APP_NAME}:${IMAGE_TAG}" .
  docker push "myregistry/${APP_NAME}:${IMAGE_TAG}"
  # Security scan
  docker scout cves "myregistry/${APP_NAME}:${IMAGE_TAG}"
  # Update the latest tag
  docker tag "myregistry/${APP_NAME}:${IMAGE_TAG}" "myregistry/${APP_NAME}:latest"
  docker push "myregistry/${APP_NAME}:latest"
}

# 2. Update the image in the Kustomize overlay
#    $2 is a full image reference, e.g. myregistry/myapp:abc1234
update_kustomization() {
  local env=$1
  local image_tag=$2
  echo "Updating ${env} environment with image: ${image_tag}"
  # Run in a subshell so the caller's working directory is unchanged.
  # Assumes ${CONFIG_REPO} has already been cloned into ./config-repo.
  (
    cd "config-repo/environments/${env}"
    # kustomize edit works on the kustomization.yaml in the current directory
    kustomize edit set image "myregistry/${APP_NAME}=${image_tag}"
    # Commit the change
    git add kustomization.yaml
    git commit -m "Update ${env} image to ${image_tag}
- Image: ${image_tag}
- Commit: ${GITHUB_SHA}
- Author: ${GITHUB_ACTOR}
- Ref: ${GITHUB_REF}"
    git push origin main
  )
}

# 3. Wait for the deployment to finish
wait_for_deployment() {
  local env=$1
  local timeout=600 # 10 minutes
  echo "Waiting for ${env} deployment to complete..."
  # When using ArgoCD
  if command -v argocd &> /dev/null; then
    argocd app wait "${APP_NAME}-${env}" --timeout ${timeout}
    argocd app get "${APP_NAME}-${env}" --show-params
  fi
  # When using Flux: trigger a reconciliation and wait for it to finish
  if command -v flux &> /dev/null; then
    flux reconcile kustomization "${APP_NAME}-${env}" --with-source --timeout="${timeout}s"
  fi
  # Health check
  kubectl rollout status "deployment/${APP_NAME}" -n "${APP_NAME}-${env}" --timeout=${timeout}s
}

# 4. Automated tests
run_post_deployment_tests() {
  local env=$1
  echo "Running post-deployment tests for ${env}..."
  # Service health check
  kubectl wait --for=condition=Available "deployment/${APP_NAME}" -n "${APP_NAME}-${env}" --timeout=300s
  # Application-level checks
  if [[ "${env}" == "staging" || "${env}" == "production" ]]; then
    # Run E2E tests
    npm run test:e2e -- --env="${env}"
    # Performance tests
    npm run test:performance -- --env="${env}"
  fi
}

# 5. Promotion process
promote_to_next_env() {
  local current_env=$1
  local next_env=$2
  local image_tag=$3
  echo "Promoting from ${current_env} to ${next_env}..."
  # Promotion to production requires approval
  if [[ "${next_env}" == "production" ]]; then
    echo "Production promotion requires manual approval"
    # Open a GitHub PR or integrate with an approval system
    create_promotion_pr "${current_env}" "${next_env}" "${image_tag}"
  else
    # Automatic promotion
    update_kustomization "${next_env}" "${image_tag}"
    wait_for_deployment "${next_env}"
    run_post_deployment_tests "${next_env}"
  fi
}

# 6. PR-based production deployment
create_promotion_pr() {
  local source_env=$1
  local target_env=$2
  local image_tag=$3
  local branch="promote-${target_env}-${IMAGE_TAG}"
  (
    cd config-repo
    # Create a new branch
    git checkout -b "${branch}"
    # Update the image tag
    (cd "environments/${target_env}" && kustomize edit set image "myregistry/${APP_NAME}=${image_tag}")
    git add "environments/${target_env}/kustomization.yaml"
    git commit -m "Promote to ${target_env}: ${image_tag}"
    git push origin "${branch}"
    # Create the GitHub PR (using the gh CLI)
    gh pr create \
      --title "🚀 Promote to ${target_env}: ${image_tag}" \
      --body "Promoting image \`${image_tag}\` from ${source_env} to ${target_env}
## Changes
- Image: \`${image_tag}\`
- Source: ${source_env}
- Target: ${target_env}
## Validation
- [ ] ${source_env} tests passed
- [ ] Security scan completed
- [ ] Performance benchmarks met
/cc @ops-team" \
      --assignee "${GITHUB_ACTOR}" \
      --label "deployment,${target_env}"
    # Return to main so later updates are not committed to the promotion branch
    git checkout main
  )
}

# Main flow
main() {
  case "${1:-}" in
    "build")
      build_and_push
      ;;
    "deploy")
      local env="${2:-dev}"
      update_kustomization "${env}" "myregistry/${APP_NAME}:${IMAGE_TAG}"
      wait_for_deployment "${env}"
      run_post_deployment_tests "${env}"
      ;;
    "promote")
      local from_env="${2:-staging}"
      local to_env="${3:-production}"
      promote_to_next_env "${from_env}" "${to_env}" "myregistry/${APP_NAME}:${IMAGE_TAG}"
      ;;
    "full-pipeline")
      build_and_push
      # Deploy to dev
      update_kustomization "dev" "myregistry/${APP_NAME}:${IMAGE_TAG}"
      wait_for_deployment "dev"
      run_post_deployment_tests "dev"
      # Promote to staging
      promote_to_next_env "dev" "staging" "myregistry/${APP_NAME}:${IMAGE_TAG}"
      # Promote to production (manual approval required)
      promote_to_next_env "staging" "production" "myregistry/${APP_NAME}:${IMAGE_TAG}"
      ;;
    *)
      echo "Usage: $0 {build|deploy|promote|full-pipeline}"
      exit 1
      ;;
  esac
}

# Run the script
main "$@"
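The script hard-codes the promotion order in its full-pipeline branch. A small lookup function makes that order explicit and reusable. The sketch below assumes the same three environments; next_env and short_sha are hypothetical helper names, not part of the script above.

```shell
# Map an environment to the one that follows it in the promotion chain.
# Returns 1 when there is nothing left to promote to, 2 for an unknown name.
next_env() {
  case "$1" in
    dev) echo "staging" ;;
    staging) echo "production" ;;
    production) return 1 ;;
    *) echo "unknown environment: $1" >&2; return 2 ;;
  esac
}

# Shorten a commit SHA to seven characters, like ${GITHUB_SHA:0:7} in the script.
short_sha() {
  printf '%s' "$1" | cut -c1-7
}

next_env dev                                            # prints: staging
short_sha 0123456789abcdef0123456789abcdef01234567      # prints: 0123456
```

With a helper like this, a promote command could take only the current environment as its argument and derive the target, which removes one way to promote to the wrong place.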
Integrating GitHub Actions with GitOps
# .github/workflows/gitops-pipeline.yml
name: GitOps Pipeline
on:
  push:
    branches: [main]
    paths-ignore:
      - 'README.md'
      - 'docs/**'
  pull_request:
    branches: [main]
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      security-events: write   # required to upload SARIF results
    outputs:
      # A single image reference; steps.meta.outputs.tags is a multi-line list of tags
      image-tag: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
      image-digest: ${{ steps.build.outputs.digest }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=sha,prefix={{branch}}-
            type=sha,prefix={{branch}}-{{date 'YYYYMMDD'}}-
            type=raw,value=${{ github.sha }}
      - name: Build and push Docker image
        id: build
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
      - name: Run Trivy vulnerability scanner
        # For production use, pin this action to a release tag or commit SHA
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          format: 'sarif'
          output: 'trivy-results.sarif'
      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: 'trivy-results.sarif'
  deploy-dev:
    needs: build-and-test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - name: Checkout config repository
        uses: actions/checkout@v4
        with:
          repository: company/myapp-config
          token: ${{ secrets.GITOPS_TOKEN }}
          path: config-repo
      - name: Update dev environment
        run: |
          cd config-repo
          # Install Kustomize (the binary lands in the current directory)
          curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
          # Update the image; kustomize edit works on the current directory
          (cd environments/dev && ../../kustomize edit set image \
            myregistry/myapp=${{ needs.build-and-test.outputs.image-tag }})
          # Git identity
          git config user.name "GitOps Bot"
          git config user.email "gitops@company.com"
          # Commit only the overlay, not the downloaded kustomize binary
          git add environments/dev/kustomization.yaml
          git commit -m "🤖 Update dev image to ${{ github.sha }}
          - Image: ${{ needs.build-and-test.outputs.image-tag }}
          - Commit: ${{ github.sha }}
          - Author: ${{ github.actor }}
          - Workflow: ${{ github.run_id }}"
          git push origin main
  deploy-staging:
    needs: [build-and-test, deploy-dev]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Wait for dev deployment
        run: |
          # Wait for the deployment using the ArgoCD CLI or kubectl
          echo "Waiting for dev deployment to stabilize..."
          sleep 60 # placeholder; in practice, poll the ArgoCD API
      - name: Run integration tests
        run: |
          # Integration tests against the dev environment
          curl -f http://myapp-dev.company.com/health || exit 1
      - name: Update staging environment
        # Update staging with the same pattern as dev
        run: echo "Updating staging..."
  promote-production:
    needs: [build-and-test, deploy-staging]
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - name: Create production promotion PR
        uses: actions/github-script@v7
        with:
          github-token: ${{ secrets.GITOPS_TOKEN }}
          script: |
            const { owner, repo } = context.repo;
            // Create the PR. This assumes an earlier step has already pushed the
            // promote-prod-<sha> branch to myapp-config; pulls.create fails otherwise.
            const pr = await github.rest.pulls.create({
              owner: 'company',
              repo: 'myapp-config',
              title: `🚀 Promote to production: ${context.sha.substring(0, 7)}`,
              head: `promote-prod-${context.sha.substring(0, 7)}`,
              base: 'main',
              body: `
            ## Production Promotion
            Promoting image from staging to production
            **Image:** \`${{ needs.build-and-test.outputs.image-tag }}\`
            **Commit:** ${context.sha}
            **Author:** ${context.actor}
            ## Pre-deployment Checklist
            - [x] All tests passed
            - [x] Security scan completed
            - [x] Staging validation successful
            - [ ] Load test completed
            - [ ] Security team approval
            - [ ] SRE team approval
            ## Rollback Plan
            Revert the merge commit of this PR in myapp-config to restore the previous image.
            cc: @sre-team @security-team
            `
            });
            // Request reviewers
            await github.rest.pulls.requestReviewers({
              owner: 'company',
              repo: 'myapp-config',
              pull_number: pr.data.number,
              team_reviewers: ['sre-team', 'security-team']
            });
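The workflow also exposes an image-digest output. Writing the digest rather than a tag into the config repository is one way to make a deployment immutable, since tags can be moved but digests cannot. The helper below is a sketch of composing such a reference; the function name is hypothetical.

```shell
# Compose an image reference pinned by digest, e.g. ghcr.io/company/myapp@sha256:...
image_ref_by_digest() {
  local repo="$1" digest="$2"
  case "$digest" in
    sha256:*) ;;
    *) echo "not a sha256 digest: $digest" >&2; return 1 ;;
  esac
  # Registries such as ghcr.io require lowercase repository names.
  echo "$(printf '%s' "$repo" | tr '[:upper:]' '[:lower:]')@${digest}"
}

# Usage inside the config repository overlay (commented out: needs kustomize and a real digest):
# kustomize edit set image "myregistry/myapp=$(image_ref_by_digest ghcr.io/company/myapp "$DIGEST")"
```

The trade-off is readability: a digest in a pull request tells a reviewer less than a tag does, so some teams record the tag in the commit message alongside the pinned digest.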
Advanced GitOps Patterns and Best Practices
Progressive Delivery with Argo Rollouts
# environments/production/rollout.yaml - canary deployment
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp
  namespace: myapp-production
spec:
  replicas: 10
  strategy:
    canary:
      # Increase traffic step by step
      steps:
        - setWeight: 20   # 20% of traffic
        - pause: {}       # wait for manual approval
        - setWeight: 40   # 40% of traffic
        - pause: {duration: 10s}
        - setWeight: 60   # 60% of traffic
        - pause: {duration: 10s}
        - setWeight: 80   # 80% of traffic
        - pause: {duration: 10s}
      # Traffic routing (Istio)
      trafficRouting:
        istio:
          virtualService:
            name: myapp-vs
          destinationRule:
            name: myapp-dr
            canarySubsetName: canary
            stableSubsetName: stable
      # Automated analysis (based on Prometheus metrics)
      analysis:
        templates:
          - templateName: error-rate-analysis
          - templateName: response-time-analysis
        args:
          - name: service-name
            value: myapp
          - name: namespace
            value: myapp-production
      # Delay before the previous ReplicaSet is scaled down
      scaleDownDelaySeconds: 30
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:latest
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
---
# environments/production/analysis-template.yaml - analysis templates
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: error-rate-analysis
  namespace: myapp-production
spec:
  args:
    - name: service-name
    - name: namespace
  metrics:
    - name: error-rate
      interval: 1m
      count: 5
      successCondition: result[0] < 0.05   # error rate below 5%
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc.cluster.local:9090
          query: |
            sum(rate(http_requests_total{job="{{args.service-name}}",namespace="{{args.namespace}}",code!~"2.."}[1m])) /
            sum(rate(http_requests_total{job="{{args.service-name}}",namespace="{{args.namespace}}"}[1m]))
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: response-time-analysis
  namespace: myapp-production
spec:
  args:
    - name: service-name
    - name: namespace
  metrics:
    - name: p95-response-time
      interval: 1m
      count: 5
      successCondition: result[0] < 0.5   # 95th-percentile latency below 500ms
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc.cluster.local:9090
          query: |
            histogram_quantile(0.95,
              sum(rate(http_request_duration_seconds_bucket{job="{{args.service-name}}",namespace="{{args.namespace}}"}[1m]))
              by (le)
            )
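The success condition in the error-rate template is a plain ratio check, and it is worth being clear about the boundary: the comparison is strict, so exactly 5% fails. The sketch below evaluates one measurement the same way, using awk for the floating-point division. The function name is hypothetical, and the threshold default mirrors the 0.05 in the template.

```shell
# Decide pass/fail for one measurement, mirroring successCondition "result[0] < 0.05".
# Returns 0 (pass) when errors/total is strictly below the threshold.
error_rate_ok() {
  local errors="$1" total="$2" threshold="${3:-0.05}"
  awk -v e="$errors" -v t="$total" -v th="$threshold" \
    'BEGIN { if (t == 0) exit 1; exit !((e / t) < th) }'
}

if error_rate_ok 3 100; then echo "3% error rate: pass"; fi
if ! error_rate_ok 5 100; then echo "5% error rate: fail (not strictly below 5%)"; fi
```

A zero request count is treated as a failure here. In Prometheus the equivalent query divides zero by zero and yields no usable value, so a canary that receives no traffic needs an explicit decision about how it should be scored.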
Managing Multiple Environments and Tenants
# apps/app-of-apps.yaml - ArgoCD App of Apps pattern
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-of-apps
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/company/platform-config
    targetRevision: HEAD
    path: applications
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
---
# applications/tenants/tenant-a/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../../base/tenant
# Tenant-specific settings (patchesStrategicMerge is deprecated in favor of patches)
patches:
  - path: tenant-config.yaml
# Namespace for all resources
namespace: tenant-a
# Resource name prefix
namePrefix: tenant-a-
# Additional labels
commonLabels:
  tenant: tenant-a
  environment: production
# Value replacement
replacements:
  - source:
      kind: ConfigMap
      name: tenant-config
      fieldPath: data.database_url
    targets:
      - select:
          kind: Deployment
        fieldPaths:
          - spec.template.spec.containers.[name=app].env.[name=DATABASE_URL].value
---
# applications/tenants/tenant-a/tenant-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tenant-config
data:
  tenant_id: "tenant-a"
  database_url: "postgres://tenant-a-db:5432/app"
  redis_url: "redis://tenant-a-redis:6379"
  storage_bucket: "tenant-a-storage"
  max_users: "1000"
  feature_flags: |
    advanced_analytics: true
    beta_features: false
    custom_branding: true
Secret Management and Security
# External Secrets Operator configuration
# external-secrets/secret-store.yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-secret-store
  namespace: myapp-production
spec:
  provider:
    vault:
      server: "https://vault.company.com"
      path: "kv"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "myapp-production"
          secretRef:
            name: vault-auth-secret
            key: token
---
# external-secrets/external-secret.yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-secrets
  namespace: myapp-production
spec:
  refreshInterval: 15m
  secretStoreRef:
    name: vault-secret-store
    kind: SecretStore
  target:
    name: myapp-secrets
    creationPolicy: Owner
    template:
      type: Opaque
      data:
        database-password: "{{ .database_password | toString }}"
        api-key: "{{ .api_key | toString }}"
        jwt-secret: "{{ .jwt_secret | toString }}"
  data:
    - secretKey: database_password
      remoteRef:
        key: myapp/production
        property: database_password
    - secretKey: api_key
      remoteRef:
        key: myapp/production
        property: api_key
    - secretKey: jwt_secret
      remoteRef:
        key: myapp/production
        property: jwt_secret
---
# When using Sealed Secrets
# sealed-secrets/sealed-secret.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: myapp-secrets
  namespace: myapp-production
spec:
  encryptedData:
    database-password: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEQAx...
    api-key: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEQAx...
  template:
    metadata:
      name: myapp-secrets
      namespace: myapp-production
    type: Opaque
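Both approaches exist to keep plain Secret manifests out of Git. A simple guard in CI or a pre-commit hook can catch one that slips in. The sketch below lists YAML files that declare exactly kind: Secret, while ExternalSecret and SealedSecret are left alone. The function name is hypothetical, and the check is line-based, so it will not understand unusual YAML formatting.

```shell
# List .yaml files under a directory that declare a plain "kind: Secret".
# SealedSecret and ExternalSecret are safe to commit, so only the exact kind is flagged.
find_plain_secrets() {
  local dir="$1"
  find "$dir" -type f -name '*.yaml' \
    -exec grep -lE '^kind:[[:space:]]*Secret[[:space:]]*$' {} + 2>/dev/null | sort
}

# Example gate for a CI step (commented out so the sketch has no side effects):
# if [ -n "$(find_plain_secrets .)" ]; then echo "plain Secret manifests found"; exit 1; fi
```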
Monitoring and Observability
Collecting GitOps Metrics
# monitoring/gitops-metrics.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-metrics
  namespace: argocd
spec:
  selector:
    matchLabels:
      # Application metrics (argocd_app_*) are served by the application controller
      app.kubernetes.io/name: argocd-metrics
  endpoints:
    - port: metrics
---
# PrometheusRule with GitOps alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: gitops-alerts
  namespace: argocd
spec:
  groups:
    - name: gitops.rules
      rules:
        - alert: ArgocdAppUnhealthy
          expr: argocd_app_info{health_status!="Healthy"} > 0
          for: 5m
          labels:
            severity: warning
            team: platform
          annotations:
            summary: "ArgoCD application {{ $labels.name }} is unhealthy"
            description: "ArgoCD application {{ $labels.name }} in namespace {{ $labels.namespace }} has been in unhealthy state for more than 5 minutes."
        - alert: ArgocdAppOutOfSync
          expr: argocd_app_info{sync_status!="Synced"} > 0
          for: 10m
          labels:
            severity: info
            team: platform
          annotations:
            summary: "ArgoCD application {{ $labels.name }} is out of sync"
            description: "ArgoCD application {{ $labels.name }} has been out of sync for more than 10 minutes."
        - alert: FluxKustomizationFailed
          expr: gotk_reconcile_condition{type="Ready",status="False",kind="Kustomization"} > 0
          for: 5m
          labels:
            severity: critical
            team: platform
          annotations:
            summary: "Flux Kustomization {{ $labels.name }} failed"
            description: "Flux Kustomization {{ $labels.name }} in namespace {{ $labels.namespace }} has failed to reconcile."
---
# Grafana dashboard ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: gitops-dashboard
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  gitops-dashboard.json: |
    {
      "dashboard": {
        "title": "GitOps Overview",
        "panels": [
          {
            "title": "Application Health Status",
            "type": "stat",
            "targets": [
              {
                "expr": "count(argocd_app_info{health_status=\"Healthy\"})",
                "legendFormat": "Healthy Apps"
              },
              {
                "expr": "count(argocd_app_info{health_status!=\"Healthy\"})",
                "legendFormat": "Unhealthy Apps"
              }
            ]
          },
          {
            "title": "Sync Status",
            "type": "piechart",
            "targets": [
              {
                "expr": "count by (sync_status) (argocd_app_info)",
                "legendFormat": "{{ sync_status }}"
              }
            ]
          },
          {
            "title": "Deployment Frequency",
            "type": "graph",
            "targets": [
              {
                "expr": "rate(argocd_app_sync_total[1h]) * 3600",
                "legendFormat": "Syncs per hour"
              }
            ]
          }
        ]
      }
    }
Troubleshooting and Debugging
Common Problems and Fixes
#!/bin/bash
# scripts/gitops-troubleshoot.sh
# GitOps troubleshooting helper

check_argocd_health() {
  echo "=== ArgoCD Health Check ==="
  # ArgoCD server status
  kubectl get pods -n argocd | grep argocd-server
  # Application status
  argocd app list
  # Detailed status of a specific app
  local app_name=${1:-myapp-production}
  argocd app get "$app_name" --show-params
  # Event log
  kubectl get events -n argocd --sort-by='.firstTimestamp'
}

check_flux_health() {
  echo "=== Flux Health Check ==="
  # Flux controller status
  flux get all
  # Kustomization status
  flux get kustomizations
  # GitRepository status
  flux get sources git
  # Recent reconciliation logs (no --follow, so the report does not block)
  flux logs --tail=50
}

debug_sync_issues() {
  local app_name=${1:-myapp-production}
  echo "=== Debugging Sync Issues for $app_name ==="
  # Refresh from Git (also verifies repository access)
  argocd app get "$app_name" --refresh
  # Compare manifests
  argocd app diff "$app_name"
  # Force a re-sync
  read -p "Force sync? (y/N): " -n 1 -r
  echo
  if [[ $REPLY =~ ^[Yy]$ ]]; then
    argocd app sync "$app_name" --force
  fi
}

check_rbac_permissions() {
  echo "=== RBAC Permission Check ==="
  # ArgoCD service account permissions
  kubectl auth can-i get applications.argoproj.io -n argocd --as=system:serviceaccount:argocd:argocd-server
  # Workloads are created by the application controller, not the API server
  kubectl auth can-i create deployments --as=system:serviceaccount:argocd:argocd-application-controller
  # Flux service account permissions
  kubectl auth can-i get kustomizations.kustomize.toolkit.fluxcd.io --as=system:serviceaccount:flux-system:kustomize-controller
  kubectl auth can-i patch deployments --as=system:serviceaccount:flux-system:kustomize-controller
}

analyze_git_issues() {
  echo "=== Git Access Analysis ==="
  # Check Git credentials
  kubectl get secrets -n argocd | grep git
  kubectl get secrets -n flux-system | grep git
  # Test the repository connection
  local repo_url=${1:-"https://github.com/company/myapp-config"}
  echo "Testing Git repository access: $repo_url"
  git ls-remote "$repo_url" | head -5
}

performance_analysis() {
  echo "=== GitOps Performance Analysis ==="
  # These cluster-internal addresses resolve only from inside the cluster;
  # from a workstation, use kubectl port-forward first.
  # ArgoCD application controller metrics
  curl -s "http://argocd-metrics.argocd.svc.cluster.local:8082/metrics" | grep argocd_app_sync_total
  # Flux metrics
  curl -s "http://kustomize-controller.flux-system.svc.cluster.local:8080/metrics" | grep gotk_reconcile_duration
  # Resource usage
  kubectl top pods -n argocd
  kubectl top pods -n flux-system
}

network_connectivity_check() {
  echo "=== Network Connectivity Check ==="
  # DNS resolution (from the machine running this script)
  nslookup github.com
  nslookup api.github.com
  # Outbound access test from inside the cluster
  kubectl run test-pod --image=curlimages/curl:latest --rm -i --restart=Never -- curl -sSI https://github.com
}

generate_report() {
  local output_file="gitops-health-report-$(date +%Y%m%d-%H%M%S).txt"
  echo "Generating GitOps health report: $output_file"
  {
    echo "GitOps Health Report - $(date)"
    echo "=================================="
    echo
    check_argocd_health
    echo
    check_flux_health
    echo
    check_rbac_permissions
    echo
    performance_analysis
  } > "$output_file"
  echo "Report saved to: $output_file"
}

# Main
main() {
  case "${1:-}" in
    "argocd")
      check_argocd_health "${2:-}"
      ;;
    "flux")
      check_flux_health
      ;;
    "debug")
      debug_sync_issues "${2:-}"
      ;;
    "rbac")
      check_rbac_permissions
      ;;
    "git")
      analyze_git_issues "${2:-}"
      ;;
    "perf")
      performance_analysis
      ;;
    "network")
      network_connectivity_check
      ;;
    "report")
      generate_report
      ;;
    *)
      echo "Usage: $0 {argocd|flux|debug|rbac|git|perf|network|report} [app-name|repo-url]"
      echo
      echo "Commands:"
      echo "  argocd [app]  - Check ArgoCD health for specific app"
      echo "  flux          - Check Flux health"
      echo "  debug [app]   - Debug sync issues"
      echo "  rbac          - Check RBAC permissions"
      echo "  git [repo]    - Analyze Git connectivity"
      echo "  perf          - Performance analysis"
      echo "  network       - Network connectivity check"
      echo "  report        - Generate comprehensive report"
      exit 1
      ;;
  esac
}

main "$@"
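Conclusion: Building a Safe and Efficient Deployment Culture with GitOps
As of 2026, GitOps has become the de facto standard deployment methodology for Kubernetes environments. As tools such as ArgoCD and Flux have matured, organizations have been able to build deployment environments that are safer, more traceable, and more automated.
Key benefits of adopting GitOps:
- Declarative deployment: every change is managed explicitly through Git
- Automation and consistency: the deployment process is standardized across environments
- Stronger security: the pull-based approach minimizes who and what needs cluster access
- Better collaboration: deployment quality is controlled through code review
Key factors for a successful GitOps rollout:
- Incremental adoption: start with a small application and expand step by step
- Team training: make sure the team understands GitOps concepts and the tooling
- Monitoring: observe deployment status and performance continuously
- Security policy: set up secret management and access control systematically
GitOps goes beyond deployment automation. It is a collaboration culture that development and operations teams build together. In a well-implemented GitOps environment, deployment stops being something to fear and becomes a routine, safe part of everyday work.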