This repository was archived by the owner on Mar 12, 2026. It is now read-only.
Implement AWS lab #3
Open. kalebcastillo wants to merge 4 commits into main from feat/aws-lab.
Commits:
519a468 Implement AWS lab (kalebcastillo)
d74980e Update .github/skills/validate-devops-lab/aws/scripts/run-full-valida… (kalebcastillo)
95518c1 Apply fix for INC 5, add validation scripts to gitignore (kalebcastillo)
851516a Add OIDC to validate script, add readme note to suggest OIDC (kalebcastillo)
.github/skills/validate-devops-lab/azure/scripts/run-full-validation.sh
745 changes: 0 additions & 745 deletions
This file was deleted.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@@ -1,14 +1,167 @@
 # AWS DevOps Lab
 
-🚧 **Coming soon.**
+Fix a broken DevOps pipeline deployed to AWS. Work through 7 incidents to get the application running.
 
-The AWS version of this lab will use equivalent services:
+```
+┌─────────────────────────────────────────────────────────────┐
+│                        AWS Resources                        │
+│                                                             │
+│  ┌──────────┐   ┌──────────┐   ┌────────────────────────┐   │
+│  │   VPC    │   │   ECR    │   │          EKS           │   │
+│  │          │   │ (images) │──▶│   ┌─────┐  ┌───────┐   │   │
+│  │  Subnet  │   │          │   │   │ App │──│ Redis │   │   │
+│  │          │   └──────────┘   │   └─────┘  └───────┘   │   │
+│  └──────────┘                  └────────────────────────┘   │
+│                                                             │
+│  ┌──────────────────┐   ┌───────────────────────────────┐   │
+│  │    CloudWatch    │   │      Container Insights       │   │
+│  │    Log Group     │   │           + Alarms            │   │
+│  └──────────────────┘   └───────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────┘
+```
 
-| Azure | AWS Equivalent |
-|-------|---------------|
-| ACR | ECR |
-| AKS | EKS |
-| Azure Monitor | CloudWatch |
-| VNet | VPC |
+## Prerequisites
 
-Want to help build it? See our [Contributing Guide](../CONTRIBUTING.md).
+- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
+- [Terraform](https://developer.hashicorp.com/terraform/install) (v1.0+)
+- [Docker](https://docs.docker.com/get-docker/)
+- [kubectl](https://kubernetes.io/docs/tasks/tools/)
+
+## Getting Started
+
+1. Clone this repo and navigate to the AWS scripts:
+```bash
+git clone https://github.com/learntocloud/devops-lab
+cd devops-lab/aws/scripts
+```
+
+2. Log in to AWS:
+```bash
+aws configure
+```
+
+3. Run the setup script:
+```bash
+chmod +x *.sh
+./setup.sh
+```
+
+**Cost**: ~$3-5/session. Destroy resources when done.
+
+---
+
+## Incident Queue
+
+You're the new DevOps engineer. Seven incidents are waiting. Diagnose and fix each one.
+
+---
+
+### 🎫 INC-001: Container Image Won't Build
+
+**Priority:** High
+**Reported by:** Development Team
+**Tools:** `docker` CLI
+
+> "We can't build the app's Docker image. The `docker build` command fails immediately with errors. The Dockerfile is at `aws/docker/Dockerfile`. We need the image to build successfully and the container to start and respond on the correct port."
+
+**What to fix:** `aws/docker/Dockerfile`
+
+---
+
+### 🎫 INC-002: Local Dev Environment Broken
+
+**Priority:** High
+**Reported by:** Development Team
+**Tools:** `docker compose` CLI
+
+> "Docker Compose won't bring up our local environment. The app can't connect to Redis, and the port mapping seems wrong. The compose file is at `aws/docker/docker-compose.yml`. We need both services (app + redis) to start and communicate."
+
+**What to fix:** `aws/docker/docker-compose.yml`
+
+---
+
+### 🎫 INC-003: CI Pipeline is Broken
+
+**Priority:** High
+**Reported by:** Engineering Manager
+**Tools:** GitHub Actions YAML reference
+
+> "Our CI workflow has YAML errors and the steps are in the wrong order. Tests run before dependencies are installed, and some action versions look wrong. The workflow is at `aws/github-actions/ci.yml`."
+
+**What to fix:** `aws/github-actions/ci.yml`
+
+---
+
+### 🎫 INC-004: Terraform Can't Provision Infrastructure
+
+**Priority:** Critical
+**Reported by:** Platform Team
+**Tools:** `terraform` CLI, `aws` CLI
+
+> "Terraform plan fails with multiple errors. There are typos in resource types, something is wrong with the IAM role policies, and the cluster networking configuration has conflicts. The config is at `aws/terraform/`. We need the VPC, ECR, EKS cluster, and monitoring log group to all deploy successfully."
+
+**What to fix:** `aws/terraform/main.tf`, `aws/terraform/outputs.tf`
+
+---
+
+### 🎫 INC-005: Deployment Pipeline Failing
+
+**Priority:** High
+**Reported by:** Release Team
+**Tools:** GitHub Actions YAML reference, `aws` CLI
+
+> "The CD pipeline can't deploy to EKS. The AWS credentials action is misconfigured, and the deployment steps aren't right. The workflow is at `aws/github-actions/cd.yml`."
+
+**What to fix:** `aws/github-actions/cd.yml`
+
+**Note:** OIDC is the recommended approach.
+
+---
+
+### 🎫 INC-006: Kubernetes Deployment Crashing
+
+**Priority:** Critical
+**Reported by:** SRE Team
+**Tools:** `kubectl` CLI
+
+> "Pods won't start in EKS. The deployments have wrong API versions, label selectors don't match between deployments and services, container ports are wrong, and the readiness probe is hitting an endpoint that doesn't exist. Manifests are in `aws/kubernetes/`."
+
+**What to fix:** `aws/kubernetes/app-deployment.yaml`, `aws/kubernetes/app-service.yaml`, `aws/kubernetes/redis-deployment.yaml`, `aws/kubernetes/redis-service.yaml`
+
+---
+
+### 🎫 INC-007: Monitoring Not Working
+
+**Priority:** Medium
+**Reported by:** Observability Team
+**Tools:** `aws` CLI
+
+> "The pod restart alarm is disabled and should be enabled. We need Container Insights running on EKS, and our alarm configuration at `aws/monitoring/alerts.json` needs fixing. The alarm for pod restarts should be severity 2 (not 1), and it should evaluate every minute (not every 5 minutes)."
+
+**What to fix:** `aws/monitoring/alerts.json`
+
+---
+
+## Verify Your Fixes
+
+Check incident status anytime:
+
+```bash
+cd aws/scripts
+./validate.sh
+```
+
+Generate your completion token after all incidents are resolved:
+
+```bash
+./validate.sh export
+```
+
+## Clean Up
+
+**Always destroy resources when done to avoid charges:**
+
+```bash
+cd aws/scripts
+./destroy.sh
+```
@@ -0,0 +1,39 @@
+from fastapi import FastAPI
+import redis
+import os
+
+app = FastAPI(title="DevOps Lab App")
+
+REDIS_HOST = os.getenv("REDIS_HOST", "localhost")
+REDIS_PORT = int(os.getenv("REDIS_PORT", "6379"))
+
+
+def get_redis():
+    try:
+        r = redis.Redis(host=REDIS_HOST, port=REDIS_PORT, decode_responses=True)
+        r.ping()
+        return r
+    except redis.ConnectionError:
+        return None
+
+
+@app.get("/health")
+def health():
+    r = get_redis()
+    redis_status = "connected" if r else "disconnected"
+    return {"status": "healthy", "redis": redis_status}
+
+
+@app.get("/api/status")
+def status():
+    r = get_redis()
+    if r:
+        visits = r.incr("visits")
+    else:
+        visits = -1
+    return {
+        "app": "devops-lab",
+        "version": "1.0.0",
+        "visits": visits,
+        "redis": "connected" if r else "disconnected",
+    }
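
The `status()` handler above falls back to `visits = -1` when `get_redis()` returns `None`, so the endpoint still answers without Redis. A minimal sketch of that branch, using only the standard library: `FakeRedis` and `build_status` are hypothetical names introduced for illustration (they are not part of this PR), and `FakeRedis` models only `incr`.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for redis.Redis; only incr() is modeled."""

    def __init__(self):
        self._counters = {}

    def incr(self, key):
        # Like Redis INCR: a missing key starts at 0, then is incremented by 1.
        self._counters[key] = self._counters.get(key, 0) + 1
        return self._counters[key]


def build_status(r):
    """Mirror the branch in status(): count visits if a client exists, else -1."""
    visits = r.incr("visits") if r else -1
    return {
        "app": "devops-lab",
        "version": "1.0.0",
        "visits": visits,
        "redis": "connected" if r else "disconnected",
    }


if __name__ == "__main__":
    print(build_status(None))         # no client: visits is -1
    print(build_status(FakeRedis()))  # stand-in client: first visit is counted
```

Because the handler returns a payload in both cases, checks that only assert on `app` and `version` should hold whether or not Redis is reachable.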
@@ -0,0 +1,3 @@
+fastapi==0.115.0
+uvicorn==0.30.0
+redis==5.0.0
@@ -0,0 +1,19 @@
+from fastapi.testclient import TestClient
+from app import app
+
+client = TestClient(app)
+
+
+def test_health():
+    response = client.get("/health")
+    assert response.status_code == 200
+    data = response.json()
+    assert data["status"] == "healthy"
+
+
+def test_status():
+    response = client.get("/api/status")
+    assert response.status_code == 200
+    data = response.json()
+    assert data["app"] == "devops-lab"
+    assert data["version"] == "1.0.0"
@@ -0,0 +1,14 @@
+# Dockerfile for DevOps Lab App
+FROM python:3.11-slm
+
+WORKDIR /src
+
+COPY app/requirements.txt .
+
+RUN pip install -r requirements.txt
+
+COPY app/ .
+
+EXPOSE 5000
+
+CMD ["python", "app.py"]
@@ -0,0 +1,26 @@
+version: "3.8"
+
+services:
+  app:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    ports:
+      - "8000:5000"
+    environment:
+      - REDIS_HOST=cache
+      - REDIS_PORT=6379
+    depends_on:
+      - cache
+    networks:
+      - backend
+
+  redis:
+    image: redis:alpine
+    ports:
+      - "6379:6379"
+    volumes:
+      - redis_data:/data
+
+volumes:
+  redis_data:
@@ -0,0 +1,55 @@
+name: CD Pipeline
+
+on:
+  workflow_dispatch:
+  push:
+    branches: [main]
+
+env:
+  AWS_REGION: ${{ secrets.AWS_REGION }}
+  ECR_REPO: ${{ secrets.ECR_REPO }}
+  EKS_CLUSTER: ${{ secrets.EKS_CLUSTER_NAME }}
+
+jobs:
+  build-and-push:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Configure AWS credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          credentials: ${{ secrets.AWS_CREDENTIALS }}
+          aws-region: ${{ env.AWS_REGION }}
+
+      - name: Login to ECR
+        run: |
+          aws ecr get-login-password --region ${{ env.AWS_REGION }} | \
+          docker login --username AWS --password-stdin ${{ env.ECR_REPO }}
Comment on lines +8 to +29
+
+      - name: Build and push image
+        run: |
+          docker build -f aws/docker/Dockerfile -t ${{ env.ECR_REPO }}:latest .
+          docker push ${{ env.ECR_REPO }}:latest
+
+  deploy-to-eks:
+    runs-on: ubuntu-latest
+    needs: build-and-push
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Configure AWS credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          credentials: ${{ secrets.AWS_CREDENTIALS }}
+          aws-region: ${{ env.AWS_REGION }}
+
+      - name: Get EKS credentials
+        run: |
+          aws eks update-kubeconfig --region ${{ env.AWS_REGION }} --name ${{ env.EKS_CLUSTER }}
+
+      - name: Deploy to EKS
+        run: |
+          kubectl set image deployment/devops-lab-app app=${{ env.ECR_REPO }}:latest -n devops-lab
@@ -0,0 +1,33 @@
+name: CI Pipeline
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  build-and-test:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v99
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Run tests
+        working-directory: ./aws/app
+        run: |
+          python -m pytest tests/ -v
+
+      - name: Install dependencies
+        working-directory: ./aws/app
+        run: |
+          pip install -r requirements.txt
+
+      - name: Build Docker image
+        run: |
+          docker build -f aws/docker/Dockerfile -t devops-lab-app .
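
INC-003 reports that tests run before dependencies are installed, and the workflow above lists `Run tests` ahead of `Install dependencies`. One rough way to check step order without a YAML parser is to scan for `- name:` lines. The helpers `step_names` and `runs_before` are hypothetical (they are not taken from the lab's `validate.sh`), and the scan assumes every step declares its name on a `- name:` line.

```python
from typing import List


def step_names(workflow_text: str) -> List[str]:
    """Collect step names from lines of the form '- name: <step name>'."""
    prefix = "- name:"
    names = []
    for line in workflow_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            names.append(stripped[len(prefix):].strip())
    return names


def runs_before(workflow_text: str, first: str, second: str) -> bool:
    """Return True if step `first` is listed before step `second`.

    Raises ValueError if either step name is missing.
    """
    names = step_names(workflow_text)
    return names.index(first) < names.index(second)


if __name__ == "__main__":
    # Step names copied from the workflow in this PR; the YAML around them is abbreviated.
    sample = """
    steps:
      - name: Checkout code
      - name: Set up Python
      - name: Run tests
      - name: Install dependencies
      - name: Build Docker image
    """
    # Prints False for the step order shown in this PR.
    print(runs_before(sample, "Install dependencies", "Run tests"))
```

A text scan like this cannot detect malformed YAML; it only answers the ordering question, which keeps it free of third-party dependencies.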
`aws/scripts/validate.sh` uses Python with the `yaml` module to validate GitHub Actions workflows, but the AWS lab prerequisites don't mention Python/PyYAML. Add Python 3 + PyYAML (or adjust validation to avoid requiring PyYAML) so students can actually reach 7/7 incident resolution and export a token.
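
Following the review comment above, a preflight check could report missing tools before `validate.sh` runs. This is a rough sketch, not part of the PR: `missing_prerequisites` is a hypothetical helper, the CLI names come from the README's Prerequisites list, and it relies on PyYAML being importable as `yaml`.

```python
import importlib.util
import shutil

# CLI names taken from the README's Prerequisites list.
REQUIRED_CLIS = ("aws", "terraform", "docker", "kubectl")
# PyYAML is imported as the top-level module "yaml".
REQUIRED_MODULES = ("yaml",)


def missing_prerequisites(clis=REQUIRED_CLIS, modules=REQUIRED_MODULES):
    """Return labels for every CLI not on PATH and every module that cannot be found."""
    missing = [f"cli:{name}" for name in clis if shutil.which(name) is None]
    missing += [
        f"module:{name}"
        for name in modules
        if importlib.util.find_spec(name) is None
    ]
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing prerequisites: " + ", ".join(problems))
    else:
        print("All prerequisites found.")
```

Running a check like this first would let a student see a missing `yaml` module as a setup problem rather than as an unresolved incident.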