Merged
897 changes: 897 additions & 0 deletions docs/index.html

Large diffs are not rendered by default.

205 changes: 205 additions & 0 deletions infrastructure/README.md
@@ -0,0 +1,205 @@
# Lambda Python Layer Builder — Infrastructure

Serverless infrastructure that builds AWS Lambda Python layers on-demand using EC2 Spot instances and Docker, with a GitHub Pages frontend.

## Architecture

```
┌──────────────────────────────────────────────────────────────────────┐
│ GitHub Pages (docs/index.html) │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ requirements.txt │ Python version │ Architecture │ Submit │ │
│ └─────────────────────────┬──────────────────────────────────┘ │
└────────────────────────────┼─────────────────────────────────────────┘
│ POST /builds
┌──────────────────────────────────────────────────────────────────────┐
│ API Gateway (HTTP API) │
│ POST /builds → submit_build Lambda │
│ GET /builds/{id} → check_status Lambda │
└───────────┬──────────────────────────────────────┬───────────────────┘
│ │
▼ ▼
┌───────────────────┐ ┌───────────────────────┐
│ submit_build λ │ │ check_status λ │
│ • Validates input│ │ • Reads DynamoDB │
│ • Creates record │ │ • Generates presigned│
│ • Sends to SQS │ │ S3 download URLs │
└─────────┬─────────┘ └───────────┬───────────┘
│ │
▼ ▼
┌───────────────────┐ ┌───────────────────────┐
│ SQS Build Queue │ │ DynamoDB │
│ (with DLQ) │ │ buildId | status │
└─────────┬─────────┘ │ s3_keys | TTL │
│ └───────────────────────┘
▼ ▲
┌───────────────────┐ │
│ process_build λ │ │
│ • Launches EC2 │ │
│ Spot instance │ │
└─────────┬─────────┘ │
│ │
▼ │
┌──────────────────────────────────────────────────┼───────────────────┐
│ EC2 Spot Instance │ │
│ ┌─────────────────────────────────┐ │ │
│ │ 1. Install Docker │ │ │
│ │ 2. Pull/build Docker image │ │ │
│ │ 3. Run container to build │ │ │
│ │ Lambda layer zip files │ │ │
│ │ 4. Upload zips to S3 ─────────┼──┐ │ │
│ │ 5. Update DynamoDB status ─────┼──┼──────────┘ │
│ │ 6. Self-terminate │ │ │
│ └─────────────────────────────────┘ │ │
└───────────────────────────────────────┼──────────────────────────────┘
┌───────────────────┐
│ S3 Artifacts │
│ builds/{id}/*.zip │
│ Lifecycle: 24h │
└───────────────────┘
```

## Flow

1. **User** opens the GitHub Pages site, pastes their `requirements.txt` contents, and selects a Python version and architecture
2. **API Gateway** routes `POST /builds` to `submit_build` Lambda
3. **submit_build** validates input, creates DynamoDB record (QUEUED), sends SQS message
4. **SQS** triggers `process_build` Lambda
5. **process_build** launches an EC2 Spot instance with a user-data script
6. **EC2 instance** installs Docker, pulls pre-built images from GHCR (or builds from Dockerfile), runs the build, uploads zips to S3, updates DynamoDB (COMPLETED), self-terminates
7. **Frontend** polls `GET /builds/{id}`, which returns the status plus presigned S3 download URLs
8. **Artifacts** auto-expire from S3 after configurable TTL (default 24h)
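
From the client's side, steps 1–3 reduce to POSTing a small JSON body. A minimal sketch of assembling that payload, assuming the field names from the API reference in this README (the `build_payload` helper is ours, not part of the repo):

```python
import json

def build_payload(requirements, python_version="3.13",
                  architectures=("x86_64",), single_file=True):
    """Assemble the JSON body expected by POST /builds."""
    return {
        "requirements": requirements,
        "python_version": python_version,
        "architectures": list(architectures),
        "single_file": single_file,
    }

payload = build_payload("numpy==1.26.4\nrequests==2.32.4",
                        architectures=("x86_64", "arm64"))
body = json.dumps(payload)
# POST `body` to {api_url}/builds; the response carries build_id, status, expires_at.
```

The frontend in `docs/index.html` does the equivalent in the browser; the sketch is only meant to make the wire format explicit.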

## Cost Estimate

| Component | Cost | Notes |
|-----------|------|-------|
| EC2 Spot (c5.xlarge) | ~$0.04/hr | ~$0.01 per build (15 min avg) |
| S3 | ~$0.023/GB/month | Artifacts auto-expire |
| Lambda | ~$0.20/1M requests | Minimal usage |
| API Gateway | $1.00/1M requests | HTTP API pricing |
| DynamoDB | Pay-per-request | ~$0.00 for low volume |
| SQS | $0.40/1M messages | Negligible |
| **Total (idle)** | **~$0/month** | No running infrastructure |
| **Per build** | **~$0.01-0.03** | Spot instance + S3 |
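
The per-build figure is simple arithmetic on the Spot rate; a quick sanity check, assuming the ~$0.04/hr rate and 15-minute average build from the table above:

```python
spot_rate_per_hour = 0.04   # approximate c5.xlarge Spot price
avg_build_minutes = 15

# Per-build compute cost: hourly rate x fraction of an hour used
cost_per_build = spot_rate_per_hour * avg_build_minutes / 60
print(f"~${cost_per_build:.2f} per build")  # prints "~$0.01 per build"
```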

## Prerequisites

- AWS account with permissions to create VPC, EC2, Lambda, S3, SQS, DynamoDB, API Gateway, IAM
- [Terraform](https://www.terraform.io/downloads) >= 1.5.0
- AWS CLI configured (`aws configure`)

## Deployment

```bash
cd infrastructure/terraform

# Copy and customize configuration
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your preferences

# Initialize and deploy
terraform init
terraform plan
terraform apply
```

After deployment, note the `api_url` output:

```
Outputs:
api_url = "https://xxxxxxxxxx.execute-api.eu-central-1.amazonaws.com"
```

### Configure GitHub Pages

1. In your GitHub repository: **Settings → Pages → Source: Deploy from a branch**
2. Select **Branch: main**, **Folder: /docs**
3. Open your GitHub Pages URL
4. Click **⚙ API Settings** and paste the `api_url` from Terraform output
5. Start building layers!

## Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `aws_region` | `eu-central-1` | AWS region |
| `environment` | `prod` | Environment name |
| `artifact_ttl_hours` | `24` | Hours to keep artifacts in S3 |
| `ec2_instance_type` | `c5.xlarge` | Spot instance type |
| `ec2_volume_size` | `50` | EBS volume size (GB) |
| `ec2_max_build_time_minutes` | `30` | Safety timeout per build |
| `allowed_origins` | `["*"]` | CORS origins |
| `docker_image_prefix` | `ghcr.io/fok666/lambda-python-layer` | Pre-built image registry |
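
For reference, a `terraform.tfvars` sketch overriding a few of these defaults. Values here are illustrative, not recommendations; variable names follow the table above:

```hcl
aws_region         = "eu-central-1"
environment        = "prod"
artifact_ttl_hours = 24
ec2_instance_type  = "c5.xlarge"

# Restrict CORS to your GitHub Pages origin instead of the default ["*"]
allowed_origins = ["https://<your-username>.github.io"]
```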

## API Reference

### POST /builds

Submit a new build request.

```json
{
"requirements": "numpy==1.26.4\nrequests==2.32.4",
"python_version": "3.13",
"architectures": ["x86_64", "arm64"],
"single_file": true
}
```

**Response:**
```json
{
"build_id": "a1b2c3d4-...",
"status": "QUEUED",
"expires_at": 1709398800
}
```

### GET /builds/{buildId}

Check build status. Returns presigned download URLs when completed.

**Response (completed):**
```json
{
"build_id": "a1b2c3d4-...",
"status": "COMPLETED",
"python_version": "3.13",
"architectures": ["x86_64", "arm64"],
"files": [
{
"filename": "combined-python3.13-x86_64.zip",
"download_url": "https://s3.amazonaws.com/...",
"architecture": "x86_64"
},
{
"filename": "combined-python3.13-aarch64.zip",
"download_url": "https://s3.amazonaws.com/...",
"architecture": "arm64"
}
]
}
```
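
Client-side, polling amounts to re-fetching this endpoint until the status leaves `QUEUED`/`RUNNING`, then collecting the URLs. A minimal parsing sketch over the documented response shape (the `extract_downloads` function is ours, not part of the repo):

```python
def extract_downloads(status_response):
    """Return (status, [(filename, url), ...]) from a GET /builds/{id} body."""
    status = status_response["status"]
    files = [
        (f["filename"], f["download_url"])
        for f in status_response.get("files", [])
        if "download_url" in f   # failed entries carry "error" instead of a URL
    ]
    return status, files

# Applied to the completed-build example above:
example = {
    "status": "COMPLETED",
    "files": [
        {"filename": "combined-python3.13-x86_64.zip",
         "download_url": "https://s3.amazonaws.com/..."},
    ],
}
status, downloads = extract_downloads(example)
# status == "COMPLETED"; downloads holds one (filename, url) pair
```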

## Security

- **S3 bucket**: Private, no public access. Downloads via presigned URLs only
- **EC2 instances**: No SSH, no inbound ports. Egress-only security group
- **IMDSv2**: Enforced on all EC2 instances
- **EBS encryption**: Enabled by default
- **IAM**: Least-privilege policies per component
- **DynamoDB TTL**: Automatic cleanup of old records
- **S3 lifecycle**: Automatic deletion of old artifacts

## Teardown

```bash
cd infrastructure/terraform

# If build artifacts remain, empty the bucket first (substitute your bucket name)
aws s3 rm s3://<artifacts-bucket> --recursive

terraform destroy
```

> **Note:** The S3 bucket must be empty before destruction; `terraform destroy` fails if artifacts still exist. Either wait for lifecycle expiration or empty the bucket manually.
147 changes: 147 additions & 0 deletions infrastructure/lambdas/check_status/index.py
@@ -0,0 +1,147 @@
"""
Check Status Lambda
Returns the build status and generates presigned download URLs
for completed builds.

API: GET /builds/{buildId}
Response: {
"build_id": "uuid",
"status": "COMPLETED",
"python_version": "3.13",
"architectures": ["x86_64", "arm64"],
"created_at": 1709312400,
"expires_at": 1709398800,
"files": [
{
"filename": "combined-python3.13-x86_64.zip",
"download_url": "https://...",
"architecture": "x86_64"
}
]
}
"""

import json
import os
import re
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.resource("dynamodb")
s3_client = boto3.client("s3")

TABLE_NAME = os.environ["DYNAMODB_TABLE"]
S3_BUCKET = os.environ["S3_BUCKET"]
ARTIFACT_TTL_HOURS = int(os.environ.get("ARTIFACT_TTL_HOURS", "24"))

# Presigned URL expiry matches artifact TTL (capped at 7 days for S3 limit)
PRESIGN_EXPIRY = min(ARTIFACT_TTL_HOURS * 3600, 604800)


def handler(event, context):
"""Handle GET /builds/{buildId} requests."""
# Extract buildId from path parameters
build_id = (event.get("pathParameters") or {}).get("buildId")

if not build_id:
return _response(400, {"error": "buildId is required"})

# Validate UUID format
uuid_pattern = re.compile(
r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)
> **Review comment** (lines +50 to +52) — *medium*
>
> The regular expression for UUID validation is compiled inside the handler function, so it is recompiled on every invocation, which is inefficient. For better performance, define the compiled pattern at module level so it is compiled only once when the Lambda execution environment is initialized:
>
>     import re
>
>     uuid_pattern = re.compile(
>         r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
>     )

if not uuid_pattern.match(build_id):
return _response(400, {"error": "Invalid buildId format"})

# Fetch build record
table = dynamodb.Table(TABLE_NAME)
try:
result = table.get_item(Key={"buildId": build_id})
except ClientError as e:
print(f"DynamoDB error: {e}")
return _response(500, {"error": "Failed to retrieve build status"})

item = result.get("Item")
if not item:
return _response(404, {"error": "Build not found"})

# Build base response
response_body = {
"build_id": item["buildId"],
"status": item["status"],
"python_version": item.get("python_version", "unknown"),
"architectures": item.get("architectures", []),
"single_file": item.get("single_file", True),
"created_at": int(item.get("created_at", 0)),
"expires_at": int(item.get("expires_at", 0)),
}

# Add error message if failed
if item.get("error_message"):
response_body["error_message"] = item["error_message"]

# Add completed timestamp
if item.get("completed_at"):
response_body["completed_at"] = int(item["completed_at"])

# Generate presigned download URLs for completed builds
if item["status"] == "COMPLETED" and item.get("s3_keys"):
s3_keys = item["s3_keys"].split(",")

> **Review comment** — *medium*
>
> Storing `s3_keys` as a comma-separated string in DynamoDB is fragile: a filename containing a comma would break the parsing here. A more robust approach is a data type that natively supports lists, such as a DynamoDB String Set (SS) or a JSON-encoded list string, e.g.:
>
>     s3_keys = json.loads(item.get("s3_keys", "[]"))
>
> This would require updating the `process_build` user-data script to store the keys in the new format.

files = []

for s3_key in s3_keys:
s3_key = s3_key.strip()
if not s3_key:
continue

filename = s3_key.split("/")[-1]
architecture = _detect_architecture(filename)

try:
download_url = s3_client.generate_presigned_url(
"get_object",
Params={"Bucket": S3_BUCKET, "Key": s3_key},
ExpiresIn=PRESIGN_EXPIRY,
)
files.append({
"filename": filename,
"download_url": download_url,
"architecture": architecture,
"s3_key": s3_key,
})
except ClientError as e:
print(f"Failed to generate presigned URL for {s3_key}: {e}")
files.append({
"filename": filename,
"architecture": architecture,
"error": "Failed to generate download URL",
})

response_body["files"] = files
response_body["file_count"] = len(files)

return _response(200, response_body)


def _detect_architecture(filename):
"""Detect architecture from filename."""
filename_lower = filename.lower()
if "x86_64" in filename_lower or "amd64" in filename_lower:
return "x86_64"
elif "aarch64" in filename_lower or "arm64" in filename_lower:
return "arm64"
return "unknown"


def _response(status_code, body):
"""Create API Gateway response with CORS headers."""
return {
"statusCode": status_code,
"headers": {
"Content-Type": "application/json",
"Access-Control-Allow-Origin": "*",

> **Review comment** — *medium*
>
> The `Access-Control-Allow-Origin` header is hardcoded to `*`. While the API Gateway configuration allows specific origins via the `allowed_origins` variable, this hardcoded value is always sent in the Lambda response. It is overly permissive and prevents the use of credentials with requests; the allowed origin should be configurable and validated against the request's `Origin` header, e.g.:
>
>     "Access-Control-Allow-Origin": os.environ.get("ALLOWED_ORIGIN", "*"),

"Access-Control-Allow-Headers": "Content-Type",
"Access-Control-Allow-Methods": "POST,GET,OPTIONS",
},
"body": json.dumps(body),
}