
feat: Implement Lambda functions for building Python layers #16

Merged
fok666 merged 1 commit into main from feature/selfservice-builds on Mar 1, 2026

Conversation


@fok666 fok666 commented Mar 1, 2026


- Added `submit_build` Lambda function to handle build requests, validate input, and store build information in DynamoDB.
- Created `process_build` Lambda function to process SQS messages and launch EC2 instances for building layers.
- Introduced `check_status` Lambda function to retrieve build status and generate presigned URLs for artifacts.

chore: Set up Terraform infrastructure for the build system

- Configured API Gateway with routes for submitting builds and checking build status.
- Created DynamoDB table for tracking build states with TTL for automatic cleanup.
- Established SQS queue for managing build requests and a dead letter queue for failed messages.
- Defined EC2 launch template for build workers using Amazon Linux 2023.
- Set up IAM roles and policies for Lambda functions and EC2 instances.
- Configured S3 bucket for storing build artifacts with lifecycle policies for expiration.
- Added necessary Terraform variables and outputs for deployment configuration.
- Included .gitignore for Terraform files and build artifacts.
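
The submit/status flow described above can be sketched as a client-side request body. This is a hedged sketch only: the field names (`requirements`, `python_version`, `architectures`, `single_file`) mirror the response fields quoted later in the review, but the exact API contract is an assumption.

```python
import json

def build_submit_payload(requirements, python_version="3.12",
                         architectures=("x86_64",), single_file=False):
    """Build the JSON body for the submit-build endpoint.

    Sketch only: field names are assumptions based on the review
    comments, not a confirmed API contract.
    """
    if not requirements.strip():
        raise ValueError("requirements must not be empty")
    return json.dumps({
        "requirements": requirements,
        "python_version": python_version,
        "architectures": list(architectures),
        "single_file": single_file,
    })
```

The `submit_build` Lambda would then validate this body, write the record to DynamoDB, and enqueue it to SQS.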
@fok666 fok666 self-assigned this Mar 1, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes a robust, serverless system designed to automate the creation of AWS Lambda Python layers. It orchestrates various AWS services through Terraform, enabling users to submit Python dependency lists via a web interface. The system then dynamically provisions EC2 Spot instances to build the layers using Docker, stores the resulting artifacts in S3, and allows users to track the build status and download the compiled layers.

Highlights

  • Lambda Functions Implemented: Three new Lambda functions were introduced: submit_build for handling build requests and storing them in DynamoDB, process_build for consuming SQS messages and launching EC2 instances for the actual layer building, and check_status for retrieving build progress and generating presigned S3 URLs for artifacts.
  • Terraform Infrastructure Setup: Comprehensive Terraform configurations were added to provision the entire build system. This includes API Gateway routes, a DynamoDB table for build states with TTL, an SQS queue with a dead-letter queue, an EC2 launch template for build workers, IAM roles and policies, and an S3 bucket for artifacts with lifecycle management.
  • Frontend Interface Added: A new docs/index.html file provides a user-friendly web interface for submitting build requests and monitoring their status, integrating with the deployed API Gateway.
  • EC2 Spot Instance Utilization: The build process leverages cost-effective EC2 Spot instances, which are launched on demand and self-terminate after completing the build or timing out, ensuring efficient resource usage.
  • Detailed Infrastructure Documentation: A new infrastructure/README.md provides a thorough overview of the system's architecture, operational flow, cost estimates, deployment instructions, API reference, and security considerations.
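
The Spot-instance launch described in the highlights can be sketched as the keyword arguments `process_build` might pass to `ec2.run_instances`. This is a sketch under assumptions, not the PR's actual handler code; note that boto3 base64-encodes `UserData` itself, so plain text is passed here.

```python
def spot_run_instances_params(launch_template_id, user_data):
    """Build kwargs for ec2.run_instances requesting a one-shot Spot
    build worker that terminates on interruption.

    Sketch only: the real process_build handler may differ.
    """
    return {
        "MinCount": 1,
        "MaxCount": 1,
        "LaunchTemplate": {"LaunchTemplateId": launch_template_id,
                           "Version": "$Latest"},
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {
                "SpotInstanceType": "one-time",
                "InstanceInterruptionBehavior": "terminate",
            },
        },
        # boto3 base64-encodes this for run_instances; pass plain text.
        "UserData": user_data,
    }
```

A self-terminating Spot worker with `one-time` + `terminate` keeps costs bounded even if the build script fails before its own cleanup runs.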
Changelog
  • docs/index.html
    • Added a new HTML file for the frontend interface of the Lambda Python Layer Builder.
  • infrastructure/README.md
    • Added a new README file detailing the infrastructure's architecture, flow, cost estimates, deployment, configuration, API reference, security, and teardown.
  • infrastructure/lambdas/check_status/index.py
    • Added a new Python Lambda function to retrieve build status from DynamoDB and generate presigned S3 URLs for completed artifacts.
  • infrastructure/lambdas/process_build/index.py
    • Added a new Python Lambda function to process SQS messages, launch EC2 Spot instances, and generate user data scripts for Docker-based layer builds.
  • infrastructure/lambdas/submit_build/index.py
    • Added a new Python Lambda function to validate build requests, store build metadata in DynamoDB, and enqueue messages to SQS.
  • infrastructure/terraform/.gitignore
    • Added a new .gitignore file to exclude Terraform state, build artifacts, and sensitive variable files.
  • infrastructure/terraform/api_gateway.tf
    • Added Terraform configuration for API Gateway, including HTTP API, stages, routes for build submission and status checks, and integrations with Lambda functions.
  • infrastructure/terraform/dynamodb.tf
    • Added Terraform configuration for a DynamoDB table to store build metadata with a Time-To-Live (TTL) attribute for automatic cleanup.
  • infrastructure/terraform/ec2.tf
    • Added Terraform configuration for an EC2 launch template, specifying the Amazon Linux 2023 AMI, instance type, IAM profile, security groups, and EBS volume settings for build workers.
  • infrastructure/terraform/iam.tf
    • Added Terraform configuration for IAM roles and policies, ensuring least privilege access for Lambda functions and EC2 instances to interact with other AWS services.
  • infrastructure/terraform/lambda.tf
    • Added Terraform configuration to package and deploy the submit_build, process_build, and check_status Lambda functions, including their environment variables and the SQS trigger for process_build.
  • infrastructure/terraform/main.tf
    • Added main Terraform configuration, defining AWS provider, required versions, local variables for naming, and default tags for resources.
  • infrastructure/terraform/outputs.tf
    • Added Terraform outputs for key infrastructure endpoints and names, such as the API Gateway URL, S3 bucket name, DynamoDB table name, SQS queue URL, and Lambda function names.
  • infrastructure/terraform/s3.tf
    • Added Terraform configuration for an S3 bucket to store build artifacts, including versioning, lifecycle rules for expiration, server-side encryption, public access blocking, and CORS settings.
  • infrastructure/terraform/sqs.tf
    • Added Terraform configuration for an SQS queue to manage build requests and a corresponding dead-letter queue for failed messages.
  • infrastructure/terraform/terraform.tfvars.example
    • Added an example Terraform variables file with default configurations for easy customization.
  • infrastructure/terraform/variables.tf
    • Added Terraform variable definitions with detailed descriptions, types, defaults, and validation rules for all configurable infrastructure parameters.
Activity
  • The author, fok666, initiated the pull request to implement the core Lambda functions and set up the foundational Terraform infrastructure for the Python layer building system.
  • The pull request includes the initial creation of all specified files, indicating a comprehensive new feature development effort.

@fok666 fok666 merged commit bfe2536 into main Mar 1, 2026
14 checks passed
@fok666 fok666 deleted the feature/selfservice-builds branch March 1, 2026 13:19

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a comprehensive serverless system for building AWS Lambda Python layers, including the backend infrastructure (Lambda, SQS, DynamoDB, EC2, S3) defined in Terraform, and a web frontend. The architecture is well-designed and cost-conscious. My review focuses on improving security, robustness, and performance. I've identified a critical security vulnerability related to command injection in the user-data script generation, a high-severity issue with the EC2 architecture selection that impacts performance, and several medium-severity issues concerning API consistency, configuration, and code robustness in both the backend and frontend.

Comment on lines +152 to +357
def _generate_user_data(build_id, python_version, architectures, requirements, single_file):
    """Generate the EC2 user-data bash script for the build."""
    req_escaped = requirements.replace("\\", "\\\\").replace("'", "'\\''")
    arches_str = " ".join(architectures)
    single_file_str = "true" if single_file else "false"

    return f"""#!/bin/bash
set -euo pipefail
exec > >(tee /var/log/build.log) 2>&1

echo "$(date): === Lambda Layer Builder ==="
echo "Build ID: {build_id}"
echo "Python: {python_version}"
echo "Architectures: {arches_str}"
echo "Single file: {single_file_str}"

# --- Instance metadata (IMDSv2) ---
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
REGION=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/placement/region)
INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id)

export AWS_DEFAULT_REGION="$REGION"

# --- Safety: auto-terminate after {MAX_BUILD_MINUTES} minutes ---
(sleep {MAX_BUILD_MINUTES * 60} && \
  echo "$(date): TIMEOUT - self-terminating" && \
  aws dynamodb update-item \
    --table-name "{TABLE_NAME}" \
    --key '{{"buildId": {{"S": "{build_id}"}}}}' \
    --update-expression "SET #s = :s, error_message = :e" \
    --expression-attribute-names '{{"#s": "status"}}' \
    --expression-attribute-values '{{":s": {{"S": "FAILED"}}, ":e": {{"S": "Build timed out after {MAX_BUILD_MINUTES} minutes"}}}}' && \
  aws ec2 terminate-instances --instance-ids "$INSTANCE_ID") &
WATCHDOG_PID=$!

# --- Helper functions ---
update_status() {{
    local status=$1
    local extra="${{2:-}}"
    aws dynamodb update-item \
        --table-name "{TABLE_NAME}" \
        --key '{{"buildId": {{"S": "{build_id}"}}}}' \
        --update-expression "SET #s = :s${{extra}}" \
        --expression-attribute-names '{{"#s": "status"}}' \
        --expression-attribute-values "$(echo '{{":s": {{"S": "'"$status"'"}}}}' )" \
        2>/dev/null || true
}}

cleanup() {{
    echo "$(date): Cleanup initiated"
    kill $WATCHDOG_PID 2>/dev/null || true
    echo "$(date): Self-terminating instance $INSTANCE_ID"
    aws ec2 terminate-instances --instance-ids "$INSTANCE_ID" 2>/dev/null || true
}}
trap cleanup EXIT

# --- Install Docker ---
echo "$(date): Installing Docker..."
dnf install -y docker git aws-cli 2>/dev/null || yum install -y docker git aws-cli
systemctl start docker
systemctl enable docker

# Enable QEMU for cross-architecture builds
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes 2>/dev/null || true

# --- Create requirements file ---
mkdir -p /build/input /build/output
cat > /build/input/requirements.txt << 'REQUIREMENTS_EOF'
{requirements}
REQUIREMENTS_EOF

echo "$(date): Requirements:"
cat /build/input/requirements.txt

# --- Configuration ---
DOCKER_IMAGE_PREFIX="{DOCKER_IMAGE_PREFIX}"
S3_BUCKET="{S3_BUCKET}"
BUILD_ID="{build_id}"
PYTHON_VERSION="{python_version}"
SINGLE_FILE="{single_file_str}"

# --- Build function ---
build_arch() {{
    local arch=$1
    local platform=""
    local arch_label=""

    if [ "$arch" = "x86_64" ]; then
        platform="linux/amd64"
        arch_label="amd64"
    else
        platform="linux/arm64"
        arch_label="arm64"
    fi

    echo ""
    echo "$(date): ========================================="
    echo "$(date): Building for $arch ($platform)"
    echo "$(date): ========================================="

    local image_tag="${{DOCKER_IMAGE_PREFIX}}:python${{PYTHON_VERSION}}-${{arch_label}}-latest"

    # Try pre-built image first, fall back to local build
    if docker pull --platform "$platform" "$image_tag" 2>/dev/null; then
        echo "$(date): Using pre-built image: $image_tag"
    else
        echo "$(date): Pre-built image unavailable, building locally..."

        if [ ! -d /build/repo ]; then
            git clone {GITHUB_REPO_URL} /build/repo
        fi

        # Select correct Dockerfile based on Python version
        local dockerfile="/build/repo/Dockerfile.al2023"
        if [[ "$PYTHON_VERSION" == "3.10" || "$PYTHON_VERSION" == "3.11" ]]; then
            dockerfile="/build/repo/Dockerfile.al2"
        fi

        docker buildx create --use --name builder 2>/dev/null || true
        docker buildx build \
            --platform "$platform" \
            --build-arg PYTHON_VERSION=$PYTHON_VERSION \
            -t "$image_tag" \
            --load \
            -f "$dockerfile" \
            /build/repo/
    fi

    # Run the build container
    if [ "$SINGLE_FILE" = "true" ]; then
        docker run --rm \
            --platform "$platform" \
            -e SINGLE_FILE=true \
            -v /build/input/requirements.txt:/input/requirements.txt \
            -v /build/output:/package \
            "$image_tag"
    else
        docker run --rm \
            --platform "$platform" \
            -v /build/input/requirements.txt:/input/requirements.txt \
            -v /build/output:/package \
            "$image_tag"
    fi

    echo "$(date): Build complete for $arch"
}}

# --- Execute builds ---
for arch in {arches_str}; do
    build_arch "$arch"
done

# --- Upload artifacts to S3 ---
echo ""
echo "$(date): Uploading artifacts to S3..."
S3_KEYS=""
FILE_COUNT=0

for zip_file in /build/output/*.zip; do
    if [ -f "$zip_file" ]; then
        filename=$(basename "$zip_file")
        s3_key="builds/$BUILD_ID/$filename"
        aws s3 cp "$zip_file" "s3://$S3_BUCKET/$s3_key"
        echo "$(date): Uploaded: s3://$S3_BUCKET/$s3_key ($(du -h "$zip_file" | cut -f1))"

        if [ -n "$S3_KEYS" ]; then
            S3_KEYS="$S3_KEYS,$s3_key"
        else
            S3_KEYS="$s3_key"
        fi
        FILE_COUNT=$((FILE_COUNT + 1))
    fi
done

if [ "$FILE_COUNT" -eq 0 ]; then
    echo "$(date): ERROR - No zip files produced!"
    aws dynamodb update-item \
        --table-name "{TABLE_NAME}" \
        --key '{{"buildId": {{"S": "{build_id}"}}}}' \
        --update-expression "SET #s = :s, error_message = :e" \
        --expression-attribute-names '{{"#s": "status"}}' \
        --expression-attribute-values '{{":s": {{"S": "FAILED"}}, ":e": {{"S": "Build produced no output files"}}}}'
    exit 1
fi

# --- Update DynamoDB with completion ---
COMPLETED_AT=$(date +%s)
aws dynamodb update-item \
    --table-name "{TABLE_NAME}" \
    --key '{{"buildId": {{"S": "{build_id}"}}}}' \
    --update-expression "SET #s = :s, s3_keys = :k, completed_at = :t, file_count = :fc" \
    --expression-attribute-names '{{"#s": "status"}}' \
    --expression-attribute-values '{{":s": {{"S": "COMPLETED"}}, ":k": {{"S": "'"$S3_KEYS"'"}}, ":t": {{"N": "'"$COMPLETED_AT"'"}}, ":fc": {{"N": "'"$FILE_COUNT"'"}}}}'

echo ""
echo "$(date): ========================================="
echo "$(date): Build completed successfully!"
echo "$(date): Files: $FILE_COUNT"
echo "$(date): S3 Keys: $S3_KEYS"
echo "$(date): ========================================="

# Instance will self-terminate via the EXIT trap
"""


critical

The current method of embedding the requirements content directly into the user-data script using a cat heredoc is vulnerable to command injection. If the requirements content includes a line that is exactly REQUIREMENTS_EOF, it will prematurely terminate the heredoc, allowing subsequent lines to be executed as shell commands on the EC2 instance.

To mitigate this critical security risk, you should Base64-encode the requirements content within the Lambda function and decode it within the user-data script. This ensures the user-provided content is treated as data and not executable code.

def _generate_user_data(build_id, python_version, architectures, requirements, single_file):
    """Generate the EC2 user-data bash script for the build."""
    arches_str = " ".join(architectures)
    single_file_str = "true" if single_file else "false"
    requirements_b64 = base64.b64encode(requirements.encode('utf-8')).decode('utf-8')

    return f"""#!/bin/bash
set -euo pipefail
exec > >(tee /var/log/build.log) 2>&1

echo "$(date): === Lambda Layer Builder ==="
echo "Build ID: {build_id}"
echo "Python: {python_version}"
echo "Architectures: {arches_str}"
echo "Single file: {single_file_str}"

# --- Instance metadata (IMDSv2) ---
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
REGION=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/placement/region)
INSTANCE_ID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id)

export AWS_DEFAULT_REGION="$REGION"

# --- Safety: auto-terminate after {MAX_BUILD_MINUTES} minutes ---
(sleep {MAX_BUILD_MINUTES * 60} && \
  echo "$(date): TIMEOUT - self-terminating" && \
  aws dynamodb update-item \
    --table-name "{TABLE_NAME}" \
    --key '{{\"buildId\": {{\"S\": \"{build_id}\"}}}}' \
    --update-expression "SET #s = :s, error_message = :e" \
    --expression-attribute-names '{{\"#s\": \"status\"}}' \
    --expression-attribute-values '{{\":s\": {{\"S\": \"FAILED\"}}, \":e\": {{\"S\": \"Build timed out after {MAX_BUILD_MINUTES} minutes\"}}}}' && \
  aws ec2 terminate-instances --instance-ids "$INSTANCE_ID") &
WATCHDOG_PID=$!

# --- Helper functions ---
update_status() {{
    local status=$1
    local extra="${{2:-}}"
    aws dynamodb update-item \
        --table-name "{TABLE_NAME}" \
        --key '{{\"buildId\": {{\"S\": \"{build_id}\"}}}}' \
        --update-expression "SET #s = :s${{extra}}" \
        --expression-attribute-names '{{\"#s\": \"status\"}}' \
        --expression-attribute-values "$(echo '{{\":s\": {{\"S\": \"'"$status"'\"}}}}' )" \
        2>/dev/null || true
}}

cleanup() {{
    echo "$(date): Cleanup initiated"
    kill $WATCHDOG_PID 2>/dev/null || true
    echo "$(date): Self-terminating instance $INSTANCE_ID"
    aws ec2 terminate-instances --instance-ids "$INSTANCE_ID" 2>/dev/null || true
}}
trap cleanup EXIT

# --- Install Docker ---
echo "$(date): Installing Docker..."
dnf install -y docker git aws-cli 2>/dev/null || yum install -y docker git aws-cli
systemctl start docker
systemctl enable docker

# Enable QEMU for cross-architecture builds
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes 2>/dev/null || true

# --- Create requirements file ---
mkdir -p /build/input /build/output
echo "{requirements_b64}" | base64 --decode > /build/input/requirements.txt

echo "$(date): Requirements:"
cat /build/input/requirements.txt

# --- Configuration ---
DOCKER_IMAGE_PREFIX="{DOCKER_IMAGE_PREFIX}"
S3_BUCKET="{S3_BUCKET}"
BUILD_ID="{build_id}"
PYTHON_VERSION="{python_version}"
SINGLE_FILE="{single_file_str}"

# --- Build function ---
build_arch() {{
    local arch=$1
    local platform=""
    local arch_label=""

    if [ "$arch" = "x86_64" ]; then
        platform="linux/amd64"
        arch_label="amd64"
    else
        platform="linux/arm64"
        arch_label="arm64"
    fi

    echo ""
    echo "$(date): ========================================"
    echo "$(date): Building for $arch ($platform)"
    echo "$(date): ========================================"

    local image_tag="${{DOCKER_IMAGE_PREFIX}}:python${{PYTHON_VERSION}}-${{arch_label}}-latest"

    # Try pre-built image first, fall back to local build
    if docker pull --platform "$platform" "$image_tag" 2>/dev/null; then
        echo "$(date): Using pre-built image: $image_tag"
    else
        echo "$(date): Pre-built image unavailable, building locally..."

        if [ ! -d /build/repo ]; then
            git clone {GITHUB_REPO_URL} /build/repo
        fi

        # Select correct Dockerfile based on Python version
        local dockerfile="/build/repo/Dockerfile.al2023"
        if [[ "$PYTHON_VERSION" == "3.10" || "$PYTHON_VERSION" == "3.11" ]]; then
            dockerfile="/build/repo/Dockerfile.al2"
        fi

        docker buildx create --use --name builder 2>/dev/null || true
        docker buildx build \
            --platform "$platform" \
            --build-arg PYTHON_VERSION=$PYTHON_VERSION \
            -t "$image_tag" \
            --load \
            -f "$dockerfile" \
            /build/repo/
    fi

    # Run the build container
    if [ "$SINGLE_FILE" = "true" ]; then
        docker run --rm \
            --platform "$platform" \
            -e SINGLE_FILE=true \
            -v /build/input/requirements.txt:/input/requirements.txt \
            -v /build/output:/package \
            "$image_tag"
    else
        docker run --rm \
            --platform "$platform" \
            -v /build/input/requirements.txt:/input/requirements.txt \
            -v /build/output:/package \
            "$image_tag"
    fi

    echo "$(date): Build complete for $arch"
}}

# --- Execute builds ---
for arch in {arches_str}; do
    build_arch "$arch"
done

# --- Upload artifacts to S3 ---
echo ""
echo "$(date): Uploading artifacts to S3..."
S3_KEYS=""
FILE_COUNT=0

for zip_file in /build/output/*.zip; do
    if [ -f "$zip_file" ]; then
        filename=$(basename "$zip_file")
        s3_key="builds/$BUILD_ID/$filename"
        aws s3 cp "$zip_file" "s3://$S3_BUCKET/$s3_key"
        echo "$(date): Uploaded: s3://$S3_BUCKET/$s3_key ($(du -h "$zip_file" | cut -f1))"

        if [ -n "$S3_KEYS" ]; then
            S3_KEYS="$S3_KEYS,$s3_key"
        else
            S3_KEYS="$s3_key"
        fi
        FILE_COUNT=$((FILE_COUNT + 1))
    fi
done

if [ "$FILE_COUNT" -eq 0 ]; then
    echo "$(date): ERROR - No zip files produced!"
    aws dynamodb update-item \
        --table-name "{TABLE_NAME}" \
        --key '{{\"buildId\": {{\"S\": \"{build_id}\"}}}}' \
        --update-expression "SET #s = :s, error_message = :e" \
        --expression-attribute-names '{{\"#s\": \"status\"}}' \
        --expression-attribute-values '{{\":s\": {{\"S\": \"FAILED\"}}, \":e\": {{\"S\": \"Build produced no output files\"}}}}'
    exit 1
fi

# --- Update DynamoDB with completion ---
COMPLETED_AT=$(date +%s)
aws dynamodb update-item \
    --table-name "{TABLE_NAME}" \
    --key '{{\"buildId\": {{\"S\": \"{build_id}\"}}}}' \
    --update-expression "SET #s = :s, s3_keys = :k, completed_at = :t, file_count = :fc" \
    --expression-attribute-names '{{\"#s\": \"status\"}}' \
    --expression-attribute-values '{{\":s\": {{\"S\": \"COMPLETED\"}}, \":k\": {{\"S\": \"'"$S3_KEYS"'\"}}, \":t\": {{\"N\": \"'"$COMPLETED_AT"'\"}}, \":fc\": {{\"N\": \"'"$FILE_COUNT"'\"}}}}'

echo ""
echo "$(date): ========================================"
echo "$(date): Build completed successfully!"
echo "$(date): Files: $FILE_COUNT"
echo "$(date): S3 Keys: $S3_KEYS"
echo "$(date): ========================================"

# Instance will self-terminate via the EXIT trap
"""

Comment on lines +10 to +28
data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }
}


high

The data "aws_ami" "al2023" block is hardcoded to find an x86_64 AMI. This AMI is then used in the aws_launch_template, which means all build instances will be x86_64. While cross-compiling for arm64 on an x86_64 host with QEMU is possible (as the user-data script does), it is significantly slower and can lead to subtle issues compared to native compilation.

To improve performance and reliability for arm64 builds, you should create a separate launch template using a native arm64 AMI and have the process_build Lambda select the appropriate template based on the requested architecture. This would likely involve launching separate instances for each architecture in a multi-arch build.
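
One way to implement this suggestion is for `process_build` to pick the launch template by requested architecture. A minimal sketch, assuming the Lambda receives two template IDs via environment variables (the `LAUNCH_TEMPLATE_*` names and fallback IDs below are hypothetical):

```python
import os

def select_launch_template(architecture):
    """Pick a native launch template per architecture.

    Sketch only: the environment variable names and fallback IDs
    are assumptions, not part of the PR.
    """
    templates = {
        "x86_64": os.environ.get("LAUNCH_TEMPLATE_X86_64", "lt-x86-example"),
        "arm64": os.environ.get("LAUNCH_TEMPLATE_ARM64", "lt-arm64-example"),
    }
    if architecture not in templates:
        raise ValueError(f"unsupported architecture: {architecture}")
    return templates[architecture]
```

For a multi-arch build, the handler would call this once per requested architecture and launch one native instance each, avoiding QEMU emulation entirely.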

Comment on lines +796 to +808
function toggleArch(el, value) {
  const cb = el.querySelector('input');
  // Will be toggled by browser after this handler, so check current state
  // Actually, since it's a label wrapping the input, the click toggles it.
  // We use setTimeout to read the new state.
  setTimeout(() => {
    if (cb.checked) {
      el.classList.add('selected');
    } else {
      el.classList.remove('selected');
    }
  }, 0);
}


medium

The toggleArch function uses setTimeout(..., 0) to handle checkbox state changes. This is a fragile pattern that relies on the event loop timing and can be hard to reason about. A more robust and standard approach is to attach a change event listener directly to the checkboxes when the DOM is loaded, rather than using onclick attributes in the HTML.

    function toggleArch(el) {
      if (el.querySelector('input').checked) {
        el.classList.add('selected');
      } else {
        el.classList.remove('selected');
      }
    }

    // In your DOMContentLoaded listener, you could replace the onclick attributes with this:
    /*
    document.querySelectorAll('.checkbox-item').forEach(label => {
      const input = label.querySelector('input');
      input.addEventListener('change', () => {
        if (input.checked) {
          label.classList.add('selected');
        } else {
          label.classList.remove('selected');
        }
      });
    });
    */

Comment on lines +50 to +52
uuid_pattern = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)


medium

The regular expression for UUID validation is compiled inside the handler function. This means it gets recompiled on every invocation of the Lambda function, which is inefficient. For better performance, you should define the compiled regex pattern at the module level so it is compiled only once when the Lambda execution environment is initialized.

import re

uuid_pattern = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)


# Generate presigned download URLs for completed builds
if item["status"] == "COMPLETED" and item.get("s3_keys"):
    s3_keys = item["s3_keys"].split(",")


medium

Storing s3_keys as a comma-separated string in DynamoDB is fragile. If a filename were to ever contain a comma, it would break the parsing logic here. A more robust approach is to use a data type that natively supports lists, such as a DynamoDB String Set (SS) or a JSON-encoded list string. This would require updating the process_build user-data script to store the keys in the new format.

        s3_keys = json.loads(item.get("s3_keys", "[]"))
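
A minimal sketch of the matching writer/reader pair, assuming both sides switch to a JSON-encoded list (the helper names are hypothetical):

```python
import json

def encode_s3_keys(keys):
    """Writer side (process_build): store artifact keys as a JSON list."""
    return json.dumps(list(keys))

def decode_s3_keys(item):
    """Reader side (check_status): parse the JSON list, tolerating a
    missing s3_keys attribute on older or incomplete records."""
    return json.loads(item.get("s3_keys", "[]"))
```

Unlike the comma-separated format, this round-trips keys that contain commas without any special casing.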

"statusCode": status_code,
"headers": {
    "Content-Type": "application/json",
    "Access-Control-Allow-Origin": "*",


medium

The Access-Control-Allow-Origin header is hardcoded to *. While your API Gateway configuration allows for specific origins via the allowed_origins variable, this hardcoded value in the Lambda response will always be sent. This is a security concern as it is overly permissive and prevents the use of credentials with requests. The allowed origins should be configurable and validated against the request's Origin header.

            "Access-Control-Allow-Origin": os.environ.get("ALLOWED_ORIGIN", "*"),
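
Going one step further, the Lambda could echo the request's Origin only when it is on the allowlist. A sketch under assumptions: the `ALLOWED_ORIGINS` variable name and its comma-separated format are hypothetical.

```python
import os

def cors_origin(request_origin):
    """Return the CORS origin to echo back: the request Origin if
    allowlisted, else the first allowed origin, else '*' when no
    allowlist is configured (sketch only)."""
    allowed = [o.strip()
               for o in os.environ.get("ALLOWED_ORIGINS", "").split(",")
               if o.strip()]
    if request_origin in allowed:
        return request_origin
    return allowed[0] if allowed else "*"
```

Echoing a validated Origin (rather than `*`) is also what makes credentialed requests possible later, should the API add authentication.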

Comment on lines +101 to +108
return _response(200, {
    "build_id": build_id,
    "status": "QUEUED",
    "python_version": python_version,
    "architectures": architectures,
    "single_file": single_file,
    "expires_at": expires_at,
})


medium

The API response for a new build submission is missing the created_at field, even though it's being stored in the DynamoDB record. The frontend code currently falls back to using the client's local time (Date.now()), which can be inaccurate and lead to inconsistent data display for the user. The API should always be the single source of truth for created items.

Suggested change
return _response(200, {
    "build_id": build_id,
    "status": "QUEUED",
    "python_version": python_version,
    "architectures": architectures,
    "single_file": single_file,
    "expires_at": expires_at,
})
return _response(200, {
    "build_id": build_id,
    "status": "QUEUED",
    "python_version": python_version,
    "architectures": architectures,
    "single_file": single_file,
    "created_at": now,
    "expires_at": expires_at,
})

"statusCode": status_code,
"headers": {
    "Content-Type": "application/json",
    "Access-Control-Allow-Origin": "*",


medium

The Access-Control-Allow-Origin header is hardcoded to *. While your API Gateway configuration allows for specific origins via the allowed_origins variable, this hardcoded value in the Lambda response will always be sent. This is a security concern as it is overly permissive and prevents the use of credentials with requests. The allowed origins should be configurable and validated against the request's Origin header.

Suggested change
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Origin": os.environ.get("ALLOWED_ORIGIN", "*"),

ec2_instance_type = "c5.xlarge"

# EBS volume size in GB (20-200)
ec2_volume_size = 20


medium

The example value for ec2_volume_size is set to 20 here, but the default value defined in variables.tf is 50 GB, with a comment recommending 50GB. This inconsistency can be confusing for users setting up the project. It's best to keep the example file aligned with the recommended defaults.

ec2_volume_size = 50

