
fix(www): SEO/AEO audit improvements #77

Open
iamdin wants to merge 1 commit into next from seo-aeo-audit


iamdin (Member) commented Mar 4, 2026

SEO/AEO Audit Improvements

Comprehensive SEO and AEO (Answer Engine Optimization) audit for the documentation site (apps/www).

Changes

Critical fixes:

  • Add robots.txt with sitemap reference and crawl directives
  • Add Organization JSON-LD structured data site-wide (via new StructuredData component)
  • Fix heading hierarchy on homepage: <h3> → <h2> to avoid skipping heading levels

High priority:

  • Add noindex,nofollow to 404 page (was being indexed with index,follow)
  • Fix og:type from "website" to "article" on component documentation pages
  • Add aria-hidden="true" to decorative SVG logos on landing page

Medium priority:

  • Add BreadcrumbList JSON-LD structured data to doc and component pages
  • New reusable StructuredData Astro component for JSON-LD injection
  • New robotsMeta and structuredData props on RootLayout

What was already good

  • <html lang="en">, viewport, canonical URLs, meta descriptions, OG/Twitter cards
  • @astrojs/sitemap integration with <link rel="sitemap">
  • HTTPS via Cloudflare, favicons, dynamic OG images
  • sr-only text for icon-only buttons

Verification

  • pnpm run typecheck --filter=./apps/www passes with 0 errors, 0 warnings, 0 hints
  • pnpm run format passes
  • pnpm run lint passes (note: oxfmt in lint-staged has a pre-existing issue with .astro files)

This PR was generated with Oz.

Summary by CodeRabbit

  • New Features

    • Added breadcrumb navigation throughout documentation for improved site navigation and search engine discoverability.
    • Configured search engine optimization with site crawling guidelines and structured metadata.
  • Accessibility

    • Enhanced decorative icons with accessibility attributes for improved screen reader compatibility.
  • Refactor

    • Improved semantic HTML heading hierarchy across pages.

- Add robots.txt with sitemap reference
- Add Organization JSON-LD structured data site-wide
- Add BreadcrumbList JSON-LD to doc and component pages
- Fix heading hierarchy on homepage (h3 → h2)
- Add noindex to 404 page
- Fix og:type from website to article on component pages
- Add aria-hidden to decorative SVGs on landing page
- Support robotsMeta and structuredData props in root layout

Co-Authored-By: Oz <oz-agent@warp.dev>

vercel bot commented Mar 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
shipbase-ui Error Error Mar 4, 2026 1:25am


changeset-bot bot commented Mar 4, 2026

⚠️ No Changeset found

Latest commit: 06878f2

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.



coderabbitai bot commented Mar 4, 2026

Walkthrough

This PR enhances SEO capabilities and accessibility across the website by introducing a robots.txt configuration file, creating a reusable structured data component, integrating JSON-LD schemas (organization and breadcrumbs) into the layout, and making semantic HTML and accessibility improvements to various components.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| SEO & Robots Configuration | apps/www/public/robots.txt, apps/www/src/components/structured-data.astro | Introduces robots.txt file with standard allow/disallow rules and sitemap reference. New StructuredData component handles rendering JSON-LD schemas with support for single or multiple data items. |
| Layout & Root-Level SEO Integration | apps/www/src/layouts/root-layout.astro | Integrates StructuredData component for organization schema and optional additional structured data. Adds robotsMeta prop to control indexing behavior; defaults to "index,follow". |
| Page-Level Robots & Structured Data | apps/www/src/pages/404.astro, apps/www/src/pages/docs/[...slug].astro, apps/www/src/pages/docs/components/[component].astro | 404 page marked with noindex,nofollow. Documentation pages enhanced with BreadcrumbList structured data for improved navigation SEO. Component pages also change OpenGraph type from website to article. |
| Semantic HTML & Accessibility | apps/www/src/components/landing/components-overview.astro, apps/www/src/components/landing/features.astro, apps/www/src/components/landing/shadcn.astro, apps/www/src/components/landing/cta.astro | Upgraded heading hierarchy from h3 to h2 in three components. Added aria-hidden="true" to decorative icon elements in CTA component to improve accessibility. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Hop! Hop! SEO's here,
Robots welcome, structured clear,
Breadcrumbs guide both bots and kin,
Accessibility wins, SEO wins,
The website now stands proud and bright! 🌟

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(www): SEO/AEO audit improvements' directly and clearly summarizes the main change: implementing SEO and Answer Engine Optimization improvements for the www application, which aligns with the audit work across multiple files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |



github-actions bot commented Mar 4, 2026


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/www/src/components/landing/cta.astro (1)

62-100: ⚠️ Potential issue | 🟡 Minor

Complete decorative icon masking for the GitHub button icons.

Great improvement here. For full consistency, also add aria-hidden="true" to GitHubLight (Line 37) and GitHubDark (Line 43), since the button already has visible text.

♿ Suggested patch
         <GitHubLight
           width="16"
           height="16"
           fill="currentColor"
+          aria-hidden="true"
           class="dark:hidden"
         />
         <GitHubDark
           width="16"
           height="16"
           fill="currentColor"
+          aria-hidden="true"
           class="hidden dark:block"
         />
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/www/src/components/landing/cta.astro` around lines 62 - 100, The
GitHubLight and GitHubDark icon components are missing aria-hidden="true", which
breaks decorative-icon accessibility consistency; update the Astro template where
GitHubLight and GitHubDark are used (the GitHub button icon instances) to
include aria-hidden="true" on both components so screen readers ignore the
decorative icons while the visible button text remains accessible.
🧹 Nitpick comments (1)
apps/www/src/pages/docs/components/[component].astro (1)

37-66: Consider extracting breadcrumb JSON-LD construction into a shared helper.

This block is very similar to the one in apps/www/src/pages/docs/[...slug].astro (Lines 30-53). A shared builder would reduce duplication and prevent schema drift.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/www/src/pages/docs/components/`[component].astro around lines 37 - 66,
Extract the inline BreadcrumbList JSON-LD into a shared helper function (e.g.,
buildBreadcrumbList or createBreadcrumbStructuredData) that accepts the URL
object and the entry (or title) and returns the structuredData object used here
(referencing url, url.origin, url.toString(), and entry.data.title); implement
the helper in a common utilities module and export it, then replace the inline
structuredData block in this component and the similar block in the other
component with a call to the new helper and import the function, ensuring the
resulting object shape matches the original schema.org BreadcrumbList keys.
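
A minimal sketch of what such a shared helper might look like. The name buildBreadcrumbList, its signature, and the BreadcrumbCrumb type are assumptions for illustration, not code from this PR; only the schema.org BreadcrumbList keys are taken from the standard vocabulary.

```typescript
// Hypothetical shared helper: name, signature, and types are illustrative
// assumptions, not the code in this PR.
interface BreadcrumbCrumb {
  name: string
  path: string // site-relative path, e.g. "/docs"
}

interface ListItem {
  "@type": "ListItem"
  position: number
  name: string
  item: string
}

interface BreadcrumbList {
  "@context": "https://schema.org"
  "@type": "BreadcrumbList"
  itemListElement: ListItem[]
}

// Builds a schema.org BreadcrumbList from an origin and an ordered crumb trail.
function buildBreadcrumbList(
  origin: string | URL,
  crumbs: BreadcrumbCrumb[],
): BreadcrumbList {
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: crumbs.map(
      (crumb, index): ListItem => ({
        "@type": "ListItem",
        position: index + 1, // schema.org positions are 1-based
        name: crumb.name,
        item: new URL(crumb.path, origin).toString(),
      }),
    ),
  }
}

// Example trail for a component page (the "Button" page is a made-up example).
const breadcrumbs = buildBreadcrumbList("https://ui.shipbase.xyz", [
  { name: "Home", path: "/" },
  { name: "Docs", path: "/docs" },
  { name: "Button", path: "/docs/components/button" },
])
console.log(JSON.stringify(breadcrumbs, null, 2))
```

Both page templates could then pass their own crumb trail to the same function, so the JSON-LD shape is defined in one place.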

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 148199a and 06878f2.

📒 Files selected for processing (10)
  • apps/www/public/robots.txt
  • apps/www/src/components/landing/components-overview.astro
  • apps/www/src/components/landing/cta.astro
  • apps/www/src/components/landing/features.astro
  • apps/www/src/components/landing/shadcn.astro
  • apps/www/src/components/structured-data.astro
  • apps/www/src/layouts/root-layout.astro
  • apps/www/src/pages/404.astro
  • apps/www/src/pages/docs/[...slug].astro
  • apps/www/src/pages/docs/components/[component].astro

Comment on lines +1 to +6
User-agent: *
Allow: /

Disallow: /api/

Sitemap: https://ui.shipbase.xyz/sitemap-index.xml

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify duplicate robots.txt sources and compare their content.
fd 'robots\.txt(\.ts)?$' apps/www
echo "----- apps/www/public/robots.txt -----"
cat apps/www/public/robots.txt 2>/dev/null || echo "[File not found]"
echo "----- apps/www/src/pages/robots.txt.ts -----"
cat apps/www/src/pages/robots.txt.ts 2>/dev/null || echo "[File not found]"

Repository: shipbase/ui

Length of output: 580


Consolidate robots.txt to one source of truth.

Both apps/www/public/robots.txt and apps/www/src/pages/robots.txt.ts exist. In Astro, the static file takes precedence, masking the dynamic route. The static file includes Disallow: /api/ and a hardcoded sitemap URL, while the route lacks the API disallow directive and uses environment-aware sitemap generation.

To resolve: add Disallow: /api/ to apps/www/src/pages/robots.txt.ts, then remove the static apps/www/public/robots.txt.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/www/public/robots.txt` around lines 1 - 6, There are two conflicting
robots.txt sources: the static file and the dynamic robots.txt route; update the
dynamic route (robots.txt.ts) to include the missing "Disallow: /api/" directive
and ensure its sitemap generation uses the environment-aware URL logic currently
in the route, then delete the static public robots.txt so the dynamic route
becomes the single source of truth.
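
As a rough illustration of the consolidated route, the sketch below builds the robots.txt body from the site URL. The existing contents of robots.txt.ts are not shown in this review, so buildRobotsTxt and the localhost fallback are assumptions; the directives mirror the static file above.

```typescript
// Sketch only: buildRobotsTxt is a hypothetical helper, not the contents of
// the existing apps/www/src/pages/robots.txt.ts route.
function buildRobotsTxt(site: URL): string {
  // Resolve the sitemap against the configured site instead of hardcoding it.
  const sitemapUrl = new URL("sitemap-index.xml", site)
  return [
    "User-agent: *",
    "Allow: /",
    "Disallow: /api/",
    "",
    `Sitemap: ${sitemapUrl.href}`,
    "",
  ].join("\n")
}

// In the Astro route this could be wired up roughly as follows (the fallback
// URL is an assumption for local development):
//
//   import type { APIRoute } from "astro"
//   export const GET: APIRoute = ({ site }) =>
//     new Response(buildRobotsTxt(site ?? new URL("http://localhost:4321")), {
//       headers: { "Content-Type": "text/plain; charset=utf-8" },
//     })

console.log(buildRobotsTxt(new URL("https://ui.shipbase.xyz")))
```

Keeping the string construction in a plain function makes the directives easy to unit-test without running Astro.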

Comment on lines +10 to +11
{items.map((item) => (
<script type="application/ld+json" set:html={JSON.stringify(item)} />

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the target file
find . -name "structured-data.astro" -type f 2>/dev/null

Repository: shipbase/ui

Length of output: 102


🏁 Script executed:

# Check file structure and content if it exists
cat -n apps/www/src/components/structured-data.astro 2>/dev/null || echo "File not found in expected location"

Repository: shipbase/ui

Length of output: 413


🏁 Script executed:

# Search for similar JSON-LD or set:html usage patterns in the codebase
rg "set:html.*JSON|JSON.*set:html" --type astro -A 2 -B 2 2>/dev/null || echo "No similar patterns found"

Repository: shipbase/ui

Length of output: 80


🌐 Web query:

Astro set:html directive HTML escaping security XSS prevention

💡 Result:

  • In Astro, normal template expressions like <h1>{value}</h1> HTML-escape by default, which is part of Astro’s “automatic XSS protection” work. If value contains <strong>, it renders as text (&lt;strong&gt;). [2]
  • set:html={value} is the explicit “dangerous” escape hatch: it injects the string as raw HTML (like element.innerHTML) and Astro does not escape it. Only use it with content you fully trust or have sanitized yourself, or you risk XSS. [1][2]

Practical XSS-prevention guidance

  • Treat set:html exactly like innerHTML: never pass untrusted/user-controlled strings unless you sanitize them first (e.g., with DOMPurify). OWASP explicitly advises avoiding innerHTML with untrusted data, and to sanitize if you must render user-authored HTML. [5]
  • Prefer safe alternatives when you can:
    • render as text: {value} or set:text={value} (Astro escapes automatically). [1]
  • Add guardrails:
    • Use an ESLint rule that disallows set:html to reduce accidental XSS risk (and selectively disable it only where you’ve proven sanitization). [3]
    • Consider enabling Astro’s experimental CSP support for defense-in-depth against script injection. [4]

Key takeaway: set:html prevents escaping (by design), so XSS prevention is on you: only trusted/sanitized HTML + defense-in-depth. [1][5]


Escape JSON-LD output before set:html to prevent script-breakout injection.

The component accepts untrusted data via the data prop and directly stringifies it. JSON.stringify(item) does not escape </script> sequences—if data contains </script> as a string value, it will terminate the script tag early and enable script injection. Since Astro's set:html is a no-escape directive (by design), sanitization is your responsibility.

🔒 Suggested hardening patch
 const { data } = Astro.props
 const items = Array.isArray(data) ? data : [data]
+const toSafeJsonLd = (value: Record<string, unknown>) =>
+  JSON.stringify(value)
+    .replace(/</g, "\\u003c")
+    .replace(/>/g, "\\u003e")
+    .replace(/&/g, "\\u0026")
+    .replace(/\u2028/g, "\\u2028")
+    .replace(/\u2029/g, "\\u2029")
 ---
 
 {items.map((item) => (
-  <script type="application/ld+json" set:html={JSON.stringify(item)} />
+  <script type="application/ld+json" set:html={toSafeJsonLd(item)} />
 ))}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/www/src/components/structured-data.astro` around lines 10 - 11, The
JSON-LD is being injected with set:html using JSON.stringify(item) which can be
broken by raw "</script>" or special line separator characters; before passing
to set:html (in the items.map render) produce a sanitized string from
JSON.stringify(item) that replaces any "</script>" occurrences with "<\/script>"
and also escapes U+2028 and U+2029 (e.g., replace those codepoints with their
escaped forms) so the script tag cannot be terminated early; update the mapping
that currently calls JSON.stringify(item) to use this sanitized string instead.
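
To make the breakout concrete, here is a small standalone check, written as plain TypeScript rather than as part of the Astro component. The payload is a made-up example; toSafeJsonLd follows the same escaping as the suggested patch above.

```typescript
// Same escaping idea as the suggested patch, as a standalone function.
function toSafeJsonLd(value: unknown): string {
  return JSON.stringify(value)
    .replace(/</g, "\\u003c")
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029")
}

// Hypothetical payload with a script-closing sequence inside a string value.
const payload = {
  "@type": "Thing",
  name: "</script><script>alert(1)</script>",
}

const raw = JSON.stringify(payload)
const safe = toSafeJsonLd(payload)

console.log(raw.includes("</script>")) // true: plain stringify leaves it intact
console.log(safe.includes("</script>")) // false: "<" is now \u003c
console.log(JSON.parse(safe).name === payload.name) // true: data is unchanged
```

Because the replacements are valid JSON string escapes, JSON-LD consumers parse the same value while the HTML parser no longer sees a closing script tag.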
