
Cloud: Add tip for Iceberg object_storage example with BigQuery#464

Merged
kbatuigas merged 2 commits into main from
DOC-1306-cloud-bigquery
Nov 27, 2025

Conversation

@kbatuigas
Contributor

@kbatuigas kbatuigas commented Nov 26, 2025

Description

PR for single sourcing: redpanda-data/docs#1494

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline:

Page previews

Use Iceberg Catalogs > Integrate filesystem-based catalog
Integrate with REST Catalogs index page

Checks

  • New feature
  • Content gap
  • Support Follow-up
  • Small fix (typos, links, copyedits, etc)

@kbatuigas kbatuigas requested a review from a team as a code owner November 26, 2025 22:16
@netlify

netlify bot commented Nov 26, 2025

Deploy Preview for rp-cloud ready!

Name Link
🔨 Latest commit a7092d5
🔍 Latest deploy log https://app.netlify.com/projects/rp-cloud/deploys/6928a4451e57650008fdf19b
😎 Deploy Preview https://deploy-preview-464--rp-cloud.netlify.app

@coderabbitai
Contributor

coderabbitai bot commented Nov 26, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This pull request makes two documentation-related changes for the Iceberg + BigQuery integration feature. The local Antora playbook is updated to use a feature-specific branch (DOC-1306-iceberg-bigquery-integration) instead of the main branch for documentation sources, while preserving other branch patterns. Additionally, the REST catalog index documentation page restores its page-layout configuration and adds a TIP block advising users to utilize REST catalogs for production environments with a reference to filesystem-based catalog alternatives.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • local-antora-playbook.yml: Straightforward branch substitution in the source configuration; verify the branch name is correct and consistent with the feature work.
  • rest-catalog/index.adoc: Documentation content reinstatement and addition; confirm the TIP block content is accurate and well-formatted.

Possibly related PRs

Suggested reviewers

  • mattschumpert
  • simon0191
  • paulohtb6

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title accurately describes the main change: adding a tip for Iceberg object_storage example with BigQuery integration.
Linked Issues check | ✅ Passed | The PR addresses DOC-1306 by adding documentation guidance for Iceberg with BigQuery, consistent with the objective to create usage documentation for this integration.
Out of Scope Changes check | ✅ Passed | All changes are directly related to the DOC-1306 objective: the playbook branch update targets the specific feature branch, and the documentation page adds the required tip for BigQuery usage.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check | ✅ Passed | The PR description follows the required template structure with issue reference placeholder, page previews section, and a checklist. However, the issue number placeholder is not filled in, and the review deadline is blank.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 853aa9d and de01a66.

📒 Files selected for processing (2)
  • local-antora-playbook.yml (1 hunks)
  • modules/manage/pages/iceberg/rest-catalog/index.adoc (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-15T04:45:28.695Z
Learnt from: micheleRP
Repo: redpanda-data/cloud-docs PR: 390
File: modules/manage/pages/schema-reg/schema-reg-authorization.adoc:4-4
Timestamp: 2025-08-15T04:45:28.695Z
Learning: In the Redpanda documentation system, content is single-sourced across multiple repositories (cloud-docs and docs repos). Include directives in the cloud-docs repo may reference files that exist in separate PRs in the docs repo. These PRs are linked via local-antora-playbook.yml for preview rendering, and the includes resolve correctly during the Antora build process when repositories are merged. The playbook is reverted to main before merging. This cross-repository single sourcing pattern is commonly used, so missing include targets should be verified against linked PRs in other repositories before flagging as errors.

Applied to files:

  • local-antora-playbook.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Redirect rules - rp-cloud
  • GitHub Check: Header rules - rp-cloud
  • GitHub Check: Pages changed - rp-cloud
🔇 Additional comments (1)
local-antora-playbook.yml (1)

18-18: Temporary playbook change for cross-repo single-sourcing — remember to revert before merge.

The branch configuration now targets the feature branch DOC-1306-iceberg-bigquery-integration to coordinate with the linked docs repo PR #1494. This is the correct approach for the preview phase. However, ensure the playbook is reverted to branches: main before the PR is merged, as per standard practice for this repository.

Based on learnings, this playbook change is temporary and standard for cross-repo single-sourcing efforts. Verify it will be reverted before final merge.
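
The revert reminder above can be turned into a quick pre-merge check. The following is a minimal sketch, not part of the repository's tooling: the helper name `playbook_reverted` is hypothetical, and it assumes the temporary branch name appears literally in the playbook file.

```shell
#!/bin/bash
# Hypothetical pre-merge check: succeed only if the playbook no longer
# mentions the temporary feature branch.
playbook_reverted() {
  local playbook="$1" feature_branch="$2"
  if grep -qF -- "$feature_branch" "$playbook"; then
    echo "still references $feature_branch: revert before merging" >&2
    return 1
  fi
  echo "ok: $feature_branch not referenced"
}

# Run the check only when the playbook exists in the current directory.
if [ -f local-antora-playbook.yml ]; then
  playbook_reverted local-antora-playbook.yml DOC-1306-iceberg-bigquery-integration
fi
```

The check only looks for the branch name as a fixed string, so it does not confirm that the value was restored to `main`; it only flags a leftover feature-branch reference.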

Comment on lines +3 to +5
:page-layout: index

TIP: These guides are for integrating Iceberg topics with managed REST catalogs. Integrating with a REST catalog is recommended for production deployments. If it is not possible to use a REST catalog, you can use the xref:manage:iceberg/use-iceberg-catalogs.adoc#object-storage[filesystem-based catalog]. For an example of using the filesystem-based catalog to access Iceberg topics, see the https://www.redpanda.com/blog/iceberg-topics-redpanda-cloud-byoc-setup[Getting Started with Iceberg Topics on Redpanda BYOC^] blog post.
No newline at end of file
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Script executed:

#!/bin/bash
# Search for the use-iceberg-catalogs.adoc file
echo "=== Searching for use-iceberg-catalogs.adoc ==="
fd -t f "use-iceberg-catalogs.adoc"

# Check for the file and look for object-storage anchor
echo -e "\n=== Checking for [[object-storage]] anchor ==="
fd -t f "use-iceberg-catalogs.adoc" --exec rg -n "\[\[object-storage\]\]|== .*[Oo]bject.*[Ss]torage|=== .*[Oo]bject.*[Ss]torage" {} +

Repository: redpanda-data/cloud-docs


<shell_result>
Repository: redpanda-data/cloud-docs
Exit code: 1

stdout:

=== Searching for use-iceberg-catalogs.adoc ===
modules/manage/pages/iceberg/use-iceberg-catalogs.adoc

=== Checking for [[object-storage]] anchor ===

stderr:

+: No such file or directory (os error 2)

</shell_result>
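
The stderr line in the result above is consistent with `+` being passed to `rg` as a literal path: `{} +` is the batch terminator of `find -exec`, and `fd --exec` does not interpret it that way. A minimal sketch of the same search using `find` and `grep`, where `{} +` is valid, might look as follows. The helper name `find_anchor` is hypothetical, and the pattern is a slightly tightened version of the one in the script above.

```shell
#!/bin/bash
# Hypothetical helper: search every file named $1 under the current directory
# for an explicit [[object-storage]] anchor or an "Object Storage" section title.
find_anchor() {
  find . -type f -name "$1" \
    -exec grep -HnE '\[\[object-storage\]\]|^={2,3} .*[Oo]bject.*[Ss]torage' {} +
}

matches=$(find_anchor "use-iceberg-catalogs.adoc")
if [ -n "$matches" ]; then
  echo "$matches"
else
  echo "no anchor or section heading found"
fi
```

The `-H` flag forces `grep` to print the file name even when only one file is matched, which keeps the output uniform.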


Script executed:

#!/bin/bash
# Search the file directly for object-storage anchor
echo "=== Checking for [[object-storage]] anchor or equivalent ==="
rg -n "\[\[object-storage\]\]|== .*[Oo]bject.*[Ss]torage|=== .*[Oo]bject.*[Ss]torage" modules/manage/pages/iceberg/use-iceberg-catalogs.adoc

echo -e "\n=== File content (first 100 lines) ==="
head -100 modules/manage/pages/iceberg/use-iceberg-catalogs.adoc

Repository: redpanda-data/cloud-docs


<shell_result>
Repository: redpanda-data/cloud-docs
Exit code: 0

stdout:

=== Checking for [[object-storage]] anchor or equivalent ===

=== File content (first 100 lines) ===
= Use Iceberg Catalogs
:description: Learn how to access Redpanda topic data stored in Iceberg tables, using table metadata or a catalog integration.
:page-categories: Iceberg, Tiered Storage, Management, High Availability, Data Replication, Integration

include::ROOT:manage:iceberg/use-iceberg-catalogs.adoc[tag=single-source]

</shell_result>


Script executed:

#!/bin/bash
# Search for object-storage anchor across all adoc files
echo "=== Searching for [[object-storage]] anchor across all .adoc files ==="
rg -n "\[\[object-storage\]\]" --type adoc

echo -e "\n=== Searching for object-storage section headers ==="
rg -n "== .*[Oo]bject" --type adoc | head -20

echo -e "\n=== Full content of use-iceberg-catalogs.adoc ==="
wc -l modules/manage/pages/iceberg/use-iceberg-catalogs.adoc
cat -n modules/manage/pages/iceberg/use-iceberg-catalogs.adoc

Repository: redpanda-data/cloud-docs


<shell_result>
Repository: redpanda-data/cloud-docs
Exit code: 0

stdout:

=== Searching for [[object-storage]] anchor across all .adoc files ===

=== Searching for object-storage section headers ===

=== Full content of use-iceberg-catalogs.adoc ===
4 modules/manage/pages/iceberg/use-iceberg-catalogs.adoc
     1	= Use Iceberg Catalogs
     2	:description: Learn how to access Redpanda topic data stored in Iceberg tables, using table metadata or a catalog integration.
     3	:page-categories: Iceberg, Tiered Storage, Management, High Availability, Data Replication, Integration
     4	
     5	include::ROOT:manage:iceberg/use-iceberg-catalogs.adoc[tag=single-source]

stderr:

unrecognized file type: adoc
unrecognized file type: adoc

</shell_result>


Script executed:

#!/bin/bash
# Search for object-storage anchor without type filter
echo "=== Searching for [[object-storage]] anchor ==="
rg -n "\[\[object-storage\]\]"

echo -e "\n=== Searching for object-storage in any context ==="
rg -n "object-storage" | head -30

echo -e "\n=== Looking for the included file in ROOT module ==="
fd "use-iceberg-catalogs.adoc"

Repository: redpanda-data/cloud-docs


<shell_result>
Repository: redpanda-data/cloud-docs
Exit code: 0

stdout:

=== Searching for [[object-storage]] anchor ===

=== Searching for object-storage in any context ===
modules/manage/pages/cluster-maintenance/config-cluster.adoc:6:For a complete list of the cluster properties available in Redpanda Cloud, see xref:reference:properties/cluster-properties.adoc[Cluster Configuration Properties] and xref:reference:properties/object-storage-properties.adoc[Object Storage Properties].
modules/manage/pages/iceberg/rest-catalog/index.adoc:5:TIP: These guides are for integrating Iceberg topics with managed REST catalogs. Integrating with a REST catalog is recommended for production deployments. If it is not possible to use a REST catalog, you can use the xref:manage:iceberg/use-iceberg-catalogs.adoc#object-storage[filesystem-based catalog]. For an example of using the filesystem-based catalog to access Iceberg topics, see the https://www.redpanda.com/blog/iceberg-topics-redpanda-cloud-byoc-setup[Getting Started with Iceberg Topics on Redpanda BYOC^] blog post.
modules/reference/pages/properties/object-storage-properties.adoc:11:include::ROOT:reference:partial$properties/object-storage-properties.adoc[tags=redpanda-cloud;!deprecated;!exclude-from-docs]
modules/ROOT/nav.adoc:485:*** xref:reference:properties/object-storage-properties.adoc[]
modules/get-started/pages/whats-new-cloud.adoc:226:You can now xref:manage:cluster-maintenance/config-cluster.adoc#view-cluster-property-values[view the value of read-only cluster configuration properties] with `rpk cluster config` or with the Cloud API. Available properties are listed in xref:reference:properties/cluster-properties.adoc[Cluster Properties] and xref:reference:properties/object-storage-properties.adoc[Object Storage Properties].

=== Looking for the included file in ROOT module ===
modules/manage/pages/iceberg/use-iceberg-catalogs.adoc

</shell_result>


Script executed:

#!/bin/bash
# Search for the ROOT module include source
echo "=== Looking for the source of the include directive ==="
fd "use-iceberg-catalogs.adoc" -x rg -l "object-storage"

echo -e "\n=== Search for files in ROOT module partials ==="
fd -t d "ROOT" | head -5
find . -path "*/ROOT/*" -name "*iceberg*" -o -path "*/ROOT/*" -name "*catalog*" 2>/dev/null | head -20

echo -e "\n=== Check for any anchor definitions ==="
rg "\[\[.*storage.*\]\]" | grep -i iceberg

Repository: redpanda-data/cloud-docs


<shell_result>
Repository: redpanda-data/cloud-docs
Exit code: 1

stdout:

=== Looking for the source of the include directive ===

=== Search for files in ROOT module partials ===
modules/ROOT/

=== Check for any anchor definitions ===

</shell_result>


https://www.redpanda.com/blog/iceberg-topics-redpanda-cloud-byoc-setup


<web_search_result>
Summary of "Getting started with Iceberg Topics on Redpanda BYOC" (May 13, 2025) — key points from the Redpanda blog:

  • Iceberg Topics (introduced in Redpanda 25.1) let Redpanda write Kafka topics automatically in Apache Iceberg format so downstream systems (BigQuery, Snowflake, Databricks, Athena, etc.) can query them with SQL. Benefits include data-lake interoperability, time-travel, schema evolution, ACID semantics, and large-scale performance. [1]

  • BYOC beta features: self-service cluster-level Iceberg settings via rpk or Cloud API, integration with REST catalogs (e.g., Snowflake Open Catalog) or file-based catalogs (for BigQuery), and secure credential handling for catalogs. [1]

  • GCP BYOC demo (step summary):

    1. Prereqs: Redpanda Cloud account, install/update rpk, enable Iceberg integration. [1]
    2. Create a topic and enable iceberg mode (choose mode like value_schema_id_prefix, value_schema_latest, or key_value). [1]
    3. Produce data using Schema Registry (register schema, produce with registry wire format). [1]
    4. Configure Tiered Storage to write Iceberg metadata/files to a GCS bucket. [1]
    5. Create a BigQuery external table pointing to the Iceberg metadata file in GCS (CREATE EXTERNAL TABLE ... OPTIONS(format='ICEBERG', metadata_file_paths=[...])). Query via BigQuery as usual. [1]
  • Links and further reading: Redpanda docs on Iceberg Topics, BigQuery/BigLake Iceberg guidance, and related Redpanda docs (dead letter queue, custom partitioning). [1]

Source: Redpanda blog — "Getting started with Iceberg Topics on Redpanda BYOC" (May 13, 2025). [1]

[1] https://www.redpanda.com/blog/iceberg-topics-redpanda-cloud-byoc-setup
</web_search_result>


Critical: Internal xref anchor is missing and will result in a broken link.

The external blog URL is valid and accessible, but the internal xref target contains a critical issue: the #object-storage anchor referenced in xref:manage:iceberg/use-iceberg-catalogs.adoc#object-storage does not exist in the target file. The file modules/manage/pages/iceberg/use-iceberg-catalogs.adoc exists but only contains an include directive and no [[object-storage]] anchor definition. This broken reference will cause the link to fail during documentation builds or when users click it.

The TIP block content and structure are well-written, but the internal link must be corrected before this change can be approved. Either:

  1. Add the [[object-storage]] anchor to the target file at the appropriate section, or
  2. Update the xref to point to the correct anchor location if it exists elsewhere
🤖 Prompt for AI Agents
In modules/manage/pages/iceberg/rest-catalog/index.adoc around lines 3 to 5 the
TIP contains an xref to
xref:manage:iceberg/use-iceberg-catalogs.adoc#object-storage but that anchor
does not exist in modules/manage/pages/iceberg/use-iceberg-catalogs.adoc (it
only includes another file); fix by either adding a persistent anchor
[[object-storage]] at the appropriate section in
modules/manage/pages/iceberg/use-iceberg-catalogs.adoc (or in the included file
it pulls in) so the xref resolves, or update the xref target here to the correct
existing anchor/path if the intended section uses a different anchor name.
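
Because the target page contains only an include directive, an anchor search limited to this repository cannot show whether the anchor exists in the rendered page; the include target indicates which single-sourced file would need to be checked. A minimal sketch for listing include targets, assuming a hypothetical helper named `include_targets`:

```shell
#!/bin/bash
# Hypothetical helper: print the target of every include directive in an
# AsciiDoc file, one per line, without the trailing [attributes] part.
include_targets() {
  grep -oE '^include::[^[]+' "$1" | sed 's/^include:://'
}

page=modules/manage/pages/iceberg/use-iceberg-catalogs.adoc
if [ -f "$page" ]; then
  include_targets "$page"
fi
```

For the file contents shown in the analysis chain, this would print the resource ID from the include line, which can then be looked up in the repository that owns that content.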

Contributor

@micheleRP micheleRP left a comment

lgtm

@kbatuigas kbatuigas merged commit de6d768 into main Nov 27, 2025
5 checks passed
@kbatuigas kbatuigas deleted the DOC-1306-cloud-bigquery branch November 27, 2025 19:23