Walkthrough
The documentation for the Redpanda Schema Registry was updated to include a new subsection under the "Normalization" section. This subsection, titled "Avro normalization," clarifies that Avro schemas are normalized using the Parsing Canonical Form as specified by the Avro standard, except for the omission of the STRIP transformation. No changes were made to code, logic, or control flow.
Actionable comments posted: 0
🧹 Nitpick comments (1)
modules/manage/pages/schema-reg/schema-reg-overview.adoc (1)
26-29: Refine link versioning, acronym usage, and heading hierarchy
To further strengthen this section, consider:
- Introducing the acronym on first use: “Parsing Canonical Form (PCF)”.
- Replacing the ++version++ placeholder in the URL with a defined attribute (e.g., {avro-spec-version}) or hardcoding the exact spec version to guarantee the link resolves correctly.
- Ensuring consistent nesting: convert the bold “Normalization” term into an AsciiDoc heading (e.g., === Normalization) so that === Avro normalization (or ==== Avro normalization) is semantically and visually nested under it.
Example diff for points 1 & 2:
 === Avro normalization
-When normalizing an Avro schema, Redpanda transforms the schema into Parsing Canonical Form as defined in the https://avro.apache.org/docs/++version++/specification/#transforming-into-parsing-canonical-form[Avro specification], with the exception that it does not apply the STRIP transformation.
+When normalizing an Avro schema, Redpanda transforms the schema into Parsing Canonical Form (PCF) as defined in the https://avro.apache.org/docs/{avro-spec-version}/specification/#transforming-into-parsing-canonical-form[Avro specification], with the exception that it does not apply the STRIP transformation.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
modules/manage/pages/schema-reg/schema-reg-overview.adoc(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Redirect rules - redpanda-docs-preview
- GitHub Check: Header rules - redpanda-docs-preview
- GitHub Check: Pages changed - redpanda-docs-preview
🔇 Additional comments (1)
modules/manage/pages/schema-reg/schema-reg-overview.adoc (1)
26-29: Well done: clear "Avro normalization" subsection
The new content succinctly explains that Redpanda normalizes Avro schemas to Parsing Canonical Form and explicitly omits the STRIP transformation, directly referencing the Avro spec. This aligns perfectly with the PR objective.
=== Avro normalization
When normalizing an Avro schema, Redpanda transforms the schema into Parsing Canonical Form as defined in the https://avro.apache.org/docs/++version++/specification/#transforming-into-parsing-canonical-form[Avro specification^], with the exception that it does not apply the STRIP transformation.
At the moment, the link versioning is not working, so the link is broken.
asimms41 left a comment
Just the broken link to fix.
=== Avro normalization
When normalizing an Avro schema, Redpanda transforms the schema into Parsing Canonical Form as defined in the https://avro.apache.org/docs/++version++/specification/#transforming-into-parsing-canonical-form[Avro specification^], with the exception that it does not apply the STRIP transformation.
Neither Redpanda nor Confluent transforms it into PCF.
[PRIMITIVES] Convert primitive schemas to their simple form (e.g., int instead of {"type":"int"}).
Yes, this is true
[FULLNAMES] Replace short names with fullnames, using applicable namespaces to do so. Then eliminate namespace attributes, which are now redundant.
All names are converted to fullnames, with redundant namespace included.
[STRIP] Keep only attributes that are relevant to parsing data, which are: type, name, fields, symbols, items, values, size. Strip all others (e.g., doc and aliases).
I think aliases are actually removed, since the parser didn't support them. (This is not intentional; I think a library update may support them now, which we should probably pull in.)
[ORDER] Order the appearance of fields of JSON objects as follows: name, type, fields, symbols, items, values, size. For example, if an object has type, name, and size fields, then the name field should appear first, followed by the type and then the size fields.
Yes, order is fixed up.
[STRINGS] For all JSON string literals in the schema text, replace any escaped characters (e.g., \uXXXX escapes) with their UTF-8 equivalents.
Not sure about this.
[INTEGERS] Eliminate quotes around and any leading zeros in front of JSON integer literals (which appear in the size attributes of fixed schemas).
This is probable.
[WHITESPACE] Eliminate all whitespace in JSON outside of string literals.
Yes, this.
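To make a few of the quoted rules concrete, here is a minimal Python sketch of PRIMITIVES, ORDER, and WHITESPACE, with STRIP as an optional switch. It is an illustration of the Avro specification text only, not Redpanda's or Confluent's implementation. The function names `canonicalize` and `to_schema_text`, the `strip` flag, and the sample schema are all hypothetical, and FULLNAMES and INTEGERS are deliberately left out. Treating a primitive schema that carries extra attributes as non-collapsible when `strip=False` is also an assumption made for this sketch.

```python
import json

# Primitive type names and attribute order as listed in the Avro specification.
PRIMITIVE_TYPES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
ATTRIBUTE_ORDER = ["name", "type", "fields", "symbols", "items", "values", "size"]
# Attributes whose values are themselves schemas (or lists of fields/schemas).
NESTED_ATTRIBUTES = {"type", "fields", "items", "values"}


def canonicalize(node, strip=True):
    """Apply PRIMITIVES and ORDER (and STRIP when strip=True) to a parsed schema.

    Hypothetical helper for illustration; FULLNAMES and INTEGERS are not handled.
    """
    if isinstance(node, list):  # a union, or a record's list of fields
        return [canonicalize(item, strip) for item in node]
    if not isinstance(node, dict):  # a type already given by name, e.g. "int"
        return node

    # PRIMITIVES: {"type": "int"} becomes "int". Field objects carry a "name",
    # so they are never collapsed here.
    type_value = node.get("type")
    is_primitive = isinstance(type_value, str) and type_value in PRIMITIVE_TYPES
    if is_primitive and "name" not in node and (strip or len(node) == 1):
        return type_value

    # ORDER: emit the parsing-relevant attributes in the fixed order.
    ordered = {}
    for key in ATTRIBUTE_ORDER:
        if key in node:
            value = node[key]
            ordered[key] = canonicalize(value, strip) if key in NESTED_ATTRIBUTES else value

    # STRIP drops everything else; without it, the extra attributes are kept
    # after the ordered ones.
    if not strip:
        for key, value in node.items():
            if key not in ordered:
                ordered[key] = value
    return ordered


def to_schema_text(node):
    """WHITESPACE: serialize with no whitespace outside string literals."""
    return json.dumps(node, separators=(",", ":"), ensure_ascii=False)


SAMPLE = json.loads("""
{
  "type": "record",
  "doc": "A user",
  "name": "example.User",
  "fields": [
    {"type": {"type": "long"}, "name": "id", "doc": "primary key"}
  ]
}
""")

with_strip = to_schema_text(canonicalize(SAMPLE, strip=True))
without_strip = to_schema_text(canonicalize(SAMPLE, strip=False))
print(with_strip)
print(without_strip)
```

With `strip=True` the `doc` attributes disappear, while `strip=False` keeps them after the reordered attributes. Whether that second output matches what Redpanda actually stores is exactly the open question in this thread.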
We will close this PR and mark https://redpandadata.atlassian.net/browse/DOC-43 as Will Not Implement so that we can explore what we want to document about our normalization strategy, and how. https://redpandadata.atlassian.net/browse/DOC-1296 has been opened to track this work.
Description
This pull request adds a new section to the Schema Registry documentation to explain how Avro normalization is handled in Redpanda. The update clarifies the transformation process and references the Avro specification.
Support for PCF was added in RP version 23.2.
Documentation updates:
modules/manage/pages/schema-reg/schema-reg-overview.adoc: Added a new subsection titled "Avro normalization" to describe how Redpanda transforms Avro schemas into Parsing Canonical Form, with a note that the STRIP transformation is not applied. A link to the relevant section of the Avro specification is included.
Resolves https://redpandadata.atlassian.net/browse/
Review deadline: 29 Apr
Page previews
https://deploy-preview-1091--redpanda-docs-preview.netlify.app/current/manage/schema-reg/schema-reg-overview/#avro-normalization