Skip to content

MySQL support for sync-indexes #67

@koenvo

Description

@koenvo

MySQL support for sync-indexes

Background

sync-indexes currently creates partial expression indexes on PostgreSQL for high-cardinality identifier keys (e.g. one dataset per keyword). MySQL is a no-op.

Approach

MySQL 8.0.13+ supports functional indexes directly — no generated columns needed:

CREATE INDEX idx_dataset_identifier_keyword_ads_keyword_metrics
ON dataset (provider, dataset_type, (JSON_VALUE(identifier, '$.keyword' RETURNING CHAR(255))))
WHERE ...;  -- MySQL doesn't support partial indexes, so provider/dataset_type must be index columns

Note: MySQL has no partial index support, so provider and dataset_type must be leading columns in the index rather than a WHERE predicate.

For older MySQL (< 8.0.13), generated stored columns are the only option:

ALTER TABLE dataset
  ADD COLUMN keyword_txt VARCHAR(255)
    GENERATED ALWAYS AS (JSON_VALUE(identifier, '$.keyword' RETURNING CHAR(255))) STORED;

CREATE INDEX idx_dataset_identifier_keyword_ads_keyword_metrics
ON dataset (provider, dataset_type, keyword_txt);

This is more invasive (schema mutation, harder to roll back) and is not recommended unless targeting pre-8.0.13.

Scope

  • Detect MySQL dialect in create_identifier_indexes
  • Generate functional index SQL for MySQL 8.0.13+
  • Include provider and dataset_type as leading index columns (no partial index support)
  • Cast based on key_type: RETURNING UNSIGNED for int, RETURNING CHAR(255) for str
  • Skip (no-op + log warning) for MySQL < 8.0.13

Out of scope

  • Generated column approach for older MySQL versions

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions