Conversation
feat: remove pg_cron dependency and related scheduling logic for improved flexibility
…d water well summary
… and add tests for timestamp accuracy
…ized_view.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ions Jir feature collections
There was a problem hiding this comment.
Pull request overview
This PR expands the OGC API feature collections exposed via pygeoapi by adding new database-backed views/materialized views (latest TDS, depth-to-water trend, and water well summary), updates the pygeoapi OpenAPI generation to emit real paths, and removes the pg_cron dependency from local/CI database setup.
Changes:
- Added new OGC collections (
latest_tds_wells,depth_to_water_trend_wells,water_well_summary) backed by new SQL views/materialized views + updated pygeoapi config. - Updated pygeoapi OpenAPI generation to use pygeoapi’s document generator and added tests asserting OpenAPI has paths.
- Removed pg_cron installation/extension creation and updated CLI + tests to refresh the expanded set of materialized views.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_ogc.py | Adds OpenAPI + new collection endpoint tests; adds DB-backed tests for latest TDS date/value behavior. |
| tests/test_cli_commands.py | Updates expectations for refreshing 4 pygeoapi materialized views. |
| docker/db/Dockerfile | Removes pg_cron package installation from the DB image. |
| docker-compose.yml | Removes pg_cron preload command-line settings. |
| db/initialization.py | Stops attempting to create the pg_cron extension during schema recreation. |
| core/pygeoapi.py | Adds POSTGRES_* fallbacks for pygeoapi DB settings; tightens runtime config file permissions; generates OpenAPI from pygeoapi config. |
| core/pygeoapi-config.yml | Adds new pygeoapi resources/collections and tweaks titles. |
| core/initializers.py | Stops attempting to create pg_cron during local erase/rebuild. |
| cli/cli.py | Extends the list of materialized views refreshed by the CLI command. |
| alembic/versions/d5e6f7a8b9c0_create_pygeoapi_supporting_views.py | Removes pg_cron scheduling logic; adjusts OGC “thing” view columns; updates avg TDS date columns. |
| alembic/versions/i2b3c4d5e6f7_add_latest_tds_pygeoapi_materialized_view.py | Adds ogc_latest_tds_wells view and updates avg TDS matview to use analysis/collection date fallback. |
| alembic/versions/k4d5e6f7a8b9_add_depth_to_water_trend_materialized_view.py | Adds ogc_depth_to_water_trend_wells materialized view + unique index. |
| alembic/versions/l5e6f7a8b9c0_add_water_well_summary_materialized_view.py | Adds ogc_water_well_summary materialized view + unique index. |
| .github/workflows/tests.yml | Removes pg_cron from CI DB setup/extension creation steps. |
Comments suppressed due to low confidence (1)
core/pygeoapi.py:296
- _pygeoapi_db_settings() now falls back to POSTGRES_HOST/PORT/DB/USER, but it still requires PYGEOAPI_POSTGRES_PASSWORD to be set. This is inconsistent with the new fallback behavior and will still fail in environments that only provide POSTGRES_PASSWORD (e.g., docker-compose.yml currently sets POSTGRES_PASSWORD but not PYGEOAPI_POSTGRES_PASSWORD). Consider allowing a POSTGRES_PASSWORD fallback as well, and emitting the matching ${...} placeholder in the generated config.
user = (os.environ.get("PYGEOAPI_POSTGRES_USER") or "").strip() or (
os.environ.get("POSTGRES_USER") or ""
).strip()
if not user:
raise RuntimeError(
"PYGEOAPI_POSTGRES_USER or POSTGRES_USER must be set and non-empty "
"to generate the pygeoapi configuration."
)
if os.environ.get("PYGEOAPI_POSTGRES_PASSWORD") is None:
raise RuntimeError(
"PYGEOAPI_POSTGRES_PASSWORD must be set to "
"generate the pygeoapi configuration."
)
return host, port, dbname, user, "${PYGEOAPI_POSTGRES_PASSWORD}"
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | ||
| session.commit() | ||
|
|
||
| latest_dt = session.execute( | ||
| text( | ||
| "SELECT latest_tds_observation_date " | ||
| "FROM ogc_latest_tds_wells WHERE id = :thing_id" | ||
| ), | ||
| {"thing_id": water_well_thing.id}, | ||
| ).scalar_one() | ||
| avg_range = session.execute( | ||
| text( | ||
| "SELECT first_tds_observation_date, last_tds_observation_date " | ||
| "FROM ogc_avg_tds_wells WHERE id = :thing_id" | ||
| ), | ||
| {"thing_id": water_well_thing.id}, | ||
| ).one() | ||
|
|
||
| assert latest_dt is not None | ||
| assert latest_dt.isoformat() == "2024-01-15" | ||
| assert avg_range.first_tds_observation_date is not None | ||
| assert avg_range.last_tds_observation_date is not None | ||
| assert avg_range.first_tds_observation_date.isoformat() == "2024-01-15" | ||
| assert avg_range.last_tds_observation_date.isoformat() == "2024-01-15" | ||
|
|
||
| session.delete(mc) | ||
| session.delete(csi) | ||
| session.commit() | ||
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | ||
| session.commit() | ||
|
|
||
|
|
There was a problem hiding this comment.
These tests manually insert rows and refresh a shared materialized view, but cleanup happens only after the assertions. Since the test suite shares DB state across tests (see tests/conftest.py), a failure before cleanup will leave extra rows and a refreshed matview behind, potentially breaking later tests. Wrap the insert/refresh/assert block in a try/finally (or use a fixture finalizer) so the deletes + matview refresh always run.
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | |
| session.commit() | |
| latest_dt = session.execute( | |
| text( | |
| "SELECT latest_tds_observation_date " | |
| "FROM ogc_latest_tds_wells WHERE id = :thing_id" | |
| ), | |
| {"thing_id": water_well_thing.id}, | |
| ).scalar_one() | |
| avg_range = session.execute( | |
| text( | |
| "SELECT first_tds_observation_date, last_tds_observation_date " | |
| "FROM ogc_avg_tds_wells WHERE id = :thing_id" | |
| ), | |
| {"thing_id": water_well_thing.id}, | |
| ).one() | |
| assert latest_dt is not None | |
| assert latest_dt.isoformat() == "2024-01-15" | |
| assert avg_range.first_tds_observation_date is not None | |
| assert avg_range.last_tds_observation_date is not None | |
| assert avg_range.first_tds_observation_date.isoformat() == "2024-01-15" | |
| assert avg_range.last_tds_observation_date.isoformat() == "2024-01-15" | |
| session.delete(mc) | |
| session.delete(csi) | |
| session.commit() | |
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | |
| session.commit() | |
| try: | |
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | |
| session.commit() | |
| latest_dt = session.execute( | |
| text( | |
| "SELECT latest_tds_observation_date " | |
| "FROM ogc_latest_tds_wells WHERE id = :thing_id" | |
| ), | |
| {"thing_id": water_well_thing.id}, | |
| ).scalar_one() | |
| avg_range = session.execute( | |
| text( | |
| "SELECT first_tds_observation_date, last_tds_observation_date " | |
| "FROM ogc_avg_tds_wells WHERE id = :thing_id" | |
| ), | |
| {"thing_id": water_well_thing.id}, | |
| ).one() | |
| assert latest_dt is not None | |
| assert latest_dt.isoformat() == "2024-01-15" | |
| assert avg_range.first_tds_observation_date is not None | |
| assert avg_range.last_tds_observation_date is not None | |
| assert avg_range.first_tds_observation_date.isoformat() == "2024-01-15" | |
| assert avg_range.last_tds_observation_date.isoformat() == "2024-01-15" | |
| finally: | |
| session.delete(mc) | |
| session.delete(csi) | |
| session.commit() | |
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | |
| session.commit() |
| row = session.execute( | ||
| text( | ||
| "SELECT latest_tds_observation_date, latest_tds_value " | ||
| "FROM ogc_latest_tds_wells WHERE id = :thing_id" | ||
| ), | ||
| {"thing_id": water_well_thing.id}, | ||
| ).one() | ||
|
|
||
| assert row.latest_tds_observation_date.isoformat() == "2024-02-01" | ||
| assert float(row.latest_tds_value) == 700.0 | ||
|
|
||
| session.delete(mc_late) | ||
| session.delete(mc_early) | ||
| session.delete(csi) | ||
| session.commit() | ||
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | ||
| session.commit() | ||
|
|
||
|
|
There was a problem hiding this comment.
Same issue as the prior test: cleanup (deleting inserted rows and refreshing ogc_avg_tds_wells) only runs after assertions. If an assertion fails, the DB state/matview contents will be left dirty for subsequent tests that share the same database. Use try/finally or a fixture finalizer to guarantee teardown and matview refresh.
| row = session.execute( | |
| text( | |
| "SELECT latest_tds_observation_date, latest_tds_value " | |
| "FROM ogc_latest_tds_wells WHERE id = :thing_id" | |
| ), | |
| {"thing_id": water_well_thing.id}, | |
| ).one() | |
| assert row.latest_tds_observation_date.isoformat() == "2024-02-01" | |
| assert float(row.latest_tds_value) == 700.0 | |
| session.delete(mc_late) | |
| session.delete(mc_early) | |
| session.delete(csi) | |
| session.commit() | |
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | |
| session.commit() | |
| try: | |
| row = session.execute( | |
| text( | |
| "SELECT latest_tds_observation_date, latest_tds_value " | |
| "FROM ogc_latest_tds_wells WHERE id = :thing_id" | |
| ), | |
| {"thing_id": water_well_thing.id}, | |
| ).one() | |
| assert row.latest_tds_observation_date.isoformat() == "2024-02-01" | |
| assert float(row.latest_tds_value) == 700.0 | |
| finally: | |
| session.delete(mc_late) | |
| session.delete(mc_early) | |
| session.delete(csi) | |
| session.commit() | |
| session.execute(text("REFRESH MATERIALIZED VIEW ogc_avg_tds_wells")) | |
| session.commit() |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0d51464858
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| t.formation_completion_code, | ||
| t.nma_formation_zone, | ||
| t.release_status, | ||
| l.elevation, |
There was a problem hiding this comment.
Move view-definition changes into a new Alembic revision
This commit changes the SQL projection in revision d5e6f7a8b9c0 (for example adding l.elevation here and removing prior columns), but Alembic will never re-run a revision that is already applied. Environments that had already migrated to d5e6f7a8b9c0 keep the old ogc_* thing-view schemas, while fresh installs get the new schema, and later revisions in this commit only recreate TDS/trend/summary views. That creates divergent schemas at the same head revision and inconsistent OGC collection payloads across deployments.
Useful? React with 👍 / 👎.
| @@ -360,21 +285,9 @@ def upgrade() -> None: | |||
| _create_matview_indexes() | |||
|
|
|||
| op.execute(text(_create_refresh_function())) | |||
There was a problem hiding this comment.
Restore automatic refresh for pygeoapi materialized views
After this change, the migration still creates refresh_pygeoapi_materialized_views() but no longer schedules or otherwise invokes it automatically, so ogc_* materialized views stop updating after creation unless operators run the CLI command manually. In deployments that previously relied on built-in nightly refresh, OGC endpoints will serve stale depth/TDS/trend/summary data as source tables change.
Useful? React with 👍 / 👎.
Why
This PR addresses the following problem / context:
How
Implementation summary - the following was changed / added / removed:
Notes
Any special considerations, workarounds, or follow-up work to note?