Checking Table Health

Snowpack exposes two endpoints for inspecting the health of an Iceberg table. Use the live endpoint when you need a real-time assessment. Use the cached endpoint when you need a fast, low-cost lookup, for example, in dashboards or automated checks.

Live health check

GET /tables/{database}/{table}/health

This endpoint loads the table directly from the PyIceberg catalog (backed by AWS Glue and S3), analyzes its metadata, and returns a full health report. The call typically takes a few seconds depending on the size of the table’s metadata.

curl -s https://<snowpack-host>/tables/offer_service/offers/health | jq .

The live endpoint also saves the result as a health snapshot in Postgres, so subsequent cached lookups reflect the latest state.

Cached health check

GET /tables/{database}/{table}/health/cached

This endpoint returns the most recent health snapshot from Postgres. It is near-instant (~1 ms) because it performs a single database read with no catalog or S3 interaction. The health-sync CronJob collects health snapshots automatically every 15 minutes, so a cached result can be up to 15 minutes old unless the live endpoint has been called more recently.

curl -s https://<snowpack-host>/tables/offer_service/offers/health/cached | jq .

Error responses:

  • 404 — No health snapshot exists for this table. This happens when the table has not yet been assessed by health-sync, or when it is not included in healthSync.databases in the Helm values.
  • 503 — Postgres is unavailable. The API cannot serve cached data when the database connection is down.
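The error handling above suggests one possible client pattern: read the cached snapshot first, and fall back to the live endpoint only when no snapshot exists yet (404). The following is a minimal sketch of that pattern, not part of Snowpack itself. The base URL, the helper names `http_get` and `get_health`, and the choice to surface a 503 to the caller rather than retry are all assumptions made for illustration.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Tuple

# Hypothetical base URL; substitute your own Snowpack host.
BASE_URL = "https://snowpack.example.internal"


def http_get(url: str, timeout: float = 30.0) -> Tuple[int, dict]:
    """Return (status_code, parsed_json_body) for a GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses land here; this sketch ignores the error body.
        return exc.code, {}


def get_health(
    database: str,
    table: str,
    fetch: Callable[[str], Tuple[int, dict]] = http_get,
) -> dict:
    """Prefer the cached snapshot; fall back to the live endpoint on 404."""
    base = f"{BASE_URL}/tables/{database}/{table}/health"
    status, body = fetch(f"{base}/cached")
    if status == 200:
        return body
    if status == 404:
        # No snapshot yet: a live check computes one and stores it.
        status, body = fetch(base)
        if status == 200:
            return body
    # 503 (Postgres unavailable) and anything unexpected is surfaced as-is.
    raise RuntimeError(
        f"health check failed for {database}.{table}: HTTP {status}"
    )
```

Passing `fetch` as a parameter keeps the fallback logic testable without a network connection. Because the live endpoint takes a few seconds, a caller such as a dashboard may prefer to treat 404 as "no data" instead of falling back.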

Response format

Both endpoints return a JSON object with the same shape. Here is an example; each field is described in the field reference that follows.

{
  "table": {
    "database": "offer_service",
    "table_name": "offers",
    "location": "s3://lakehouse-dev/offer_service/offers",
    "format_version": 2
  },
  "file_stats": {
    "total_data_files": 1284,
    "total_delete_files": 12,
    "total_size_bytes": 8741625856,
    "avg_file_size_bytes": 6807340,
    "small_file_count": 847,
    "small_file_pct": 65.97
  },
  "snapshot_count": 142,
  "manifest_count": 38,
  "total_records": 52491000,
  "snapshot_id": 7289345612038475000,
  "hours_since_last_snapshot": 1.2,
  "oldest_snapshot_age_hours": 720.5,
  "health_status": "Unhealthy",
  "needs_maintenance": true,
  "maintenance_enabled": true,
  "maintenance_cadence_hours": 6,
  "recommended_actions": [
    "rewrite_data_files",
    "expire_snapshots",
    "rewrite_manifests"
  ],
  "snowpack_config": {
    "maintenance_enabled": "true",
    "maintenance_cadence_hours": "6"
  },
  "error": null
}

Field reference

file_stats — Statistics about the table’s data files:

| Field | Description |
| --- | --- |
| total_data_files | Total number of data files across all partitions. |
| total_delete_files | Number of position-delete files pending compaction. |
| total_size_bytes | Combined size of all data files in bytes. |
| avg_file_size_bytes | Mean data file size in bytes. Values far below the table's target file size indicate a small-file problem (Iceberg's write.target-file-size-bytes defaults to 512 MB). |
| small_file_count | Number of data files smaller than the small-file threshold (default: 32 MB). |
| small_file_pct | Percentage of data files classified as small. A high percentage (above ~30%) typically triggers a rewrite_data_files recommendation. |
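As a worked check of how these fields relate, the sketch below recomputes small_file_pct from the counts in the example response and compares it with the approximate 30% trigger mentioned above. The function name, the rounding to two decimals, and the exact threshold constant are assumptions for illustration; Snowpack's own computation may differ.

```python
# Approximate trigger taken from the field description; treat as an assumption.
SMALL_FILE_PCT_THRESHOLD = 30.0


def small_file_pct(small_file_count: int, total_data_files: int) -> float:
    """Percentage of data files classified as small, rounded to 2 decimals."""
    if total_data_files == 0:
        return 0.0  # An empty table has no small files.
    return round(100.0 * small_file_count / total_data_files, 2)


# Values copied from the example response.
file_stats = {
    "total_data_files": 1284,
    "avg_file_size_bytes": 6807340,
    "small_file_count": 847,
}

pct = small_file_pct(
    file_stats["small_file_count"], file_stats["total_data_files"]
)
avg_mib = file_stats["avg_file_size_bytes"] / 2**20

print(f"small files: {pct}%")            # 847 / 1284 -> 65.97
print(f"average size: {avg_mib:.2f} MiB")
print("rewrite_data_files likely:", pct > SMALL_FILE_PCT_THRESHOLD)
```

With roughly two thirds of the files under the small-file threshold and an average size of a few MiB, the example table is well past the point where compaction is worthwhile.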

Top-level fields:

| Field | Description |
| --- | --- |
| snapshot_count | Number of Iceberg snapshots retained. Excess snapshots inflate metadata size and slow planning. |
| manifest_count | Number of manifest files. Redundant manifests slow file planning in queries. |
| total_records | Approximate row count across all data files. |
| snapshot_id | The current snapshot ID of the table. |
| hours_since_last_snapshot | Hours elapsed since the most recent snapshot was created. |
| oldest_snapshot_age_hours | Age of the oldest retained snapshot in hours. |
| health_status | Summary label: Healthy, Unhealthy, or Unknown. |
| needs_maintenance | true when at least one threshold is breached and maintenance is warranted. |
| maintenance_enabled | Whether the table has opted into automated maintenance via the snowpack.maintenance_enabled table property. null if the property is not set. |
| maintenance_cadence_hours | Per-table cadence override, or null if using the cluster default. |
| recommended_actions | Ordered list of maintenance actions Snowpack recommends, based on current thresholds. |
| snowpack_config | All snowpack.* table properties with the prefix stripped. Useful for debugging opt-in status and overrides. |
| error | null on success. Contains an error message if the health analysis failed for this table. |
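For automated checks, the top-level fields can be combined into a single decision. The sketch below shows one way to do that; the `triage` function and its return labels ("error", "unknown", "ok", "scheduled", "manual") are hypothetical and not part of the Snowpack API. It assumes a missing or null maintenance_enabled means the table has not opted in.

```python
def triage(report: dict) -> str:
    """Classify a health report for alerting (labels are illustrative)."""
    if report.get("error") is not None:
        return "error"      # Health analysis itself failed.
    if report.get("health_status") == "Unknown":
        return "unknown"    # No usable assessment.
    if not report.get("needs_maintenance"):
        return "ok"         # No threshold breached.
    if report.get("maintenance_enabled") is True:
        return "scheduled"  # Opted in to automated maintenance.
    return "manual"         # Needs maintenance, but not opted in (false or null).


# Subset of the example response shown earlier.
report = {
    "health_status": "Unhealthy",
    "needs_maintenance": True,
    "maintenance_enabled": True,
    "recommended_actions": [
        "rewrite_data_files",
        "expire_snapshots",
        "rewrite_manifests",
    ],
    "error": None,
}

print(triage(report), report["recommended_actions"])
```

Checking error first matters because the other fields may not be meaningful when the analysis failed. A monitoring job might alert only on "error" and "manual", since "scheduled" tables should be handled by automated maintenance.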