Add machine-readable constraint definitions (#22) #27
Abhishek-Kumar-Rai wants to merge 1 commit into PecanProject:main
Conversation
Hi @dlebauer, as you instructed, and keeping in mind the suggestion to use frictionless-data where possible and define the rest in a separate validation layer, I have made all the additions and changes. Please have a look whenever you have a second. I have also addressed the parent issue #14; its PR is ready on my side and will address all the items in the issue description and the comment there. I'm ready to raise that PR whenever you give the go signal.
Force-pushed from 63672e1 to 600b6b0
This is a great start, but it does not appear to cover the set of constraints defined in #14 (more specifically, the documentation, spreadsheet, and SQL schema definitions therein).
While it is not required to migrate all constraints, for any constraint that doesn't need to be migrated, please make a note of why it doesn't (you can even copy the spreadsheet and keep notes there). And if there are constraints where it is not clear whether they need to be migrated, please make a note in the spreadsheet or ask.
Note:
1. Constraints on/from tables that haven't been migrated are excluded.
2. The set of constraints is comprehensive, and there may be some that do not need to be migrated, but we should discuss those rather than silently drop them.
lat, lon, and masl must all be specified together or not at all.
Enforced indirectly via the geometry column in the CSV representation.
Flagged here for documentation; enforcement is via geometry parsing logic.
message: "lat, lon, and masl must all be present together (source: site.rb complete_geometry_specification)"
No need to keep the source of the message here (or below); it is sufficient to have this in the file metadata header.
# Cross-table lookup rules that require joining against another table.
# These were originally implemented as PostgreSQL triggers/functions.
#
# Sources:
From #14, see also:
- Documentation: Constraints for BETYdb.
- Enumerated in the constraints spreadsheet.
- Implemented in the Postgres schema structure db/structure.sql in the bety repository.
@dlebauer, apologies for the confusion. The scope here was limited to:
I wanted to keep the PRs separate so that they stay clean and are easy to review, since the changes were large. I also wanted a go signal in case my current approach was incorrect; if it was, I would have incorporated the suggestions before raising that PR, for an easier merge.
I have created the PR, since it was already ready. Please have a look whenever you have a second. The PR is #28.
Description
Implement machine-readable constraint definitions using Frictionless Data Table Schema (datapackage.json) and custom YAML for constraints that Frictionless cannot express natively.
This establishes the single source of truth for all BETYdb validation rules, enabling external tools to validate data before submission and eliminating the need to reverse-engineer rules from scattered Rails/SQL implementations.
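For context, a minimal Frictionless Table Schema fragment expressing a couple of the constraints described in this PR might look like the sketch below. Field names and bounds come from the PR description; the exact resource layout in the actual datapackage.json may differ.

```json
{
  "name": "sites",
  "schema": {
    "fields": [
      {
        "name": "id",
        "type": "integer",
        "constraints": {"required": true}
      },
      {
        "name": "lat",
        "type": "number",
        "constraints": {"minimum": -90, "maximum": 90}
      },
      {
        "name": "access_level",
        "type": "integer",
        "constraints": {"enum": [1, 2, 3, 4]}
      }
    ],
    "primaryKey": "id"
  }
}
```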
Related Issue(s)
Closes #22 (Implement machine-readable source of truth for data constraints and model validation rules)
Related to #14 (Implement constraints and validation from postgres BETYdb)
Type of Change
Changes Made
Layer 1: Frictionless Data (datapackage.json)
- `required: true` on critical fields (sitename, trait, mean, id, name, units)
- `minimum`/`maximum` on numeric fields (lat: -90/90, lon: -180/180, masl, n ≥ 1)
- `enum` on restricted values (access_level: 1-4, statname, checked)
- `unique` and `primaryKey` on uniqueness constraints
- `foreignKeys` on referential integrity across all tables

Layer 2: Custom YAML (data-raw/custom_constraints.yaml, inst/extdata/)
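For illustration, a rule of the kind discussed in the review thread above (the all-or-none geometry check) might be encoded along these lines; the key names are an assumption, not the actual schema of custom_constraints.yaml:

```yaml
# Sketch only: key names are illustrative, not the file's actual schema.
- table: sites
  rule: all_or_none
  fields: [lat, lon, masl]
  message: "lat, lon, and masl must all be present together"
```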
Checklist
- devtools::check() with no errors or warnings (will verify in PR #2, "Build MVP datasets: CSV sources in data-raw → exported data objects + Parquet", after the validation layer is added)

Data Changes (if applicable)
Additional Notes
What This Enables
What the Second PR Will Add
How to Review
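As one hypothetical way to spot-check the two layers locally: the sketch below applies a Layer-1 style range check and a Layer-2 style all-or-none check to a single record. The bounds and field names come from the PR description; the function names are illustrative and not part of the PR.

```python
def check_ranges(record):
    """Layer-1 style checks: numeric bounds as in datapackage.json."""
    errors = []
    lat, lon = record.get("lat"), record.get("lon")
    if lat is not None and not -90 <= lat <= 90:
        errors.append("lat out of range [-90, 90]")
    if lon is not None and not -180 <= lon <= 180:
        errors.append("lon out of range [-180, 180]")
    return errors


def check_geometry_completeness(record):
    """Layer-2 style check: lat, lon, masl together or not at all."""
    present = [record.get(k) is not None for k in ("lat", "lon", "masl")]
    if any(present) and not all(present):
        return ["lat, lon, and masl must all be present together"]
    return []


record = {"lat": 40.1, "lon": -88.2}       # masl missing
print(check_ranges(record))                # -> []
print(check_geometry_completeness(record))
```

A real external tool would drive the same kind of checks from the machine-readable files rather than hard-coding them.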