I am pasting here the documentation from CRM-6335, in case there is no other copy (passwords omitted).
We now have a central database for OC4IDS data and a set of queries to assess data quality.
To load data into the database, use the OC4IDS Database - Import Data notebook. The database follows similar conventions to Kingfisher Process and Views, but with fewer features. It has the following tables:
- collection - equivalent to the collection table in Kingfisher Process with one row per collection.
- projects - equivalent to the join of the release and data tables in Kingfisher Process with one row per project.
- collection_check - similar to release_check in Kingfisher Process, but with one row per collection.
- field_counts - equivalent to field_counts in Kingfisher Views.
- oc4ids_schema - a flattened version of the OC4IDS schema, for use in coverage queries.
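As a quick sanity check after an import, the row counts of the tables listed above can be inspected. The following is a minimal sketch that only builds the SQL strings. The table names come from the list above, while the helper names `row_count_query` and `row_count_queries` are hypothetical, and it assumes the tables are reachable on the default search path without a schema prefix.

```python
# Table names as listed in this issue; nothing else about the schema is assumed.
TABLES = (
    "collection",
    "projects",
    "collection_check",
    "field_counts",
    "oc4ids_schema",
)


def row_count_query(table):
    """Return a row-count query for one of the known tables.

    The table name is checked against TABLES so that arbitrary
    strings are never interpolated into SQL.
    """
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    return f"SELECT count(*) FROM {table}"


def row_count_queries():
    """Return a dict mapping each known table to its row-count query."""
    return {table: row_count_query(table) for table in TABLES}
```

Each query can be pasted into a notebook cell or passed to a database cursor. Since `projects` has one row per project and `collection` has one row per collection, the two counts give a rough picture of how much data has been loaded.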
To analyze data and prepare feedback, use the OC4IDS Data Feedback Notebook. The queries in the notebook cover:
- Scope
- Structure
- Format
- Conformance
- Coherence
- Basic usability checks.
To query the database, use the following details for the read-only user:
Host: oc4ids-database-2.cuujgua4wses.us-east-1.rds.amazonaws.com
Port: 5432 (default)
User: readonly
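For anyone connecting from Python rather than the notebooks, the details above could be used roughly as follows. This is a sketch, not a tested recipe. The host, port and user are taken from this issue. The database name and password are not given here, so the sketch reads them from the environment variables `OC4IDS_DB_NAME` and `OC4IDS_DB_PASSWORD`, which are hypothetical names. It also assumes the third-party `psycopg2` package is installed.

```python
import os

# Connection details for the read-only user, as given in this issue.
HOST = "oc4ids-database-2.cuujgua4wses.us-east-1.rds.amazonaws.com"
PORT = 5432
USER = "readonly"


def connection_params(dbname, password):
    """Return keyword arguments suitable for psycopg2.connect()."""
    return {
        "host": HOST,
        "port": PORT,
        "user": USER,
        "dbname": dbname,
        "password": password,
    }


def connect():
    """Open a connection as the read-only user.

    The environment variable names are assumptions made for this
    sketch; the database name and password are not part of this issue.
    """
    import psycopg2  # third-party; imported here so the helper above works without it

    params = connection_params(
        os.environ["OC4IDS_DB_NAME"],
        os.environ["OC4IDS_DB_PASSWORD"],
    )
    return psycopg2.connect(**params)
```

Keeping the password in the environment rather than in the notebook avoids pasting it into shared documents.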
Things to note:
- The Data Feedback Notebook is a work in progress. Some queries require testing, and most queries still need documentation.
- Don't rely on this database for anything mission-critical. It's a hack rather than a proper piece of software, and we don't have any backups in place.
- The postgres user used in the Import Data notebook has full permissions. Use the following connection details if you need them:
Host: oc4ids-database-2.cuujgua4wses.us-east-1.rds.amazonaws.com
Port: 5432 (default)
User: postgres
In the event we need to build a fresh copy of the database on a new server, I've documented the steps in the OC4IDS Database - Setup notebook.