Tool suite for managing field-collected scientific datasets: staging and committing data in the field, duplicating across drives for redundancy, and pushing completed datasets to the E4E NAS.
Available as both a command-line tool (e4edm) and a cross-platform desktop GUI.
Download and install the latest e4edm release from the GitHub releases page.
Download the latest E4EDataManagement release for your platform (Windows, macOS, Linux) from the GitHub releases page. The GUI reads and writes the same configuration and dataset state as the CLI, so both tools can be used interchangeably.
e4edm init dataset --date DATE --project PROJECT --location LOCATION [--path DIRECTORY]
e4edm init mission --timestamp TIMESTAMP --device DEVICE --country COUNTRY
--region REGION --site SITE --name MISSION [--message NOTES]
e4edm status
e4edm activate DATASET [--day DAY] [--mission MISSION] [--root_dir ROOT_DIR]
e4edm add paths... [--readme] [--start START] [--end END] [--destination DESTINATION]
e4edm commit [--readme]
e4edm duplicate paths...
e4edm validate [root_dir]
e4edm push path
e4edm list dataset
e4edm list mission DATASET
e4edm prune
e4edm ls path
e4edm rm mission MISSION [--dataset DATASET]
e4edm config parameter [value]
e4edm reset
e4edm init dataset --date 2024-03-15 --project ReefLaser --location Palmyra
--date accepts an ISO 8601 date (YYYY-MM-DD) or the literal today.
--path overrides the default dataset root directory (configurable via e4edm config dataset_dir).
The dataset is created in a folder named YYYY.MM.DD.PROJECT.LOCATION and is automatically set as the active dataset.
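The naming rule above can be sketched in a few lines of Python. This is an illustration only, not the tool's own code: the helper names parse_date_arg and dataset_folder_name are hypothetical, and the sketch assumes the project and location strings are used verbatim.

```python
from datetime import date


def parse_date_arg(value: str) -> date:
    """Accept an ISO 8601 date (YYYY-MM-DD) or the literal 'today'."""
    if value == "today":
        return date.today()
    return date.fromisoformat(value)


def dataset_folder_name(day: date, project: str, location: str) -> str:
    """Build a folder name following the YYYY.MM.DD.PROJECT.LOCATION pattern."""
    return f"{day:%Y.%m.%d}.{project}.{location}"


if __name__ == "__main__":
    day = parse_date_arg("2024-03-15")
    print(dataset_folder_name(day, "ReefLaser", "Palmyra"))
    # prints 2024.03.15.ReefLaser.Palmyra
```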
e4edm status
Shows the active dataset, active mission, staged files, and committed file counts.
e4edm init mission --timestamp 2024-03-15T09:00:00-08:00 \
--device reef-laser-01 \
--country US --region Pacific --site "North Beach" \
--name "Morning Survey"
The mission is created inside the active dataset and is automatically set as the active mission. If the dataset already has location history, --country, --region, and --site default to the previously used values.
# Single file
e4edm add /path/to/video.mp4
# Multiple files or wildcards
e4edm add /data/raw/*.jpg
# Entire directory
e4edm add /data/raw/
# Time-filtered files
e4edm add --start 2024-03-15T09:00 --end 2024-03-15T10:00 /data/raw/
# Into a subdirectory within the mission
e4edm add --destination video /path/to/video.mp4
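For readers wondering what a time-filtered add might select, here is a minimal Python sketch. It assumes the filter is applied to each file's modification time with inclusive bounds; the source does not state which timestamp e4edm actually uses, and files_in_window is a hypothetical helper, not part of the tool.

```python
from datetime import datetime
from pathlib import Path
from typing import Iterator, Optional


def files_in_window(
    root: Path, start: Optional[datetime], end: Optional[datetime]
) -> Iterator[Path]:
    """Yield files under root whose modification time lies in [start, end].

    Assumption: modification time is the relevant timestamp. Naive
    datetimes are interpreted in the local time zone.
    """
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        mtime = datetime.fromtimestamp(path.stat().st_mtime)
        if start is not None and mtime < start:
            continue
        if end is not None and mtime > end:
            continue
        yield path
```

Passing None for either bound leaves that side of the window open, mirroring the fact that --start and --end are independent options.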
e4edm commit
Copies all staged files into the dataset directory and verifies the copy. Run e4edm status afterwards to confirm.
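A copy-then-verify step of this kind can be sketched as follows. This is a hedged illustration, assuming verification means comparing SHA-256 digests of the source and the copy; the source does not describe how e4edm verifies, and copy_and_verify is a hypothetical helper.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large recordings need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination_dir: Path) -> Path:
    """Copy source into destination_dir and confirm the copy matches."""
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Copy verification failed for {source}")
    return target
```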
Every dataset requires a README (Markdown or .docx) at the dataset level before it can be pushed.
e4edm add --readme /path/to/README.md
e4edm commit --readme
e4edm push /Volumes/E4E-NAS/deployments
Validates all files and completeness criteria, then copies the dataset to the destination. The dataset is then marked as pushed, which makes it eligible for removal the next time e4edm prune is run.
Remove datasets that have been pushed or whose directories no longer exist:
e4edm prune
Remove a specific mission:
e4edm rm mission "Morning Survey"
e4edm rm mission "Morning Survey" --dataset 2024.03.15.ReefLaser.Palmyra
Launch E4EDataManagement to open the GUI. It shares the same configuration directory as the CLI.
Toolbar buttons:
| Button | Action |
|---|---|
| New Dataset | Opens a form to create a new dataset (date, project, location, directory) |
| New Mission | Opens a form to create a mission in the active dataset (enabled when a dataset is active) |
| Add Files | Opens a file picker to stage files into the active mission (enabled when a mission is active) |
| Refresh | Reloads dataset and mission state from disk |
| Validate | Runs validation on the active dataset and displays any failures |
| Prune | Removes pushed and missing datasets |
| Push | Pushes the active dataset to the specified destination path |
The left panel lists all known datasets. A bold name indicates the active dataset; ✓ means it has been pushed. Clicking a dataset shows its missions and file counts in the detail pane.
Prerequisites: uv, Rust, .NET 10 SDK
# Python / CLI
uv sync --dev
uv run maturin develop # compiles the Rust extension for Python
uv run pytest # Python tests
uv run pylint e4e_data_management
# Desktop GUI
dotnet build E4EDataManagement.UI
dotnet test E4EDataManagement.Tests
# Rust (standalone, no Python)
cargo test --lib --no-default-features
cargo clippy --no-default-features
The Rust crate is built in two modes:
- With python feature (default): compiled by maturin as a Python extension module, providing the _core C API used by the Python CLI.
- Without python feature (--no-default-features): compiled as a plain cdylib for P/Invoke by the .NET desktop UI, with no Python ABI symbols.
YYYY.MM.DD.PROJECT.LOCATION/
    ED-00/
        Mission 1/
            [data files]
            manifest.json
            metadata.json
        Mission 2/
            [data files]
            manifest.json
            metadata.json
    ED-01/
        Mission 3/
            ...
    ...
    .e4edm.db
    manifest.json
    readme.md (or readme.docx)
.e4edm.db is the SQLite database that the E4E Data Management tool uses to track staged and committed files, mission records, and dataset state. It is working state rather than part of the dataset: leave it in place while the dataset is in use, but do not submit or archive it.
Each mission directory, as well as the dataset root, contains a manifest.json with SHA-256 hashes of all committed files, used to verify data integrity during push.
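To make the integrity check concrete, the sketch below builds and re-checks a manifest in Python. The manifest layout here, a JSON object mapping relative POSIX paths to hex digests, is an assumption made for illustration; the real manifest.json schema is not specified in this document, and build_manifest and find_mismatches are hypothetical names.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file under root (except manifest.json) to its digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file() and path.name != "manifest.json"
    }


def find_mismatches(root: Path, manifest: Dict[str, str]) -> List[str]:
    """List manifest entries that are missing or whose digest changed."""
    bad = []
    for relative, expected in manifest.items():
        candidate = root / relative
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(relative)
    return bad


def write_manifest(root: Path) -> None:
    """Store the manifest next to the files it describes (assumed layout)."""
    (root / "manifest.json").write_text(json.dumps(build_manifest(root), indent=2))
```

An empty list from find_mismatches means every recorded file is present and unchanged, which is the condition a push would want to hold before copying.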
Each mission contains a metadata.json with the mission metadata: timestamp, device, country, region, site, mission name, and notes.
Tool configuration (active dataset, dataset directory, schema version) is stored in a SQLite database (config.db) at:
| Platform | Path |
|---|---|
| Linux / macOS | $XDG_CONFIG_HOME/E4EDataManagement/ (defaults to ~/.config/E4EDataManagement/) |
| Windows | %LOCALAPPDATA%\Engineers for Exploration\E4EDataManagement\ |
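The lookup in the table can be expressed as a short Python function, which is handy when a script needs to locate config.db. This is a sketch based only on the paths listed above, not the tool's own resolution code; config_dir is a hypothetical helper, and treating an empty XDG_CONFIG_HOME as unset is an assumption.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def config_dir(
    platform: str = sys.platform, env: Optional[Mapping[str, str]] = None
) -> Path:
    """Resolve the configuration directory described in the table above."""
    env = os.environ if env is None else env
    if platform.startswith("win"):
        # Windows: %LOCALAPPDATA%\Engineers for Exploration\E4EDataManagement\
        return (
            Path(env["LOCALAPPDATA"])
            / "Engineers for Exploration"
            / "E4EDataManagement"
        )
    # Linux / macOS: $XDG_CONFIG_HOME, falling back to ~/.config
    base = env.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / "E4EDataManagement"


if __name__ == "__main__":
    print(config_dir() / "config.db")
```

The platform and env parameters exist only so the function can be exercised without changing the real environment.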