Open development of genomic language models — data, modeling, and evaluation.
Development is driven by experiments tracked as GitHub issues.
### YOLO runs

| Experiment | Status |
|---|---|
| #21 Promoters YOLO run | Closed - matches Evo 2 on promoter VEP but still behind GPN-Star |
| #22 mRNA + promoters YOLO run | Closed - combined model consumed by coding regions; poor on promoter variants |
| #27 CDS YOLO run | Closed - matches Evo 2 on missense variants but falls behind GPN-Star |
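The VEP comparisons above score a variant by how much less likely the model finds the alternate allele than the reference allele. A minimal sketch of that log-likelihood-ratio scoring (the function name and probabilities are illustrative, not the repo's actual API):

```python
import math

def vep_score(p_ref: float, p_alt: float) -> float:
    """Log-likelihood ratio of alt vs. ref allele at the variant position.

    More negative means the model finds the alternate allele less
    plausible, i.e. the variant is predicted to be more deleterious.
    """
    return math.log(p_alt) - math.log(p_ref)

# A benign variant: the model assigns similar probability to both alleles.
benign = vep_score(p_ref=0.40, p_alt=0.35)

# A deleterious variant: the alternate allele is far less likely.
deleterious = vep_score(p_ref=0.60, p_alt=0.01)

assert deleterious < benign < 0
```

Benchmarks such as the promoter and missense comparisons typically rank variants by this score and report a correlation or AUC against curated labels.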
### Training data

| Experiment | Status |
|---|---|
| #41 Promoters from mRNA vs. ncRNA | Closed - adding ncRNA promoters shows no significant difference in VEP performance |
| #13 Mixing different genomic regions | Closed - balanced mixing gives balanced performance; proportional mixing dominated by CDS |
| #53 Alternative datasets based on distance from CDS | Closed - used a distance-based heuristic (à la SpeciesLM) instead of UTR annotations |
| #9 Repeat downweighting | Closed - downweighting repetitive elements improves VEP and stabilizes training |
| #42 Promoter radius | Closed - smaller radius performs better; expanding to ±2kb degrades performance |
| #43 Mixing 5 different regions | Closed - CDS, promoters, and 5' UTR learn well; 3' UTR and ncRNA show limited improvement |
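The balanced-vs-proportional distinction in #13 comes down to how sampling weights are assigned across region datasets. A hypothetical sketch (region names and sizes are made up for illustration, not the actual dataset):

```python
# Illustrative region sizes in base pairs; CDS dominates by volume.
region_sizes = {"CDS": 40_000_000, "promoter": 5_000_000, "5utr": 2_000_000}

def proportional_weights(sizes: dict) -> dict:
    """Sample each region in proportion to its total size."""
    total = sum(sizes.values())
    return {name: size / total for name, size in sizes.items()}

def balanced_weights(sizes: dict) -> dict:
    """Sample every region equally, regardless of size."""
    return {name: 1 / len(sizes) for name in sizes}

w_prop = proportional_weights(region_sizes)  # CDS takes the large majority
w_bal = balanced_weights(region_sizes)       # each region gets an equal share
```

Under proportional weighting the largest region (CDS here) dominates the training mixture, which matches the reported outcome; balanced weighting trades some CDS performance for more even coverage.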
### Evolutionary timescales

| Experiment | Status |
|---|---|
| #55 Promoters from different evolutionary timescales | Closed - mammals-trained model reaches good VEP performance fastest |
| #58 CDS from different evolutionary timescales | Closed - longer timescales (animals) perform better for missense variants |
| #59 Downstream regions from different evolutionary timescales | Closed - mammals trains fastest but all timescales converge with sufficient training |
### Training objectives

| Experiment | Status |
|---|---|
| #3 Different training objectives | Closed - CLM appears to do better than MLM at initial steps |
### Context size

| Experiment | Status |
|---|---|
| #37 Context size | Closed - 256bp and 512bp contexts perform similarly on VEP |
### Scaling

| Experiment | Status |
|---|---|
| #57 Scaling on a mixture dataset | Open |
### Analysis

| Experiment | Status |
|---|---|
| #8 Understand relationship between perplexity and other metrics | Open |
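As background for #8: perplexity is the exponential of the mean per-token cross-entropy, so for a 4-letter DNA alphabet a model with no signal sits at perplexity 4. A small self-contained sketch of the computation:

```python
import math

def perplexity(token_nlls: list[float]) -> float:
    """Perplexity from per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model that is fully uncertain over {A, C, G, T} assigns p = 0.25
# to every token, giving perplexity exactly 4.
uniform_nll = -math.log(0.25)
assert abs(perplexity([uniform_nll] * 10) - 4.0) < 1e-9
```

Whether lower perplexity translates into better VEP or other downstream metrics is exactly the open question this experiment tracks.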
## Development

```sh
# Install dependencies
uv sync

# Install dev dependencies and pre-commit hooks
uv sync --group dev
uv run pre-commit install

# Run quality checks (ruff format/lint, snakefmt)
uv run pre-commit run

# Run tests
uv run pytest
```

## Repository structure

- `src/bolinas/` - Main Python package
  - `data/` - Genomic data structures (`GenomicSet`, etc.)
  - `evals/` - Evaluation utilities (inference, metrics, plotting)
- `snakemake/` - Snakemake workflows
  - `training_dataset/` - Creates genomic training datasets from NCBI RefSeq genomes
  - `evals/` - Downloads and processes evaluation datasets
  - `analysis/evals_v1/` - Evaluates trained models on variant effect prediction tasks
- `tests/` - Test suite
