omilayers is a Python data management library. It is designed for multi-omic data analysis (hence the omi prefix), which involves handling diverse datasets usually referred to as omic layers. omilayers wraps the APIs of SQLite and DuckDB and provides a high-level interface for frequent, repetitive tasks that require fast storage, processing and retrieval of data, without the need to constantly write SQL queries.
The rationale behind omilayers is the following:
- User stores layers of omic data (tables in SQL lingo).
- User creates new layers by processing and restructuring existing layers.
- User can group layers using tags.
- User can store a brief description for each layer.
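The four points above can be pictured with a minimal sketch written against the standard-library sqlite3 module. It is not how omilayers is implemented: the table names, the layer_metadata bookkeeping table and the layers_with_tag helper are all hypothetical and exist only to illustrate layers, derived layers, tags and descriptions.

```python
import sqlite3

# Hypothetical illustration only: this is NOT how omilayers stores its data.
con = sqlite3.connect(":memory:")

# A "layer" is simply a table holding one omic dataset.
con.execute("CREATE TABLE expression (sample TEXT, gene TEXT, counts REAL)")
con.executemany(
    "INSERT INTO expression VALUES (?, ?, ?)",
    [("s1", "TP53", 10.0), ("s1", "BRCA1", 4.0), ("s2", "TP53", 30.0)],
)

# A new layer is created by processing and restructuring an existing one.
con.execute(
    "CREATE TABLE expression_per_sample AS "
    "SELECT sample, SUM(counts) AS total_counts FROM expression GROUP BY sample"
)

# Tags and descriptions are kept in a bookkeeping table (assumed name and schema).
con.execute("CREATE TABLE layer_metadata (layer TEXT PRIMARY KEY, tag TEXT, info TEXT)")
con.executemany(
    "INSERT INTO layer_metadata VALUES (?, ?, ?)",
    [
        ("expression", "raw", "Gene-level counts per sample"),
        ("expression_per_sample", "processed", "Total counts per sample"),
    ],
)


def layers_with_tag(connection, tag):
    """Return the names of all layers carrying the given tag, sorted by name."""
    rows = connection.execute(
        "SELECT layer FROM layer_metadata WHERE tag = ? ORDER BY layer", (tag,)
    )
    return [row[0] for row in rows]


processed_layers = layers_with_tag(con, "processed")
totals = dict(con.execute("SELECT sample, total_counts FROM expression_per_sample"))
```

Every step here needs its own SQL statement, which is the kind of repetition a higher-level interface is meant to remove.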
Although SQL is a straightforward language, writing it can become quite tedious when the same queries need to be repeated many times. Since data analysis involves highly repetitive procedures, a user would otherwise need to create functions to abstract away the process of writing SQL queries. The aim of omilayers is to provide this level of abstraction to facilitate bioinformatic data analysis. The omilayers API resembles the pandas API, and the user needs to write the following code to retrieve a column named foo from a layer called omicdata:
With DuckDB (default database):
from omilayers import Omilayers
omi = Omilayers("dbname.duckdb")
result = omi.layers['omicdata']['foo']
With SQLite:
from omilayers import Omilayers
omi = Omilayers("dbname.sqlite", engine="sqlite")
result = omi.layers['omicdata']['foo']
To install omilayers:
pip install omilayers
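For comparison with the omilayers calls shown earlier, here is a rough sketch of the kind of hand-written helper a user might otherwise maintain to fetch a single column. It uses only the standard-library sqlite3 module; the fetch_column function and the in-memory example table are assumptions made for illustration and are not part of omilayers.

```python
import sqlite3


def fetch_column(connection, table, column):
    """Return all values of one column from one table as a list."""
    # Identifiers cannot be bound as SQL parameters, so they are quoted instead.
    # Only trusted table and column names should be passed to a helper like this.
    query = f'SELECT "{column}" FROM "{table}"'
    return [row[0] for row in connection.execute(query)]


# Small in-memory table standing in for a stored layer.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE omicdata (foo INTEGER, bar TEXT)")
con.executemany("INSERT INTO omicdata VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])

result = fetch_column(con, "omicdata", "foo")
```

A real analysis would need many such helpers (inserting rows, renaming columns, filtering), each with its own SQL string to write and maintain.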
The directory testing includes predefined unit tests for SQLite and DuckDB.
To test the functionality of omilayers with SQLite:
python -m unittest -v tests_sqlite.py
To test the functionality of omilayers with DuckDB:
python -m unittest -v tests_duckdb.py
The directory synthetic_data includes two Jupyter notebooks (one for SQLite and one for DuckDB) for testing omilayers using synthetic multi-omic data. It also includes the Python script create_synthetic_vcf/synthesize_vcf.py that was used to create the synthetic VCF that is hosted on Zenodo.
The synthetic VCF can be recreated as follows:
for i in {1..22} {X,Y,M}; do python synthesize_vcf.py $i; done
To join the generated VCFs into a single VCF:
for i in {1..22} {X,Y,M}; do cat chr${i}.vcf >> simulated.vcf; done
You can read the full documentation here: https://omilayers.readthedocs.io
To cite omilayers in your study you may include the following citation:
Kioroglou D. Omilayers: a Python package for efficient data management to support multi-omic analysis. BMC Bioinformatics. 2025 Feb 6;26:40.
BibTeX entry:
@article{kioroglou2025omilayers,
title={Omilayers: a Python package for efficient data management to support multi-omic analysis},
author={Kioroglou, Dimitrios},
journal={BMC Bioinformatics},
volume={26},
pages={40},
year={2025}
}