# bio-data-to-db: make Uniprot PostgreSQL database

[![image](https://img.shields.io/pypi/v/bio-data-to-db.svg)](https://pypi.python.org/pypi/bio-data-to-db)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/bio-data-to-db)](https://pypi.python.org/pypi/bio-data-to-db)
[![image](https://img.shields.io/pypi/l/bio-data-to-db.svg)](https://pypi.python.org/pypi/bio-data-to-db)
Written in Rust, thus equipped with extremely fast parsers. Packaged for Python.

So far, only one function is implemented: **convert Uniprot data to PostgreSQL**. This package focuses on parsing the data and inserting it into the database rather than on curating it.

[📚 Documentation](https://deargen.github.io/bio-data-to-db/)

## 🛠️ Installation

```bash
pip install bio-data-to-db
```

You can use the command line interface or the Python API.

### Uniprot

```bash
# It will create a db 'uniprot' and a table named 'public.uniprot_info' in the database.
# If you want another name, you can optionally pass it as the last argument.
```

```python
create_accession_to_pk_id_table("postgresql://user:password@localhost:5432/uniprot")
keywords_tsv_to_postgresql("~/Downloads/keywords_all_2024_06_26.tsv", "postgresql://user:password@localhost:5432/uniprot")
```
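As the name suggests, `create_accession_to_pk_id_table` builds a lookup from each UniProt accession to the integer primary key of its row. A toy pure-Python sketch of that mapping, with made-up accessions (the real function builds the table inside PostgreSQL):

```python
# Hypothetical accessions, in the order their rows would receive primary keys.
accessions = ["P12345", "Q67890", "A0A0B4J2F0"]

# Map each accession to a 1-based primary-key id, mirroring the lookup table.
accession_to_pk_id = {acc: pk for pk, acc in enumerate(accessions, start=1)}

print(accession_to_pk_id["Q67890"])  # 2
```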

### BindingDB

```bash
# Decode HTML entities and strip the strings in the `assay` table (columns: description and assay_name).
# Currently, only the assay table is supported.
bio-data-to-db bindingdb fix-table assay 'mysql://username:password@localhost/bind'
```

```python
from bio_data_to_db.bindingdb.fix_tables import fix_assay_table

fix_assay_table("mysql://username:password@localhost/bind")
```
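The fix applied to the `assay` table boils down to HTML-entity decoding plus whitespace stripping. A stand-alone Python illustration of that transformation (the sample string is made up, and the package's actual implementation may differ):

```python
import html

def clean_text(value: str) -> str:
    """Decode HTML entities and strip surrounding whitespace."""
    return html.unescape(value).strip()

# A description as it might look before fixing.
raw = "  Inhibition of COX-2 &amp; COX-1 at 10 &micro;M  "
print(clean_text(raw))  # Inhibition of COX-2 & COX-1 at 10 µM
```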

### PostgreSQL Helpers, SMILES, Polars utils and more

Some useful functions to work with PostgreSQL.

```python
from bio_data_to_db.utils.postgresql import (
    create_db_if_not_exists,
    create_schema_if_not_exists,
    set_column_as_primary_key,
    make_columns_unique,
    make_large_columns_unique,
    split_column_str_to_list,
    polars_write_database,
)

from bio_data_to_db.utils.smiles import (
    canonical_smiles_wo_salt,
    polars_canonical_smiles_wo_salt,
)

from bio_data_to_db.utils.polars import (
    w_pbar,
)
```

You can find the usage in the [📚 documentation](https://deargen.github.io/bio-data-to-db/).
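As a rough intuition for `canonical_smiles_wo_salt`: a salted SMILES is a `.`-separated multi-fragment string, and removing the salt keeps the main (largest) fragment. The toy sketch below uses plain string handling only; the real function additionally canonicalises the result with a proper cheminformatics toolkit:

```python
def largest_fragment(smiles: str) -> str:
    """Keep the largest '.'-separated fragment (toy salt removal only)."""
    return max(smiles.split("."), key=len)

print(largest_fragment("CC(=O)O.Cl"))  # CC(=O)O  (drops the HCl salt)
```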

## 👨‍💻 Maintenance Notes

### Install from source

```bash
bash scripts/install.sh
uv pip install -r deps/requirements_dev.in
```

### Generate lockfiles

Use GitHub Actions: `apply-pip-compile.yml`. Manually launch the workflow and it will make a commit with the updated lockfiles.

### Publish a new version to PyPI

Use GitHub Actions: `deploy.yml`. Manually launch the workflow and it will compile on all architectures and publish the new version to PyPI.

### About sqlx

Sqlx offline mode should be configured so you can compile the code without a database present.
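For reference, the usual sqlx offline workflow with `sqlx-cli` looks roughly like the commands below. These are sqlx's generic tooling commands, not necessarily this repo's exact scripts, and the connection URL is a placeholder:

```shell
# Install the sqlx CLI (one-time).
cargo install sqlx-cli

# With a live database available, cache query metadata for offline builds.
DATABASE_URL='postgresql://user:password@localhost:5432/uniprot' cargo sqlx prepare

# Afterwards, compile without a database by forcing offline mode.
SQLX_OFFLINE=true cargo build
```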