Commands to download the DBpedia Knowledge Graphs generated by Live Fusion. DBpedia Live Fusion publishes two different kinds of KGs:
- Open Core Knowledge Graphs under the CC-BY-SA license: open with copyleft/share-alike, no registration needed
- Industry Knowledge Graphs under the BUSL 1.1 license: unrestricted for research and experimentation, commercial license required for productive use, free registration needed
- If you do not have a DBpedia account yet (Forum/Databus), please register at https://account.dbpedia.org
- Log in at https://account.dbpedia.org and create your token.
- Save the token to a file named vault-token.dat.
The databus-python-client can be run via Docker or installed as a Python package; both follow these patterns. $DOWNLOADTARGET can be any Databus URI, including collections, or a SPARQL query (or several thereof). Details are documented below.
# Docker
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download $DOWNLOADTARGET --token vault-token.dat
# Python
python3 -m pip install databusclient
databusclient download $DOWNLOADTARGET --token vault-token.dat

TODO One slogan sentence. More information

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-kg-snapshot --token vault-token.dat

DBpedia Wikipedia Extraction Enriched
TODO One slogan sentence and link
Currently EN DBpedia only.

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/dbpedia-wikipedia-kg-enriched-snapshot --token vault-token.dat

DBpedia Wikidata Extraction Enriched
TODO One slogan sentence and link

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/dbpedia-wikidata-kg-enriched-snapshot --token vault-token.dat

TODO One slogan sentence and link

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/dbpedia-wikipedia-kg-snapshot

TODO One slogan sentence and link

docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/dbpedia-wikidata-kg-snapshot

A docker image is available at dbpedia/databus-python-client. See the download section for details.
Installation

python3 -m pip install databusclient

Running

databusclient --help

Usage: databusclient [OPTIONS] COMMAND [ARGS]...
Options:
--install-completion [bash|zsh|fish|powershell|pwsh]
Install completion for the specified shell.
--show-completion [bash|zsh|fish|powershell|pwsh]
Show completion for the specified shell, to
copy it or customize the installation.
--help Show this message and exit.
Commands:
deploy
download

databusclient download --help

Usage: databusclient download [OPTIONS] DATABUSURIS...

Download datasets from databus, optionally using vault access if vault options are provided.

Arguments:
DATABUSURIS... databus uris to download from https://databus.dbpedia.org, or a query statement that returns databus uris from https://databus.dbpedia.org/sparql to be downloaded [required]
Options:
--localdir TEXT Local databus folder (if not given, databus folder
structure is created in current working directory)
--databus TEXT Databus URL (if not given, inferred from databusuri, e.g.
https://databus.dbpedia.org/sparql)
--token TEXT Path to Vault refresh token file
--authurl TEXT Keycloak token endpoint URL [default:
https://auth.dbpedia.org/realms/dbpedia/protocol/openid-
connect/token]
--clientid TEXT Client ID for token exchange [default: vault-token-
exchange]
--help Show this message and exit.
Examples of using download command
File: download of a single file
databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2
Version: download of all files of a specific version
databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01
Artifact: download of all files with latest version of an artifact
databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals
Group: download of all files with latest version of all artifacts of a group
databusclient download https://databus.dbpedia.org/dbpedia/mappings
If no --localdir is provided, the current working directory is used as base directory. The downloaded files will be stored in the working directory in a folder structure according to the databus structure, i.e. ./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/.
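The folder mapping can be sketched in a few lines of Python (an illustrative helper, not databusclient's actual code):

```python
# Sketch: derive the local layout ./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/
# from a Databus file URI. Illustrative only, not part of databusclient.
from urllib.parse import urlparse

def local_path_for(databus_file_uri: str) -> str:
    """Map a Databus file URI onto the ./ACCOUNT/GROUP/ARTIFACT/VERSION/ layout."""
    parts = urlparse(databus_file_uri).path.strip("/").split("/")
    account, group, artifact, version, filename = parts[:5]
    return "/".join([".", account, group, artifact, version, filename])

print(local_path_for(
    "https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2"
))
# -> ./dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2
```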
Collection: download of all files within a collection
databusclient download https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-12
Query: download of all files returned by a query (sparql endpoint must be provided with --databus)
databusclient download 'PREFIX dcat: <http://www.w3.org/ns/dcat#> SELECT ?x WHERE { ?sub dcat:downloadURL ?x . } LIMIT 10' --databus https://databus.dbpedia.org/sparql
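Conceptually, query mode works on SPARQL JSON results: every URI bound in a result row is treated as a file to download. A sketch with a hand-written sample response (not live data; the helper is illustrative, not databusclient internals):

```python
# Sketch: extracting download URLs from a SPARQL SELECT JSON response.
# sample_response is hand-written example data, not output of a live query.
sample_response = {
    "results": {
        "bindings": [
            {"x": {"type": "uri", "value": "https://example.org/file1.ttl.bz2"}},
            {"x": {"type": "uri", "value": "https://example.org/file2.ttl.bz2"}},
        ]
    }
}

def extract_download_urls(response: dict) -> list:
    """Collect all URI values bound in each result row."""
    urls = []
    for binding in response["results"]["bindings"]:
        for cell in binding.values():
            if cell.get("type") == "uri":
                urls.append(cell["value"])
    return urls

print(extract_download_urls(sample_response))
# -> ['https://example.org/file1.ttl.bz2', 'https://example.org/file2.ttl.bz2']
```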
databusclient deploy --help
Usage: databusclient deploy [OPTIONS] [DISTRIBUTIONS]...
Flexible deploy to databus command:
- Classic dataset deployment
- Metadata-based deployment
- Upload & deploy via Nextcloud
Arguments:
DISTRIBUTIONS... Depending on mode:
- Classic mode: List of distributions in the form
URL|CV|fileext|compression|sha256sum:contentlength
(where URL is the download URL and CV the key=value pairs,
separated by underscores)
- Upload mode: List of local file or folder paths (must exist)
- Metadata mode: None
Options:
--version-id TEXT Target databus version/dataset identifier of the form <h
ttps://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VE
RSION> [required]
--title TEXT Dataset title [required]
--abstract TEXT Dataset abstract max 200 chars [required]
--description TEXT Dataset description [required]
--license TEXT License (see dalicc.net) [required]
--apikey TEXT API key [required]
--metadata PATH Path to metadata JSON file (for metadata mode)
--webdav-url TEXT WebDAV URL (e.g.,
https://cloud.example.com/remote.php/webdav)
--remote TEXT rclone remote name (e.g., 'nextcloud')
--path TEXT Remote path on Nextcloud (e.g., 'datasets/mydataset')
--help Show this message and exit.
Examples of using the deploy command:
databusclient deploy --version-id https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 --title title1 --abstract abstract1 --description description1 --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
databusclient deploy --version-id https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18 --title "Client Testing" --abstract "Testing the client...." --description "Testing the client...." --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
A few more notes for CLI usage:
- The content variants can be left out ONLY IF there is just one distribution
- For complete inference, just use the plain URL:
https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml
- If other parameters are used, you need to leave the unused fields empty, like:
https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml||yml|7a751b6dd5eb8d73d97793c3c564c71ab7b565fa4ba619e4a8fd05a6f80ff653:367116
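The classic-mode distribution argument can also be assembled programmatically. A minimal sketch, assuming the five-field URL|CV|fileext|compression|sha256sum:contentlength layout from the help text (the helper is illustrative, not part of databusclient):

```python
# Sketch: build a classic-mode distribution argument of the form
# URL|CV|fileext|compression|sha256sum:contentlength, where CV is
# underscore-separated key=value pairs. Unused fields stay empty.
def build_distribution_arg(url, cvs=None, fileext="", compression="",
                           sha256="", length=None):
    cv = "_".join(f"{k}={v}" for k, v in (cvs or {}).items())
    checksum = f"{sha256}:{length}" if sha256 and length is not None else ""
    return "|".join([url, cv, fileext, compression, checksum]).rstrip("|")

print(build_distribution_arg(
    "https://example.org/data.csv.bz2",
    cvs={"lang": "en"}, fileext="csv", compression="bz2",
    sha256="abc123", length=1024,
))
# -> https://example.org/data.csv.bz2|lang=en|csv|bz2|abc123:1024
```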
Use a JSON metadata file to define all distributions. The metadata.json should list all distributions and their metadata. All files referenced there will be registered on the Databus.
databusclient deploy \
--metadata /home/metadata.json \
--version-id https://databus.org/user/dataset/version/1.0 \
--title "Metadata Deploy Example" \
--abstract "This is a short abstract of the dataset." \
--description "This dataset was uploaded using metadata.json." \
--license https://dalicc.net/licenselibrary/Apache-2.0 \
--apikey "API-KEY"Metadata file structure (file_format and compression are optional):
[
{
"checksum": "0929436d44bba110fc7578c138ed770ae9f548e195d19c2f00d813cca24b9f39",
"size": 12345,
"url": "https://cloud.example.com/remote.php/webdav/datasets/mydataset/example.ttl",
"file_format": "ttl"
},
{
"checksum": "2238acdd7cf6bc8d9c9963a9f6014051c754bf8a04aacc5cb10448e2da72c537",
"size": 54321,
"url": "https://cloud.example.com/remote.php/webdav/datasets/mydataset/example.csv.gz",
"file_format": "csv",
"compression": "gz"
}
]
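Before deploying, the metadata file can be sanity-checked. A minimal sketch that validates the structure shown above (required keys per entry: checksum, size, url; this validator is illustrative, not part of databusclient):

```python
# Sketch: minimal validation of a metadata.json document before deploying.
# file_format and compression are optional per the structure above.
import json

REQUIRED = {"checksum", "size", "url"}

def validate_metadata(text: str) -> list:
    entries = json.loads(text)
    for i, entry in enumerate(entries):
        missing = REQUIRED - entry.keys()
        if missing:
            raise ValueError(f"entry {i} missing keys: {sorted(missing)}")
        if len(entry["checksum"]) != 64:
            raise ValueError(f"entry {i}: checksum must be a 64-char sha256 hex digest")
    return entries

entries = validate_metadata("""[
  {"checksum": "%s", "size": 12345,
   "url": "https://cloud.example.com/remote.php/webdav/datasets/mydataset/example.ttl",
   "file_format": "ttl"}
]""" % ("0" * 64))
print(len(entries))  # 1
```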
Upload local files or folders to a WebDAV/Nextcloud instance and automatically deploy to DBpedia Databus. Rclone is required.
databusclient deploy \
--webdav-url https://cloud.example.com/remote.php/webdav \
--remote nextcloud \
--path datasets/mydataset \
--version-id https://databus.org/user/dataset/version/1.0 \
--title "Test Dataset" \
--abstract "Short abstract of dataset" \
--description "This dataset was uploaded for testing the Nextcloud → Databus pipeline." \
--license https://dalicc.net/licenselibrary/Apache-2.0 \
--apikey "API-KEY" \
./localfile1.ttl \
./data_folder

For downloading files from the vault, you need to provide a vault token. See getting-the-access-refresh-token for details. You can come back here once you have a vault-token.dat file. To use it, just provide the path to the file with --token /path/to/vault-token.dat.
Example:
databusclient download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23 --token vault-token.dat
If vault authentication is required for downloading a file, the client will use the token. If no vault authentication is required, the token will not be used.
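The --authurl and --clientid defaults suggest a standard OAuth2 refresh-token grant against Keycloak. A hedged sketch of the request body such an exchange would POST to the token endpoint (an assumption based on RFC 6749, not on databusclient internals):

```python
# Sketch (assumption): exchanging the saved refresh token for an access token
# via the standard OAuth2 refresh_token grant at the Keycloak token endpoint.
# Field names follow RFC 6749; databusclient's actual request may differ.
from urllib.parse import urlencode

def build_token_request(token_file_content: str,
                        client_id: str = "vault-token-exchange") -> str:
    """Form-encode the refresh_token grant body for the token endpoint."""
    return urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "refresh_token": token_file_content.strip(),
    })

body = build_token_request("eyJhbGciOi...example\n")
print(body)
```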
A docker image is available at dbpedia/databus-python-client. You can use it like this:
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01
If using vault authentication, make sure the token file is available in the container, e.g. by placing it in the current working directory.
docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23/fusion_props=all_subjectns=commons-wikimedia-org_vocab=all.ttl.gz --token vault-token.dat
from databusclient import create_distribution
# create a list
distributions = []
# minimal requirements
# compression and filetype will be inferred from the path
# this will trigger the download of the file to evaluate the shasum and content length
distributions.append(
create_distribution(url="https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml", cvs={"type": "swagger"})
)
# full parameters
# will just place parameters correctly, nothing will be downloaded or inferred
distributions.append(
create_distribution(
url="https://example.org/some/random/file.csv.bz2",
cvs={"type": "example", "realfile": "false"},
file_format="csv",
compression="bz2",
sha256_length_tuple=("7a751b6dd5eb8d73d97793c3c564c71ab7b565fa4ba619e4a8fd05a6f80ff653", 367116)
)
)

A few notes:
- The dict for content variants can be empty ONLY IF there is just one distribution
- There can be no compression if there is no file format
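The two notes above can be written as explicit checks (an illustrative validator, not library code; create_distribution enforces its own rules internally):

```python
# Sketch: the two constraints above as explicit checks. Illustrative only.
def check_distribution(cvs: dict, file_format=None, compression=None,
                       n_distributions=1):
    # content variants may only be empty if there is a single distribution
    if not cvs and n_distributions > 1:
        raise ValueError("content variants may only be empty for a single distribution")
    # compression without a file format is not allowed
    if compression and not file_format:
        raise ValueError("compression requires a file format")
    return True

check_distribution({"type": "example"}, file_format="csv", compression="bz2")  # ok
```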
from databusclient import create_dataset
# minimal way
dataset = create_dataset(
version_id="https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18",
title="Client Testing",
abstract="Testing the client....",
description="Testing the client....",
license_url="http://dalicc.net/licenselibrary/AdaptivePublicLicense10",
distributions=distributions,
)
# with group metadata
dataset = create_dataset(
version_id="https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18",
title="Client Testing",
abstract="Testing the client....",
description="Testing the client....",
license_url="http://dalicc.net/licenselibrary/AdaptivePublicLicense10",
distributions=distributions,
group_title="Title of group1",
group_abstract="Abstract of group1",
group_description="Description of group1"
)

NOTE: all group parameters must be set together; otherwise the group metadata will be ignored.
from databusclient import deploy
# to deploy something you just need the dataset from the previous step and an API key
# API key can be found (or generated) at https://$$DATABUS_BASE$$/$$USER$$#settings
deploy(dataset, "mysterious api key")