Here we describe some additional useful commands to handle IYP dumps.
Once the database is set up, you can load a new dump without
recreating the Docker containers. Place the new dump at dumps/neo4j.dump, delete the
existing database, and run only the loader again:
# If the database is running, stop it.
# docker stop iyp
# Delete the existing database
rm -r data/*
# Run the loader
docker start -i iyp_loader
# Start the database.
docker start iyp

If you made changes to the database and want to dump its contents into a file, you can
use the loader for this. For example, to dump the database into a folder called
backups:
# Directory has to exist or it will be created as root by Docker.
mkdir -p backups
uid="$(id -u)" gid="$(id -g)" docker compose run --rm -i -v "$PWD/backups:/backups" iyp_loader neo4j-admin database dump neo4j --to-path=/backups --verbose --overwrite-destination

This will create a file called neo4j.dump in the backups folder. Note that this
will also overwrite the file if it already exists!
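
Because the dump command overwrites backups/neo4j.dump, it can be useful to move an existing dump aside first. The following is a minimal sketch, not part of IYP: the function name rotate_dump and the timestamped file name pattern are assumptions made for illustration.

```shell
# Hypothetical helper (not part of IYP): rename an existing neo4j.dump to a
# timestamped file so that the next dump does not overwrite it.
rotate_dump() {
    # Directory holding the dump; defaults to "backups" as in the example above.
    rotate_dir="${1:-backups}"
    rotate_file="$rotate_dir/neo4j.dump"
    if [ -f "$rotate_file" ]; then
        # The timestamp is the time of rotation, not the time the dump was created.
        rotate_stamp="$(date +%Y%m%d-%H%M%S)"
        mv -- "$rotate_file" "$rotate_dir/neo4j-$rotate_stamp.dump"
    fi
}
```

Under these assumptions, running rotate_dump backups after mkdir -p backups and before the docker compose run command keeps the previous dump under a name such as neo4j-YYYYMMDD-HHMMSS.dump. If no dump exists yet, the function does nothing.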
To view the logs of the Neo4j container, use the following command:
docker logs -f iyp

Enabling all crawlers will download a lot of data and take multiple days to create a dump.
Clone this repository:
git clone https://github.com/InternetHealthReport/internet-yellow-pages.git
cd internet-yellow-pages

Create a Python environment and install the Python libraries:
python3 -m venv --upgrade-deps .venv
source .venv/bin/activate
pip install -r requirements.txt

Create a configuration file from the example file and add API keys. Note that some crawlers do not work without credentials.
cp config.json.example config.json
# Edit as needed

Create and populate a new database:
python3 create_db.py

See the Neo4j documentation on how to add authentication using Docker secrets and how to add SSL encryption.