⚠ Archived repository: this repository is now archived. It has been superseded by Incremental Maven Crawler.
This is a tool for crawling Maven repositories and gathering Maven coordinates. It can be used for research and education purposes.
- Python 3.5 or newer
- Apache Kafka (optional)
pip install mvncrawler
mvncrawler --p ./maven/ --q q_items.txt --t "fasten.mvn.pkg" --c 5 --l 10
The command above extracts 10 Maven coordinates, the limit set by --l 10.
- Use the `--help` option to see the description of each argument.
- If you do not have a Kafka server on your machine, add the `--no-kafka` option to save the Maven coordinates to a file instead.
- You can remove the `--l 10` option to extract Maven coordinates without a limit.
Extracted Maven coordinates are converted to a JSON-compatible string as shown and described below:
{"groupId": "com.yahoo.vespa", "artifactId": "zookeeper-server-common", "version": "7.171.10", "date": "1580860140", "url": "https://repo1.maven.org/maven2/com/yahoo/vespa/zookeeper-server-common/7.171.10/zookeeper-server-common-7.171.10.pom"}
- `groupId`: The specified groupID in a POM file.
- `artifactId`: The specified artifactID in a POM file.
- `version`: The version of a Maven package as specified in its POM file.
- `date`: The release date of a Maven package in Unix epoch format.
- `url`: The URL of a POM file on the Maven server.
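
The fields described above can be read back with a few lines of Python. This is a minimal sketch, not part of mvncrawler: the helper names `parse_coordinate` and `pom_url` are hypothetical, and it assumes that `date` holds epoch seconds and that the POM URL follows the standard Maven repository layout seen in the sample record.

```python
import json
from datetime import datetime, timezone

# Sample record copied from the example output above.
record_line = (
    '{"groupId": "com.yahoo.vespa", "artifactId": "zookeeper-server-common", '
    '"version": "7.171.10", "date": "1580860140", '
    '"url": "https://repo1.maven.org/maven2/com/yahoo/vespa/'
    'zookeeper-server-common/7.171.10/zookeeper-server-common-7.171.10.pom"}'
)


def parse_coordinate(line):
    """Parse one JSON record and attach a timezone-aware release datetime."""
    record = json.loads(line)
    # Assumption: "date" is a string of Unix epoch seconds.
    record["released"] = datetime.fromtimestamp(int(record["date"]), tz=timezone.utc)
    return record


def pom_url(record, base="https://repo1.maven.org/maven2"):
    """Rebuild the POM URL from the coordinates (standard Maven layout assumed)."""
    group_path = record["groupId"].replace(".", "/")
    return "{0}/{1}/{2}/{3}/{2}-{3}.pom".format(
        base, group_path, record["artifactId"], record["version"]
    )


record = parse_coordinate(record_line)
print(record["released"].isoformat())
print(pom_url(record) == record["url"])
```

When the `--no-kafka` option is used, the same helper could be applied to each line of the output file, assuming the file holds one JSON record per line.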
We are NOT responsible for any damage or the misuse of this tool.