Support Links in Jupyter Notebooks?

The Jupyter notebook file format supports Markdown cells, which can contain links. Currently, we extract links using our plaintext extractor, which can lead to false positives. For an example, see this discussion: https://github.com/lycheeverse/lychee/discussions/1658.

There is a crate, [`nbformat`](https://github.com/runtimed/runtimed/tree/main/crates/nbformat), which would allow us to extract the Markdown cells from a Jupyter notebook file (`.ipynb`). This way, we could use a proper Markdown parser for link extraction.

If anyone wants to contribute, take a look at the [markdown extractor](https://github.com/lycheeverse/lychee/blob/master/lychee-lib/src/extract/markdown.rs). The new "notebook extractor" would look quite similar. It would use `nbformat` to read the notebook's cells, pick out the Markdown ones, and then call the Markdown extractor on each of them to find all links.
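To make the idea concrete, here is a rough sketch of the flow in Python, using only the standard library. It is not the proposed Rust implementation and does not use `nbformat`; it only relies on the fact that an `.ipynb` file is JSON with a `cells` list, where each cell has a `cell_type` and a `source` (a string or a list of strings). The function names `markdown_cells` and `extract_links` are made up for illustration, and the regexes are a crude stand-in for a proper Markdown parser.

```python
import json
import re

# Crude stand-ins for a real Markdown parser (assumption for this sketch):
# inline links of the form [text](url), plus bare http(s) URLs.
INLINE_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
BARE_URL = re.compile(r"https?://[^\s)>\]]+")


def markdown_cells(notebook_json):
    """Return the text of every Markdown cell in a notebook JSON string."""
    notebook = json.loads(notebook_json)
    texts = []
    for cell in notebook.get("cells", []):
        # Skip code and raw cells; only Markdown cells are link sources here.
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # The notebook format allows `source` to be a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        texts.append(source)
    return texts


def extract_links(notebook_json):
    """Collect link targets from Markdown cells, de-duplicated, in order."""
    links = []
    for text in markdown_cells(notebook_json):
        for url in INLINE_LINK.findall(text) + BARE_URL.findall(text):
            if url not in links:
                links.append(url)
    return links
```

The point of the sketch is the filtering step: strings inside code cells that merely look like URLs are never handed to the link extractor, which is where the plaintext approach picks up false positives.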

Help wanted! Comment here if you want to give it a shot or send in a pull request. ✌ 
