A fork of https://github.com/dfm/feedfinder2 that adds functionality to validate the activity of feeds.
This is a Python library for finding links to feeds on a website.
It is based on feedfinder - originally written by Mark Pilgrim and subsequently maintained by Aaron Swartz until his untimely death.
The library offers a single public function: find_feeds. You would use it
as follows:
from feedfinder3 import find_feeds

feeds = find_feeds(
    "xkcd.com",
    validate_options={
        "min_article_count": 1,  # feed should have at least 1 article
        "max_day_interval": 30,  # feed should have been updated in the last 30 days
        "exclude_keywords": ["comments", "jobs"],  # exclude feed URLs containing "comments" or "jobs"
    },
)
Now, feeds is the list: ['http://xkcd.com/atom.xml',
'http://xkcd.com/rss.xml']. There is some attempt made to rank feeds from
best candidate to worst but... well... you never know.
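To make the three validate_options keys more concrete, here is a minimal sketch of the kind of check they describe. It is not the library's actual implementation: the function name passes_validation, its arguments, and the exact matching rules are assumptions inferred from the comments in the usage example above, and it uses only the standard library.

```python
from datetime import datetime, timedelta, timezone


def passes_validation(feed_url, entry_dates, options, now=None):
    """Hypothetical helper: apply the three validate_options to one feed.

    feed_url    -- the feed URL as a string
    entry_dates -- timezone-aware datetimes, one per article in the feed
    options     -- dict with the same keys as validate_options
    """
    now = now or datetime.now(timezone.utc)

    # exclude_keywords: reject URLs that contain any listed keyword
    # (assumed here to be a case-insensitive substring match).
    url = feed_url.lower()
    if any(kw.lower() in url for kw in options.get("exclude_keywords", [])):
        return False

    # min_article_count: require at least this many articles.
    if len(entry_dates) < options.get("min_article_count", 0):
        return False

    # max_day_interval: the newest article must be recent enough.
    max_days = options.get("max_day_interval")
    if max_days is not None:
        if not entry_dates:
            return False
        if now - max(entry_dates) > timedelta(days=max_days):
            return False

    return True


options = {
    "min_article_count": 1,
    "max_day_interval": 30,
    "exclude_keywords": ["comments", "jobs"],
}
fixed_now = datetime(2024, 1, 31, tzinfo=timezone.utc)
recent = [datetime(2024, 1, 20, tzinfo=timezone.utc)]
print(passes_validation("http://xkcd.com/atom.xml", recent, options, now=fixed_now))
```

Passing now explicitly keeps the check deterministic, which is convenient when testing against a fixed set of article dates.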
Feedfinder2 is licensed under the MIT license (see LICENSE).