flairNLP / fundus (Star 424). A very simple news crawler with a funny name. Topics: python, nlp, rss, sitemap, crawler, scraper, corpus, text-extraction, web-scraping, image-classification, datasets, news-crawler, corpus-tools, commoncrawl, web-corpus, news-scraping, cc-news, image-extraction. Updated Dec 16, 2025. Language: Python.
liaoziyang / OpenIE-Spider (Star 174). Extract information from a web corpus using Open Information Extraction. Topics: fragments, extract-information, sentence, web-corpus. Updated Jul 21, 2017. Language: Python.