
We've compiled a list of 6 free and paid alternatives to Apache Nutch. The primary competitors are Scrapy and Mixnode. Users also compare Apache Nutch with StormCrawler, ProxyCrawl, and ACHE Crawler.


Scrapy
Free Open Source

Scrapy is an open source and collaborative framework for extracting the data you need from websites.

StormCrawler
Free Open Source

StormCrawler is an open source SDK for building distributed web crawlers with Apache Storm.

Scrape and crawl websites anonymously while bypassing restrictions, blocks, and captchas.

Heritrix
Free Open Source

The Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Apache Nutch Platforms

Windows
Linux
Mac

Apache Nutch Overview

Apache Nutch is a highly extensible and scalable open source web crawler software project.

Nutch is coded entirely in the Java programming language, but data is written in language-independent formats. It has a highly modular architecture, allowing developers to create plug-ins for media-type parsing, data retrieval, querying and clustering.
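
To make the plug-in idea above more concrete, here is a minimal Java sketch of a registry that dispatches content to a parser chosen by media type. It is an illustration of the general pattern only, not Nutch's actual plug-in API: the names `ContentParser`, `ParserRegistry`, and `defaultRegistry` are hypothetical, and the two toy parsers are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class ParserRegistryDemo {

    /** Hypothetical extension point: one implementation per media type. */
    interface ContentParser {
        String parse(String rawContent);
    }

    /** Hypothetical registry mapping a media type to the plug-in that handles it. */
    static final class ParserRegistry {
        private final Map<String, ContentParser> parsers = new HashMap<>();

        void register(String mediaType, ContentParser parser) {
            parsers.put(mediaType, parser);
        }

        Optional<ContentParser> lookup(String mediaType) {
            return Optional.ofNullable(parsers.get(mediaType));
        }
    }

    /** Builds a registry with two toy plug-ins (assumed for illustration). */
    static ParserRegistry defaultRegistry() {
        ParserRegistry registry = new ParserRegistry();
        // Toy HTML plug-in: strips anything that looks like a tag.
        registry.register("text/html", raw -> raw.replaceAll("<[^>]*>", "").trim());
        // Toy plain-text plug-in: only trims surrounding whitespace.
        registry.register("text/plain", String::trim);
        return registry;
    }

    public static void main(String[] args) {
        ParserRegistry registry = defaultRegistry();
        String text = registry.lookup("text/html")
                .map(parser -> parser.parse("<p>Hello, crawler</p>"))
                .orElse("(no parser registered)");
        System.out.println(text); // prints "Hello, crawler"
    }
}
```

In this sketch, supporting a new media type means registering one more `ContentParser` without touching the code that fetches or dispatches content, which is the kind of modularity the description refers to.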

The fetcher ("robot" or "web crawler") has been written from scratch specifically for this project.

Apache Nutch Features

Scalable
Extensible by Plugins/Extensions


Apache Nutch Tags

web-scraper web-crawling web-crawler java-based
