Nutch 0.8, Map & Reduce, here I come!
Posted by Kelvin on 09 Aug 2006 at 03:39 pm | Tagged as: Lucene / Solr / Elasticsearch / Nutch, crawling, programming
Finally taking the plunge into Nutch 0.8 after working exclusively with 0.7 for over a year (and something like 5 projects).
From initial experience, it appears that using Map & Reduce (M&R) does obscure the code somewhat for a developer who wants to build an app on top of the Nutch infrastructure instead of using it out of the box. For example, deciphering what's going on in org.apache.nutch.indexer.Indexer is pretty difficult compared to its 0.7 counterpart (IndexSegment).
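To make the point concrete, here's a minimal sketch in plain Python. It is not Nutch or Hadoop code, and every name in it (pages, inlinks, index_sequential, index_mapreduce and friends) is made up for illustration. It builds the same tiny "index" twice: once as a single loop where everything is visible in one place, and once split into map, shuffle and reduce steps, where you have to follow tagged values across three functions to see how a document gets assembled.

```python
from collections import defaultdict

# Hypothetical inputs: two separate data sets keyed by URL.
pages = {
    "http://a.example/": "hello world",
    "http://b.example/": "nutch rocks",
}
inlinks = {
    "http://a.example/": ["home"],
    "http://b.example/": ["search", "crawler"],
}


def index_sequential(pages, inlinks):
    """Single-pass style: the whole document is assembled in one place."""
    return {
        url: {"content": text, "anchors": inlinks.get(url, [])}
        for url, text in pages.items()
    }


def map_phase(pages, inlinks):
    """Emit (url, tagged value) pairs; each record is handled in isolation."""
    for url, text in pages.items():
        yield url, ("content", text)
    for url, anchors in inlinks.items():
        yield url, ("anchors", anchors)


def shuffle(pairs):
    """Group values by key, standing in for what the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(values):
    """Rebuild one document from the tagged values collected for a URL."""
    doc = {"content": None, "anchors": []}
    for tag, value in values:
        doc[tag] = value
    # Skip URLs that only had inlinks and no page content.
    return doc if doc["content"] is not None else None


def index_mapreduce(pages, inlinks):
    """Same result as index_sequential, but the logic is spread over three steps."""
    grouped = shuffle(map_phase(pages, inlinks))
    docs = {url: reduce_phase(values) for url, values in grouped.items()}
    return {url: doc for url, doc in docs.items() if doc is not None}
```

Both functions produce the same dictionary for the sample data. The M&R version scales out more naturally, but the "join" between content and anchors is now implicit in the tags and the grouping step, which is roughly the kind of indirection that makes this style harder to read.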
Some serious documentation is needed on the implementation details of M&R. I'll keep posting about what I learn.
2 Responses to “Nutch 0.8, Map & Reduce, here I come!”
Hi
Nutch uses Lucene technology, but HAloop is helpful for upgrading our search. We can see a lot of APIs in the new Nutch 0.8.
Best Regards
Manisekaran
http://www.eworldtechnologies.com
yeah, it would be nice 🙂