[Tagdb] question about solr vs nutch

Nitin Borwankar nitin at borwankar.com
Wed Nov 15 05:47:07 GMT 2006


ogjunk-tagdb at yahoo.com wrote:

>That's a bit of an apples-and-oranges comparison.  Ian already pointed out the most obvious and basic difference: they are meant to solve different problems.  Moreover, if you play with Nutch, you will see that it is a rather complex and ambitious piece of software.  Solr is a lot smaller (code-wise) and simpler.  Again, it's hard to compare them, because they are really two quite different things, even though they both do text indexing and searching.
>  
>

OK, I'll ask the question elsewhere, but what I was asking about was the 
overhead of the web-service submission, not the backend or the functional 
differences.
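
To make the question concrete, here is a minimal sketch of how the XML part
of that overhead could be estimated locally. It is not a benchmark of Solr or
Nutch. The <add><doc><field name="..."> shape follows Solr's XML update
message; the sample documents, the field names, and the helper names
build_add_xml and xml_overhead_ratio are made up for illustration. HTTP and
network costs are not measured at all.

```python
import time
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Serialize a list of {field: value} dicts as a Solr-style <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="utf-8")


def xml_overhead_ratio(docs):
    """Bytes of XML payload divided by bytes of the raw field values."""
    raw = sum(len(str(v).encode("utf-8")) for d in docs for v in d.values())
    return len(build_add_xml(docs)) / raw


# Hypothetical sample documents, only to have something to measure.
docs = [
    {"id": i, "title": f"doc {i}", "body": "lorem ipsum " * 50}
    for i in range(100)
]

start = time.perf_counter()
payload = build_add_xml(docs)
elapsed = time.perf_counter() - start

ratio = xml_overhead_ratio(docs)
print(f"payload bytes: {len(payload)}")
print(f"size ratio (XML / raw values): {ratio:.2f}")
print(f"serialization time: {elapsed * 1000:.2f} ms")
```

The size ratio depends heavily on how large the field values are relative to
the markup, so it is worth rerunning with documents that resemble the real
ones before drawing any conclusion.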

Nitin

>Otis (Lucene/Solr/Nutch developer)
>
>----- Original Message ----
>From: Nitin Borwankar <nitin at borwankar.com>
>To: tagdb at lists.tagschema.com
>Sent: Tuesday, November 14, 2006 6:10:13 PM
>Subject: [Tagdb] question about solr vs nutch
>
>Hi all,
>
>As there are some experts in text indexing on the list, I thought this 
>might be the best place to ask.
>I see that Solr ( http://incubator.apache.org/solr/ ) is an enterprise 
>search engine based on Lucene with a web-service API for submitting docs 
>to be indexed.
>I also see that Nutch ( www.nutch.org ) is another search engine based on 
>Lucene, which stores docs directly to disk before indexing.
>What is the performance hit of submitting docs by web service compared 
>to the Nutch approach, if this is a comparison that makes sense at all?
>My interest is in the fielded search capabilities of Solr, applied to 
>either LAN-based docs or docs crawled from the web, but I am concerned 
>about the performance hit of web-service submission + XML overhead 
>compared to direct disk writes.
>
>Any enlightening thoughts?
>
>Nitin Borwankar
>_______________________________________________
>Tagdb mailing list
>Tagdb at lists.tagschema.com
>http://lists.tagschema.com/mailman/listinfo/tagdb
>


