[Tagdb] question about solr vs nutch
ogjunk-tagdb at yahoo.com
Wed Nov 15 05:28:50 GMT 2006
That's a bit of an apples-and-oranges comparison. Ian already pointed out the most obvious/basic/biggest difference. They are meant to solve different problems. Moreover, if you play with Nutch, you will see that it's a rather complex and ambitious piece of software. Solr is a lot smaller (code-wise) and simpler. Again, it's hard to compare them, because they are really two pretty different things, even though they both do text indexing and searching.
Otis (Lucene/Solr/Nutch developer)
----- Original Message ----
From: Nitin Borwankar <nitin at borwankar.com>
To: tagdb at lists.tagschema.com
Sent: Tuesday, November 14, 2006 6:10:13 PM
Subject: [Tagdb] question about solr vs nutch
Hi all,
As there are some experts in text indexing on the list, I thought this
might be the best place to ask ....
I see that Solr ( http://incubator.apache.org/solr/ ) is an enterprise
search engine based on Lucene with a web-service API for submitting docs
to be indexed.
Also that Nutch ( www.nutch.org ) is another search engine based on
Lucene which directly stores docs to disk before indexing.
What is the performance hit of submitting docs by web service in
comparison to the Nutch approach, if this is a comparison that makes
sense at all?
My interest is in the fielded search capabilities of solr, applied to
either LAN based docs or docs crawled from the web, but I am concerned
about the performance hit of web-service submission + XML overhead
compared to direct disk writes.
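
To make the "XML overhead" part of the question concrete, here is a minimal sketch that wraps one document in a Solr-style <add><doc> update message and counts how many bytes the markup adds on top of the raw field values. The field names (id, title, body) and the helper names are hypothetical, chosen only for illustration, and the sketch measures message size only; it says nothing about HTTP latency or indexing time.

```python
import xml.etree.ElementTree as ET


def build_solr_add(doc_fields):
    """Serialize one document as a Solr-style <add><doc> XML update message.

    doc_fields maps hypothetical field names to string values.
    Returns the message as UTF-8 bytes, without an XML declaration.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in doc_fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    # encoding="unicode" yields a str; encode it ourselves for a byte count.
    return ET.tostring(add, encoding="unicode").encode("utf-8")


def xml_overhead_bytes(doc_fields):
    """Bytes in the XML message beyond the raw UTF-8 field values."""
    payload = sum(len(value.encode("utf-8")) for value in doc_fields.values())
    return len(build_solr_add(doc_fields)) - payload


# Illustrative document; the field names are assumptions, not a real schema.
example = {
    "id": "doc-1",
    "title": "Tagging notes",
    "body": "Some text to index.",
}
message = build_solr_add(example)
overhead = xml_overhead_bytes(example)
```

For documents with large text fields the markup is a small fraction of the message, so any cost of web-service submission is more likely to come from the HTTP round trip than from the XML itself; that is an expectation to be measured, not a benchmark result.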
Any enlightening thoughts?
Nitin Borwankar
_______________________________________________
Tagdb mailing list
Tagdb at lists.tagschema.com
http://lists.tagschema.com/mailman/listinfo/tagdb