[Tagdb] question about solr vs nutch
Nitin Borwankar
nitin at borwankar.com
Tue Nov 14 23:10:13 GMT 2006
Hi all,
As there are some text-indexing experts on the list, I thought this
might be the best place to ask.
I see that Solr ( http://incubator.apache.org/solr/ ) is an enterprise
search engine based on Lucene, with a web-service API for submitting docs
to be indexed.
I also see that Nutch ( www.nutch.org ) is another search engine based on
Lucene, which stores docs directly to disk before indexing.
What is the performance hit of submitting docs by web service compared
with the Nutch approach, if this is a comparison that makes sense
at all?
My interest is in the fielded search capabilities of Solr, applied to
either LAN-based docs or docs crawled from the web, but I am concerned
about the performance hit of web-service submission plus XML overhead
compared to direct disk writes.
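
To make the XML part of that concern concrete, here is a rough sketch
that wraps a document in a Solr-style <add><doc><field name="..."> update
message and compares its size with the raw field text. The field names
and the sample document are made up for illustration, and the sketch only
counts bytes; it says nothing about HTTP round-trip or XML parsing cost,
which would need to be measured against a real server.

```python
import xml.etree.ElementTree as ET


def build_solr_add(fields):
    """Wrap field values in a Solr-style <add><doc>...</doc></add> message."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value
    # Serialize to bytes, as the message would be sent in an HTTP POST body.
    return ET.tostring(add, encoding="utf-8")


# Hypothetical document: field names and content are illustrative only.
fields = {
    "id": "doc-1",
    "title": "Example document",
    "body": "some text to be indexed " * 200,
}

payload = build_solr_add(fields)
raw_bytes = sum(len(v.encode("utf-8")) for v in fields.values())
overhead = len(payload) - raw_bytes

print(f"raw text: {raw_bytes} bytes")
print(f"XML message: {len(payload)} bytes")
print(f"markup overhead: {overhead} bytes")
```

For a document with a few large fields the markup is a small, fixed cost
per field, so the byte overhead shrinks in relative terms as the document
grows.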
Any enlightening thoughts?
Nitin Borwankar