[Tagdb] vertical search, strong typing and multi-field queries

Erik Hatcher esh6h at virginia.edu
Fri Feb 3 10:35:41 GMT 2006


This is an interesting discussion.  Here's a real-world example to
consider from one of my current projects, the search system for the  
Rossetti Archive - the archive of the works of Dante Gabriel Rossetti:

	http://www.rossettiarchive.org/rose/

The scholars that I work with originally desired a multi-fielded
situation, as you see below the free-form search.  My desire has been
to make the fielded searches unnecessary and inferior to the free-form
search box (notice there is still a desire to have a field for case
sensitivity (why that is, still baffles me :) and an object type
constraint).

For the (Lucene, of course!) indexes built, I've done extensive  
merging of all fielded data into an aggregate field to allow the free- 
form search to work by just typing in what you're searching for  
without the need for using any of the special search operators (which  
can be used if desired).  By tuning document boosting, score factors,  
and adding search result highlighting to facilitate post-search human  
filtering I've been able to satisfy some very discerning scholars.   
New scenarios will certainly arise that require re-tuning (or re- 
educating the users perhaps), but I think it is a laudable goal to  
aim for free-form searching such that users can just type what  
they're thinking of and find it, even in highly specialized vertical  
search applications such as this one.  Keep in mind that my domain is  
intensely about words and their colorful interactions with the rest  
of the world, including the users themselves, so I am hypersensitive  
to this issue for this particular application.
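
To make the aggregate-field idea concrete, here is a minimal sketch in
plain Python.  It is not the Rossetti Archive code and it does not use
the Lucene API; the field names, the boost values, the sample documents
and the helper names (build_index, search, highlight) are all
assumptions made up for illustration.  It only shows the shape of the
technique: merge every field into one weighted bag of terms, score a
free-form query against it, and highlight the hits for post-search
human filtering.

```python
import re
from collections import Counter

# Hypothetical per-field weights; real tuning is application-specific.
FIELD_BOOSTS = {"title": 3.0, "author": 2.0, "body": 1.0}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Merge all fields of each document into one weighted term bag."""
    index = {}
    for doc_id, fields in docs.items():
        weights = Counter()
        for name, value in fields.items():
            boost = FIELD_BOOSTS.get(name, 1.0)
            for term in tokenize(value):
                weights[term] += boost  # a title hit counts more than a body hit
        index[doc_id] = weights
    return index


def search(index, query):
    """Return ids of matching documents, best score first (ties by id)."""
    terms = tokenize(query)
    scored = []
    for doc_id, weights in index.items():
        score = sum(weights[t] for t in terms)  # missing terms contribute 0
        if score > 0:
            scored.append((score, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def highlight(text, query):
    """Wrap each query term found in text with asterisks."""
    terms = set(tokenize(query))

    def mark(match):
        word = match.group(0)
        return f"*{word}*" if word.lower() in terms else word

    return re.sub(r"\w+", mark, text)


# Made-up sample records, only to exercise the functions above.
docs = {
    "poem-1": {
        "title": "The Blessed Damozel",
        "author": "Dante Gabriel Rossetti",
        "body": "the blessed damozel leaned out",
    },
    "letter-7": {
        "title": "Letter to a friend",
        "author": "Dante Gabriel Rossetti",
        "body": "I have been revising the damozel again",
    },
}
index = build_index(docs)
```

With these sample records, search(index, "damozel") ranks poem-1 ahead
of letter-7 because the term also appears in the boosted title field,
and the user never has to say which field they meant.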

In the several applications where I've integrated search, each search  
interface has been unique.  I've become more and more keen on  
"interface driven design" where the desired interface dictates the  
implementation details of the underlying software, including the  
index structure itself.

	Erik


On Feb 2, 2006, at 5:22 PM, Nitin Borwankar wrote:

> Otis,
>
> You bring up an important point - that of a specialized search  
> syntax labeled "advanced search".
>
> I would posit that filling in multiple fields is far easier/less  
> effort than using the "advanced" syntax which varies from site to  
> site and the effort of learning that is clearly not justified  
> unless you are planning to do many such searches. I fully  
> understand why users don't want to use such an "advanced" search  
> and would find even 2% a large number.
>
> Providing a special syntax that fits into one field ... people seem
> to believe that somehow that's a useful thing.  One field on a
> horizontal search must of necessity be typeless. Overloading that
> with syntax to allow typed search .... why not provide a multi-field
> form as well ....
>
> You mention 1 field != unstructured.  I have no position on what is
> structured or not -- I am talking very specifically about typed
> data underlying the search and don't want to muddy the discussion
> by replacing "typed" with "structured".
>
> You mention general web search engines vs. services people really  
> need - these are not "either or".
> I really need Google and I use but don't really need a vertical  
> podcast search on Odeo.
>
> This is not about minimizing the importance of Google - Google is a  
> very valuable search metaphor but *not the only one*.
> The needs of vertical search are necessarily different and trying
> to provide a one-size-fits-all metaphor is futile, IMO.
>
> About multiple field queries - let's look at the history of that  
> --- relational databases have been around since the 80's.
> One of the first "applications" on top of RDBMS's was QBF or Query- 
> By-Forms which is a search with a multi-field query form as input.
> These applications predate the Web, the Internet, Linux, the PC,  
> windowing systems and even client-server architectures.
>
> These applications ran on mainframes with character terminals.
> Back-office personnel (telephone support, reservations agents, etc.)
> in all service industries have been using these for the last 20
> years or more.
> These users are not "technical" users, they are the people who  
> answer the phones when you call your utility company to discuss  
> your bill.
>
> Now consider that to use Google one has to be minimally computer  
> literate and has probably filled out more than one computer form to  
> sign up for an Internet account, for an email account ...  One is  
> able to fill out the To: and Subject: lines on email on a regular  
> basis. Considering all this would you still suggest that multiple  
> fields are "too hard" for the user ?
>
> Nitin
>
>
>
> ogjunk-tagdb at yahoo.com wrote:
>
>> Copied/pasted that email again...
>>
>> General web search engine and a service that people _really_ need  
>> are different.  People may be willing to spend more time entering  
>> information in multiple fields there in order to get high quality  
>> results.  Because the former gives SO many results, it's hard to
>> judge whether the top hits are really the best matches, and you
>> get enough hits that entering a "less structured" query is ok.
>>
>> Also keep in mind that 1 field != unstructured.
>> See Simpy for example: http://www.simpy.com/simpy/FAQ.do#searchSyntax
>> http://www.simpy.com/simpy/FAQ.do#searchFieldsLinks
>>
>> One input field, but many search fields, and many search operators  
>> == lots of search power.
>>
>> Or, create what most other services create - simple + advanced  
>> search, and monitor their use.  I think I recall reading something  
>> on Tim Bray's blog about only 2% of users using advanced  
>> search.... but that was several years ago.
>>
>> Otis
>>
>> ----- Original Message ----
>> From: Nitin Borwankar <nitin at borwankar.com>
>> To: tagdb at lists.tagschema.com
>> Sent: Thu 02 Feb 2006 01:50:52 PM EST
>> Subject: [Tagdb] vertical search, strong typing and multi-field
>> queries
>>
>> Slightly unrelated to tags/folksonomy but only slightly.
>>
>> In my current consulting a recurring issue is that of search.
>> (A tag bundle can be used as a filter to reduce search scope and  
>> then we can do raw text or other searches.  But that is a separate  
>> topic .... )
>>
>> I'd like to ask the list about the current dogma about providing a  
>> single search field with the belief that non-technical users find  
>> multiple fields too complicated and they will "leave" -  this  
>> belief is predominant in the major search engine companies.  There  
>> is also a UI dogma that simplicity is better and this dogma often  
>> dominates common sense approaches driven by the structure of  
>> underlying data.
>>
>> See www.simplyhired.com for an example of a site that has moved  
>> away ( slightly ) from the single field dogma.
>> I find the single query field appropriate for horizontal search  
>> engines like Google etc. where the query could be about anything.
>> But what if I am at a site that has, say, book information - does it
>> make sense to provide a single field, or does a set of fields
>> (title/author/publisher/....) make more sense?  When we have domain
>> specialization for search, i.e. vertical search, is it automatic
>> that we also have strong typing of underlying data and hence
>> multi-field search? Does a single field ( typeless query ) make
>> sense for a vertical search engine?
>>
>>
>> Nitin Borwankar.
>>
>>
>>
>>
>> _______________________________________________
>> Tagdb mailing list
>> Tagdb at lists.tagschema.com
>> http://lists.tagschema.com/mailman/listinfo/tagdb
>>
>


