[Tagdb] tags and keywords - navigating folksonomies and text search
Nitin Borwankar
nitin at borwankar.com
Mon Feb 5 19:46:19 GMT 2007
Enis Soztutar wrote:
>>Currently how do people handle this in terms of implementation - do you
>>do a tag search first and then run a keyword search over those docs? The
>>other way around? What are the pros and cons?
>>
>>If you are implementing tags in a relational database, how does your tag
>>search interact with your keyword search?
>>Practitioners, here's a chance to share your hard-won wisdom and win
>>community glory in return ;-)
>>
>>
>>
>Hi Nitin,
>
>For the implementation part, it is obvious that a db (for tagging) plus
>a separate keyword search (inverted index) will not work efficiently.
>For example, you have to query the db for every url that you display as
>a result of a keyword search, to find the most used tags of that url.
>
But what about the other scenario, where given a tag t you get the ids of
all items that have the tag, I(t), from the db, and then do a text search
across these items (docs) only? The items are not necessarily bookmarks,
but could be docs on disk, photos, video, ... and the search is across
the body of text, or across the description field for rich media.
In general I do agree that combined tag and keyword search is awkward
when using a db to manage tags.
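
To make the tag-first scenario concrete, here is a minimal sketch in
Python. The table name item_tags, the sample rows and the naive
keyword_search helper are all assumptions made up for illustration; a
real system would hand the candidate ids to a proper text index rather
than scan strings.

```python
import sqlite3


def items_with_tag(conn, tag):
    """Return I(t): the set of item ids that carry the given tag."""
    rows = conn.execute(
        "SELECT item_id FROM item_tags WHERE tag = ?", (tag,)
    )
    return {row[0] for row in rows}


def keyword_search(texts, keyword, candidate_ids):
    """Naive whole-word match, restricted to candidate_ids.

    Stand-in for a real text search over the body or description field.
    """
    kw = keyword.lower()
    return sorted(
        item_id
        for item_id in candidate_ids
        if kw in texts[item_id].lower().split()
    )


# Hypothetical tag table kept in the relational db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item_tags (item_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO item_tags VALUES (?, ?)",
    [(1, "solar"), (2, "solar"), (2, "diy"), (3, "wind")],
)

# Hypothetical text (body or description) stored outside the db.
texts = {
    1: "Rooftop solar panel pricing guide",
    2: "Build a solar oven from cardboard",
    3: "Small wind turbine build log",
}

# Step 1: tag lookup in the db. Step 2: text search over I(t) only.
solar_ids = items_with_tag(conn, "solar")
result = keyword_search(texts, "build", solar_ids)
```

Item 3 also contains "build" but is never looked at, because it is not
in I(t) for the tag "solar". The awkward part is that the id set has to
be carried from one system to the other on every query.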
>A reasonable solution will be to use the inverted index instead of the
>db for tagging. Simpy (www.simpy.com) uses Lucene for this purpose.
>
>
>
Yes, this is where I was going with my earlier post and you'll have to
promise that I did not pay you to say that ;-).
I have been looking at the Lucene toolkit and the Solr search engine
along with Nutch the web crawler.
(See http://greener.com as one outcome of the Nutch experiment).
And I soon realized that for more flexible and extensible tagschemas we
could be using Lucene. Otis kept murmuring "you may not want to use a
db" a long time ago, but he never told us how he did it ;-). And back
then I didn't have the time to look at Lucene.
So on the tagschema blog I will be exploring ways to do the same things
mentioned earlier for SQL databases, but using text search (specifically
Lucene), with documents and fields, to manage tagging. Keyword search
then comes along for free.
And who doesn't love "free" and "keyword search" all in one sentence ;-)
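
As a rough illustration of the documents-and-fields idea, here is a toy
in-memory inverted index in Python. It is not Lucene and does not use
Lucene's API; the FieldedIndex class, the field names body and tags, and
the sample documents are all hypothetical. The point is only that tags
and text live in one index, so a combined query is a single lookup.

```python
from collections import defaultdict


class FieldedIndex:
    """Toy inverted index: maps (field, term) to a set of doc ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, **fields):
        """Index a document; each keyword argument is one text field."""
        for field, text in fields.items():
            for term in text.lower().split():
                self.postings[(field, term)].add(doc_id)

    def search(self, **criteria):
        """Return sorted ids of docs matching every term in every field."""
        result = None
        for field, text in criteria.items():
            for term in text.lower().split():
                ids = self.postings.get((field, term), set())
                result = set(ids) if result is None else result & ids
        return sorted(result or [])


index = FieldedIndex()
index.add(1, body="Rooftop solar panel pricing guide", tags="solar energy")
index.add(2, body="Build a solar oven from cardboard", tags="solar diy")
index.add(3, body="Small wind turbine build log", tags="wind diy")

tag_only = index.search(tags="solar")
tag_and_keyword = index.search(tags="solar", body="build")
```

Here tag_only is a plain tag search and tag_and_keyword combines a tag
with a keyword in one query, with no round trip to a separate db.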
Nitin
>A search 2.0 engine can be built by running the tagging system as
>usual, and indexing the urls with their common tags. I mean that in the
>indexing phase of web crawling, the tags of urls could be included in
>the main index as a separate field. These tags are then treated
>similarly to anchor texts. The search query is extended to search the
>tags field also. A limitation of this approach is that the index has to
>be updated periodically, and the user information is not used. But if
>we know the querying user, then we can run the query in the main index
>and in the tag index for that user, and then merge the results.
>
>
>
>
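
The merge step at the end of the quoted text could look something like
the Python sketch below. The message does not say how the two result
lists are combined, so the additive scoring, the user_weight parameter
and the example urls and scores are all assumptions for illustration.

```python
def merge_results(main_hits, user_hits, user_weight=2.0):
    """Merge hits from the main index and one user's tag index.

    Both arguments map url -> score. A url found in the user's own tag
    index gets its user score, scaled by user_weight, added on top of
    its main-index score (an assumed scheme, not one from the thread).
    Returns urls ordered by combined score, ties broken by url.
    """
    combined = dict(main_hits)
    for url, score in user_hits.items():
        combined[url] = combined.get(url, 0.0) + user_weight * score
    return sorted(combined, key=lambda url: (-combined[url], url))


# Hypothetical scores from the main (crawl + common tags) index.
main_hits = {
    "a.example/solar": 1.0,
    "b.example/oven": 0.8,
    "c.example/wind": 0.5,
}
# Hypothetical scores from the querying user's own tag index.
user_hits = {
    "b.example/oven": 0.5,
    "d.example/notes": 0.3,
}

merged = merge_results(main_hits, user_hits)
```

With these numbers the url the user tagged personally moves to the top,
and a url that appears only in the user's index still makes the list.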
--
Nitin Borwankar
Find, Learn, Act ....
Greener, the search engine for the planet
http://greener.com
nitin at borwankar.com
510-872-7066