[Tagdb] Tags and data storage

Nitin Borwankar nitin at borwankar.com
Wed Mar 22 21:10:26 GMT 2006


OK, so I'll bite - I've heard a few mentions of using Lucene in
tag-based apps, and am curious.

I understand the searching part, but is it possible to do things like
the following with Lucene?

* show me all the tags used by userid x
* show me the userids for people who have tagged item y with tag z
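
To make the two lookups above concrete, here is a minimal sketch in plain
Python. It is not Lucene's API: it assumes each tagging event is indexed as
one small "document" with user, item and tag fields, and the class and
method names (TaggingIndex, tags_used_by, users_who_tagged) are
hypothetical. The in-memory inverted index only stands in for what a term
index would hold.

```python
from collections import defaultdict


class TaggingIndex:
    """Toy inverted index: one 'document' per (user, item, tag) event.

    Hypothetical stand-in for a term index; not Lucene's API.
    """

    FIELDS = ("user", "item", "tag")

    def __init__(self):
        self.docs = {}                    # doc id -> (user, item, tag)
        self.postings = defaultdict(set)  # (field, value) -> set of doc ids
        self.next_id = 0

    def add(self, user, item, tag):
        """Index one tagging event and return its doc id."""
        doc_id = self.next_id
        self.next_id += 1
        self.docs[doc_id] = (user, item, tag)
        for field, value in zip(self.FIELDS, (user, item, tag)):
            self.postings[(field, value)].add(doc_id)
        return doc_id

    def tags_used_by(self, user):
        """All tags used by the given user id."""
        hits = self.postings.get(("user", user), set())
        return {self.docs[d][2] for d in hits}

    def users_who_tagged(self, item, tag):
        """User ids of people who tagged `item` with `tag`."""
        hits = (self.postings.get(("item", item), set())
                & self.postings.get(("tag", tag), set()))
        return {self.docs[d][0] for d in hits}


idx = TaggingIndex()
idx.add("alice", "page1", "lucene")
idx.add("alice", "page2", "search")
idx.add("bob", "page1", "lucene")

print(sorted(idx.tags_used_by("alice")))                # ['lucene', 'search']
print(sorted(idx.users_who_tagged("page1", "lucene")))  # ['alice', 'bob']
```

Each new tagging event is just another add() call in this sketch; it says
nothing about how a real Lucene index should be kept in sync with the
database.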


And how do I update the data as users continue to tag items?
Or do I have to maintain two copies of the data - one in the db and one
in structured text files indexed by Lucene?
Or am I essentially living off a database based on text files?

My knowledge of Lucene is limited to text search so pardon the "stupid" 
questions.

Nitin Borwankar



ogjunk-tagdb at yahoo.com wrote:

>Again, what Philipp said.  Except for the MySQL full-text search piece.  Don't go there, unless you LOVE large database files.
>
>Otis
>
>----- Original Message ----
>From: Philipp Keller <phred at citrin.ch>
>To: Joshua Lippiner <jlippiner at yahoo.com>
>Cc: tagdb at lists.tagschema.com
>Sent: Wednesday, March 22, 2006 12:08:09 PM
>Subject: Re: [Tagdb] Tags and data storage
>
>>Does anyone have any recommendations/thoughts on tag storage?  Are you
>>better off storing an entire list of tags associated with one item
>>in a single field and then dealing with searching issues later or,
>>in the end, do the benefits of storing each tag as a new DB row
>>outweigh the size issue?
>
>I once wrote an article that shows different solutions to the problem
>[1], and I also did performance tests [2].
>
>As Nitin noticed, you'll probably run into problems with the
>denormalized version, and I suppose you won't save much space with it
>either.
>
>If you just have a small user base and database then I think you
>shouldn't do the 3NF solution because it's hard to deal with the tag
>orphans. The MySQL fulltext variant looks good in my eyes.
>
>About the scalability issue Nitin noticed: if you have more than 1
>million tagged entries you have to switch from an RDBMS to, say, Lucene
>anyway, no matter which way you organize tags in your DB.
>
>greets
>Philipp
>
>[1] http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html
>[2] http://www.pui.ch/phred/archives/2005/06/tagsystems-performance-tests.html
>
>
>
>_______________________________________________
>Tagdb mailing list
>Tagdb at lists.tagschema.com
>http://lists.tagschema.com/mailman/listinfo/tagdb