[Tagdb] Tags and data storage

Joshua Lippiner jlippiner at yahoo.com
Wed Mar 22 22:16:06 GMT 2006


So the big question now is...

For someone starting out with a tagging app and unsure of how it will take
off, does it make sense to start with Lucene if they have no working
knowledge of how to use it, or does it make more sense to start with an RDBMS
and then move to Lucene as it grows?
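
For reference, the "start with an RDBMS" option could look roughly like the sketch below. It uses Python's built-in sqlite3 module and a normalized three-table layout (the "3nf solution" mentioned further down the thread). Every table, column, and function name here is an assumption made for illustration and is not taken from anyone's actual schema.

```python
import sqlite3

# Hypothetical normalized layout; all names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, url  TEXT UNIQUE NOT NULL);
    CREATE TABLE tags  (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE taggings (
        user_id INTEGER NOT NULL,
        item_id INTEGER NOT NULL REFERENCES items(id),
        tag_id  INTEGER NOT NULL REFERENCES tags(id),
        PRIMARY KEY (user_id, item_id, tag_id)
    );
""")


def tag_item(user_id, url, tag_name):
    """Record that user_id tagged the item at url with tag_name."""
    conn.execute("INSERT OR IGNORE INTO items (url) VALUES (?)", (url,))
    conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag_name,))
    conn.execute(
        """INSERT OR IGNORE INTO taggings (user_id, item_id, tag_id)
           SELECT ?, i.id, t.id FROM items i, tags t
           WHERE i.url = ? AND t.name = ?""",
        (user_id, url, tag_name),
    )


def tags_for_user(user_id):
    """All distinct tag names used by one user, sorted by name."""
    rows = conn.execute(
        """SELECT DISTINCT t.name FROM taggings g
           JOIN tags t ON t.id = g.tag_id
           WHERE g.user_id = ? ORDER BY t.name""",
        (user_id,),
    )
    return [r[0] for r in rows]


def users_for_item_tag(url, tag_name):
    """User ids of everyone who tagged the given item with the given tag."""
    rows = conn.execute(
        """SELECT g.user_id FROM taggings g
           JOIN items i ON i.id = g.item_id
           JOIN tags  t ON t.id = g.tag_id
           WHERE i.url = ? AND t.name = ? ORDER BY g.user_id""",
        (url, tag_name),
    )
    return [r[0] for r in rows]


def delete_orphan_tags():
    """Delete tags that no tagging references any more; returns how many."""
    cur = conn.execute(
        "DELETE FROM tags WHERE id NOT IN (SELECT tag_id FROM taggings)"
    )
    return cur.rowcount
```

Calling tag_item(1, "http://example.com/a", "lucene") records one tagging, and the two query helpers cover the lookups asked about in the quoted message below. The orphan cleanup is the extra housekeeping step a normalized layout needs.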

 

-----Original Message-----
From: tagdb-bounces at lists.tagschema.com
[mailto:tagdb-bounces at lists.tagschema.com] On Behalf Of
ogjunk-tagdb at yahoo.com
Sent: Wednesday, March 22, 2006 2:02 PM
To: tagdb at lists.tagschema.com
Subject: Re: [Tagdb] Tags and data storage

Nitin,

Inline answers...

----- Original Message ----

For the searching part I understand but is it possible to do things like the
following with Lucene ?

* show me all the tags used by userid x
* show me the userids for people who have tagged item y with tag z

OG: +yes +yes  ... or.... yes AND yes
OG: stupid joke.

And how do I update the data as the tagging of items by users continues ?

OG: Lucene is a Java library with an API, so you use that API to update the
Lucene index.
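
To make the idea concrete without assuming anything about Lucene's actual API, here is a toy inverted index in Python. It treats each tagging event as one flat document with user, item, and tag fields, answers both questions above as boolean AND queries over postings, and takes updates simply by adding more documents. The TagIndex class and its methods are hypothetical stand-ins, not Lucene calls.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index mapping (field, value) terms to sets of doc ids.

    This is NOT Lucene's API, only a hypothetical illustration of the idea.
    """

    def __init__(self):
        self.docs = []                    # doc id -> stored fields
        self.postings = defaultdict(set)  # (field, value) -> doc ids

    def add(self, user, item, tag):
        """Index one tagging event as a flat document; returns its doc id."""
        doc_id = len(self.docs)
        doc = {"user": user, "item": item, "tag": tag}
        self.docs.append(doc)
        for field, value in doc.items():
            self.postings[(field, value)].add(doc_id)
        return doc_id

    def search(self, **required):
        """Return documents matching ALL field=value terms (boolean AND)."""
        if not required:
            return []
        sets = [self.postings.get(term, set()) for term in required.items()]
        hits = set.intersection(*sets)
        return [self.docs[i] for i in sorted(hits)]


index = TagIndex()
index.add(user="x", item="y", tag="z")
index.add(user="x", item="y2", tag="lucene")
index.add(user="w", item="y", tag="z")

# "show me all the tags used by userid x"
tags_of_x = sorted({d["tag"] for d in index.search(user="x")})

# "show me the userids for people who have tagged item y with tag z"
taggers_of_y_z = sorted({d["user"] for d in index.search(item="y", tag="z")})
```

Because every document is self-contained, there are no joins: each query is an intersection of posting sets, which is the "+yes +yes" form of query joked about above.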

Or do I have to maintain two copies of the data - one in the db and one in
structured text files indexed by Lucene ?

OG: How/where you store the data is up to you.  You could store everything
in Lucene, but keep in mind that Lucene has no relations the way an RDBMS
does.

Or am I essentially living off a database based on text files ?

OG: text files, no.  Inverted indices, yes.

My knowledge of Lucene is limited to text search so pardon the "stupid" 
questions.

OG: I hear the dudes wrote a book about Lucene and provided free code. ;)

Otis


ogjunk-tagdb at yahoo.com wrote:

>Again, what Philipp said.  Except for the MySQL full-text search piece.
>Don't go there, unless you LOVE large database files.
>
>Otis
>
>----- Original Message ----
>From: Philipp Keller <phred at citrin.ch>
>To: Joshua Lippiner <jlippiner at yahoo.com>
>Cc: tagdb at lists.tagschema.com
>Sent: Wednesday, March 22, 2006 12:08:09 PM
>Subject: Re: [Tagdb] Tags and data storage
>
>
>  
>
>>Does anyone have any recommendations/thoughts on tag storage?  Are you 
>>better off storing the entire list of tags associated with one item 
>>in a single field and then dealing with searching issues later, or, 
>>in the end, do the benefits of storing each tag as a new DB row 
>>outweigh the size issue?
>>    
>>
>I once wrote an article that shows different solutions to the problem 
>[1], and I also did performance tests [2].
>
>As Nitin noted: you'll probably run into problems with the denormalized 
>version, and I suppose you won't save much space by going for it 
>either.
>
>If you just have a small user base and database, then I think you 
>shouldn't do the 3NF solution, because it's hard to deal with the tag 
>orphans. The MySQL full-text variant looks good in my eyes.
>
>About the scalability issue Nitin noted: if you have more than 1 
>million tagged entries, you have to switch from an RDBMS to, say, Lucene 
>anyway, no matter which way you organize tags in your DB.
>
>greets
>Philipp
>
>[1] http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html
>[2] http://www.pui.ch/phred/archives/2005/06/tagsystems-performance-tests.html
>
>
>
>_______________________________________________
>Tagdb mailing list
>Tagdb at lists.tagschema.com
>http://lists.tagschema.com/mailman/listinfo/tagdb



