[Tagdb] Tags and data storage
Philipp Keller
phred at citrin.ch
Thu Mar 23 07:59:35 GMT 2006
> For someone starting out with a tagging app and unsure of how it will take
> off, does it make sense to start with Lucene if they have no working
> knowledge of how to use it or does it make more sense to start with RDBMS
> and then move to Lucene as it grows?
Yeah, that's the big question. You never know how fast it grows, do you?
In the case of delicious: I'm almost certain they started with MySQL and
then had to switch to a non-RDBMS system. They had to stop feature
rollout for about one year so, yeah, they should have started with a
system like Lucene.
To Otis and Erik: Could one of you write an article about "how do I build
a tag app using Lucene"? I thought about looking into Lucene and writing
the article myself, but it would be easier if one of you did it, with all
your knowledge. :-)
greets
Philipp
>
>
>
> -----Original Message-----
> From: tagdb-bounces at lists.tagschema.com
> [mailto:tagdb-bounces at lists.tagschema.com] On Behalf Of
> ogjunk-tagdb at yahoo.com
> Sent: Wednesday, March 22, 2006 2:02 PM
> To: tagdb at lists.tagschema.com
> Subject: Re: [Tagdb] Tags and data storage
>
> Nitin,
>
> Inline answers...
>
> ----- Original Message ----
>
> For the searching part I understand but is it possible to do things like the
> following with Lucene ?
>
> * show me all the tags used by userid x
> * show me the userids for people who have tagged item y with tag z
>
> OG: +yes +yes ... or.... yes AND yes
> OG: stupid joke.
>
> And how do I update the data as the tagging of items by users continues ?
>
> OG: Lucene is a java library with an API, so you use that API to update the
> Lucene index.
>
> Or do I have to maintain two copies of the data - one in the db and one in
> structured text files indexed by Lucene ?
>
> OG: How/where you store the data is up to you. You could store everything
> in Lucene, but note that Lucene has no relations the way an RDBMS does.
>
> Or am I essentially living off a database based on text files ?
>
> OG: text files, no. Inverted indices, yes.
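To make the "inverted indices" answer concrete, here is a minimal in-memory sketch in plain Python. It is not the Lucene API. The class `TagIndex` and its methods are hypothetical names chosen for illustration, and it assumes one small document per (user, item, tag) triple. It covers the two queries asked above, and updating is simply adding another document.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory inverted index over (user, item, tag) documents."""

    def __init__(self):
        self.next_id = 0
        self.docs = {}  # doc id -> (user, item, tag)
        self.postings = defaultdict(set)  # (field, value) -> set of doc ids

    def add(self, user, item, tag):
        """Index one tagging event; this is also how the index is updated."""
        doc_id = self.next_id
        self.next_id += 1
        self.docs[doc_id] = (user, item, tag)
        for field, value in (("user", user), ("item", item), ("tag", tag)):
            self.postings[(field, value)].add(doc_id)
        return doc_id

    def search(self, **terms):
        """AND all field=value terms together by intersecting posting sets."""
        sets = [self.postings.get(term, set()) for term in terms.items()]
        return set.intersection(*sets) if sets else set()

    def tags_of_user(self, user):
        """All tags used by the given user."""
        return sorted({self.docs[d][2] for d in self.search(user=user)})

    def users_for(self, item, tag):
        """All users who tagged the given item with the given tag."""
        return sorted({self.docs[d][0] for d in self.search(item=item, tag=tag)})


index = TagIndex()
index.add("alice", "item1", "python")
index.add("alice", "item2", "search")
index.add("bob", "item1", "python")

print(index.tags_of_user("alice"))         # ['python', 'search']
print(index.users_for("item1", "python"))  # ['alice', 'bob']
```

A real Lucene index would persist its postings on disk and handle deletes and merges, which this sketch leaves out.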
>
> My knowledge of Lucene is limited to text search so pardon the "stupid"
> questions.
>
> OG: I hear the dudes wrote a book about Lucene and provided free code. ;)
>
> Otis
>
>
> ogjunk-tagdb at yahoo.com wrote:
>
> >Again, what Philipp said. Except for the MySQL full-text search piece.
> >Don't go there, unless you LOVE large database files.
> >
> >Otis
> >
> >----- Original Message ----
> >From: Philipp Keller <phred at citrin.ch>
> >To: Joshua Lippiner <jlippiner at yahoo.com>
> >Cc: tagdb at lists.tagschema.com
> >Sent: Wednesday, March 22, 2006 12:08:09 PM
> >Subject: Re: [Tagdb] Tags and data storage
> >
> >
> >
> >
> >>Does anyone have any recommendations/thoughts on tag storage? Are you
> >>better off storing the entire list of tags associated with one item
> >>in a single field and dealing with searching issues later, or do the
> >>benefits of storing each tag as a new DB row outweigh the size issue
> >>in the end?
> >>
> >>
> >I once wrote an article, which shows different solutions to the problem
> >[1], and I also did performance tests [2]
> >
> >As Nitin noticed: You'll probably run into problems with the
> >denormalized version, and I suppose you won't save much space with it
> >either.
> >
> >If you just have a small user base and database, then I think you
> >shouldn't use the 3NF solution, because it's hard to deal with the tag
> >orphans. The MySQL fulltext variant looks good in my eyes.
> >
> >About the scalability issue Nitin noticed: If you have more than 1
> >million tagged entries you have to switch from an RDBMS to, say, Lucene
> >anyway, no matter how you organize tags in your DB.
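For readers unfamiliar with the 3NF layout and the "tag orphans" problem mentioned above, here is a rough sketch using Python's built-in sqlite3 module. The table names (`item`, `tag`, `item_tag`) and the helper functions are assumptions made for illustration, not the schema from the linked article. An orphan here is a row in `tag` that no `item_tag` row points to any more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE item_tag (
    item_id INTEGER NOT NULL REFERENCES item(id),
    tag_id INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (item_id, tag_id)
);
""")


def tag_item(conn, item_id, name):
    """Attach a tag to an item, creating the tag row on first use."""
    conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (name,))
    tag_id = conn.execute(
        "SELECT id FROM tag WHERE name = ?", (name,)
    ).fetchone()[0]
    conn.execute("INSERT OR IGNORE INTO item_tag VALUES (?, ?)", (item_id, tag_id))


def untag_item(conn, item_id, name):
    """Remove the link only; the tag row itself stays and may become an orphan."""
    conn.execute(
        "DELETE FROM item_tag WHERE item_id = ? "
        "AND tag_id = (SELECT id FROM tag WHERE name = ?)",
        (item_id, name),
    )


def delete_orphan_tags(conn):
    """Delete tags no longer linked to any item; return how many were removed."""
    cur = conn.execute(
        "DELETE FROM tag WHERE id NOT IN (SELECT tag_id FROM item_tag)"
    )
    return cur.rowcount


conn.execute("INSERT INTO item (id, url) VALUES (1, 'http://example.com/')")
tag_item(conn, 1, "python")
tag_item(conn, 1, "search")
untag_item(conn, 1, "search")  # "search" is now an orphan
print(delete_orphan_tags(conn))  # 1
print([row[0] for row in conn.execute("SELECT name FROM tag")])  # ['python']
```

The cleanup step is the extra bookkeeping that a single-field or fulltext variant avoids, at the cost of harder per-tag queries.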
> >
> >greets
> >Philipp
> >
> >[1] http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html
> >[2] http://www.pui.ch/phred/archives/2005/06/tagsystems-performance-tests.html
> >
> >
> >
> >_______________________________________________
> >Tagdb mailing list
> >Tagdb at lists.tagschema.com
> >http://lists.tagschema.com/mailman/listinfo/tagdb