[Tagdb] Tags and data storage
Nitin Borwankar
nitin at borwankar.com
Thu Mar 23 00:29:28 GMT 2006
ogjunk-tagdb at yahoo.com wrote:
>Nitin,
>
>Inline answers...
>
OK, thanks very much. This is very interesting but, as you might
expect, it leads to some more questions :-)
a) How does a Lucene-based app perform at the lower end of the
scale? Is there a fixed overhead, and a threshold above which Lucene
starts to make sense?
b) How do I hook up my web app to a Lucene tag backend when my web app
is not written in Java?
c) Are there commonly used wrappers (JSON, XML-RPC, etc.) around the
backend so I can call it from Python/PHP/Ruby?
Nitin
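
As a rough illustration of questions (b) and (c), here is a minimal sketch
of what calling a tag backend over XML-RPC could look like from Python.
Nothing here is an existing wrapper: the method name tags_for_user, the
stand-in server, and its data are all hypothetical. In the setup being
asked about, the server side would be a Java process in front of the
Lucene index; a Python stub is used here only to keep the example runnable.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Stand-in for the tag backend (assumption). A real deployment would be a
# Java process wrapping a Lucene index; this dict only fakes its answers.
_TAGS_BY_USER = {"x": ["lucene", "search"]}


def tags_for_user(userid):
    """Hypothetical remote method: return the tags used by one userid."""
    return sorted(_TAGS_BY_USER.get(userid, []))


def start_backend():
    """Start the stand-in XML-RPC server on a free local port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(tags_for_user)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_tags(url, userid):
    """What the non-Java web app would do: one XML-RPC call over HTTP."""
    with xmlrpc.client.ServerProxy(url) as proxy:
        return proxy.tags_for_user(userid)
```

The client side only needs an HTTP URL and a method name, so the same call
could be made from PHP or Ruby with their own XML-RPC libraries.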
>----- Original Message ----
>
>I understand the searching part, but is it possible to do things like
>the following with Lucene?
>
>* show me all the tags used by userid x
>* show me the userids for people who have tagged item y with tag z
>
>OG: +yes +yes ... or.... yes AND yes
>OG: stupid joke.
>
>And how do I update the data as users continue to tag items?
>
>OG: Lucene is a Java library with an API, so you use that API to update the Lucene index.
>
>Or do I have to maintain two copies of the data - one in the db and one
>in structured text files indexed by Lucene?
>
>OG: How and where you store the data is up to you. You could store everything in Lucene, but Lucene has no relations the way an RDBMS does.
>
>Or am I essentially living off a database based on text files?
>
>OG: Text files, no. Inverted indices, yes.
>
>My knowledge of Lucene is limited to text search so pardon the "stupid"
>questions.
>
>OG: I hear the dudes wrote a book about Lucene and provided free code. ;)
>
>Otis
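
To make the "inverted indices" answer concrete, here is a toy sketch of
the two lookups asked about above. It is not Lucene and does not use the
Lucene API; the TagIndex class and its field names (user, item, tag) are
assumptions made for illustration. Each tagging event is stored as one
flat record with no relations, every field=value term points at the
records containing it, and a query is an AND over required terms.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverted index over (userid, item, tag) tagging events."""

    def __init__(self):
        # Each (field, value) term maps to the ids of events containing it.
        self.postings = defaultdict(set)
        self.events = []  # stored events, addressed by position

    def add(self, userid, item, tag):
        """Index one new tagging event; this is the 'update' path."""
        event_id = len(self.events)
        self.events.append({"user": userid, "item": item, "tag": tag})
        for field, value in self.events[event_id].items():
            self.postings[(field, value)].add(event_id)

    def search(self, **terms):
        """Return events matching every field=value term (boolean AND)."""
        sets = [self.postings.get((f, v), set()) for f, v in terms.items()]
        ids = set.intersection(*sets) if sets else set()
        return [self.events[i] for i in sorted(ids)]


def tags_used_by(index, userid):
    """Show me all the tags used by userid x."""
    return sorted({e["tag"] for e in index.search(user=userid)})


def users_who_tagged(index, item, tag):
    """Show me the userids who tagged item y with tag z."""
    return sorted({e["user"] for e in index.search(item=item, tag=tag)})
```

New tagging events are simply added to the index as they happen, so there
is no separate set of text files to keep in sync.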
>
>ogjunk-tagdb at yahoo.com wrote:
>
>>Again, what Philipp said. Except for the MySQL full-text search piece. Don't go there, unless you LOVE large database files.
>>
>>Otis
>>
>>----- Original Message ----
>>From: Philipp Keller <phred at citrin.ch>
>>To: Joshua Lippiner <jlippiner at yahoo.com>
>>Cc: tagdb at lists.tagschema.com
>>Sent: Wednesday, March 22, 2006 12:08:09 PM
>>Subject: Re: [Tagdb] Tags and data storage
>>
>>>Does anyone have any recommendations/thoughts on tag storage? Are you
>>>better off storing the entire list of tags associated with one item
>>>in a single field and dealing with searching issues later, or do the
>>>benefits of storing each tag as a new DB row outweigh the size cost
>>>in the end?
>>>
>>I once wrote an article that compares different solutions to this
>>problem [1], and I also ran performance tests [2].
>>
>>As Nitin noted, you'll probably run into problems with the denormalized
>>version, and I suppose you won't save much space by choosing it.
>>
>>If you have only a small user base and database, then I think you
>>shouldn't use the 3NF solution, because it's hard to deal with the tag
>>orphans. The MySQL fulltext variant looks good in my eyes.
>>
>>About the scalability issue Nitin noted: if you have more than 1
>>million tagged entries you have to switch from an RDBMS to, say, Lucene
>>anyway, no matter which way you organize tags in your DB.
>>
>>greets
>>Philipp
>>
>>[1] http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html
>>[2] http://www.pui.ch/phred/archives/2005/06/tagsystems-performance-tests.html
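
For readers unfamiliar with the "tag orphans" problem mentioned above,
here is a small sketch of a normalized (3NF-style) layout using SQLite.
The table and column names (items, tags, tagmap) are assumptions chosen
for illustration and are not taken from the linked article. It shows that
removing the last use of a tag leaves a row behind in the tags table,
which then has to be found and cleaned up separately.

```python
import sqlite3

# Hypothetical three-table layout: one row per item, one row per tag,
# and one row per (item, tag) pairing.
SCHEMA = """
CREATE TABLE items  (id INTEGER PRIMARY KEY, url TEXT UNIQUE);
CREATE TABLE tags   (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE tagmap (item_id INTEGER REFERENCES items(id),
                     tag_id  INTEGER REFERENCES tags(id),
                     PRIMARY KEY (item_id, tag_id));
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def tag_item(conn, url, tag):
    """Create the item and tag rows if needed, then link them."""
    conn.execute("INSERT OR IGNORE INTO items (url) VALUES (?)", (url,))
    conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag,))
    conn.execute(
        "INSERT OR IGNORE INTO tagmap (item_id, tag_id) "
        "SELECT i.id, t.id FROM items i, tags t "
        "WHERE i.url = ? AND t.name = ?",
        (url, tag),
    )


def untag_item(conn, url, tag):
    """Remove only the link; the tags row is left in place."""
    conn.execute(
        "DELETE FROM tagmap "
        "WHERE item_id = (SELECT id FROM items WHERE url = ?) "
        "AND tag_id = (SELECT id FROM tags WHERE name = ?)",
        (url, tag),
    )


def orphan_tags(conn):
    """Tags that no item uses any more."""
    rows = conn.execute(
        "SELECT name FROM tags "
        "WHERE id NOT IN (SELECT tag_id FROM tagmap) ORDER BY name"
    )
    return [row[0] for row in rows]


def delete_orphan_tags(conn):
    conn.execute("DELETE FROM tags WHERE id NOT IN (SELECT tag_id FROM tagmap)")
```

The cleanup is a separate step that something has to run, which is the
maintenance cost being weighed against the denormalized and fulltext
variants.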
>>
>>_______________________________________________
>>Tagdb mailing list
>>Tagdb at lists.tagschema.com
>>http://lists.tagschema.com/mailman/listinfo/tagdb