[Tagdb] Tags and data storage

Nitin Borwankar nitin at borwankar.com
Fri Mar 24 21:11:12 GMT 2006


Erik Hatcher wrote:

>
> On Mar 24, 2006, at 1:17 PM, Nitin Borwankar wrote:
> [...]
>
>> Also what exactly is "partial indexing".
>
>
> Good question.  Perhaps it means that the data is stored in a  
> relational database in a normalized granular fashion, but indexed  
> into something like Lucene also for more sophisticated, faster, and  
> scalable querying.
>
I think that's a pattern that should be explored further. There are 
reasons to stick with an RDBMS for part of the data, especially tiered 
account management, resource quotas, etc., where record-based data makes 
sense, but to export periodically to one or more Lucene indexes, which 
can serve
a) RSS feeds for different tag combinations
b) combinations of raw text search and tag search - the search "tagged 
with x, y, z AND containing <some phrase>" seems to be a very useful 
feature to support.
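
A rough sketch of that split, under stated assumptions: SQLite stands in 
for the RDBMS, and a plain in-memory inverted index stands in for Lucene 
(it is not Lucene and does no tokenizing or ranking). The table names and 
the functions add_item, export_index and search are hypothetical, made up 
for illustration only.

```python
import sqlite3
from collections import defaultdict

# Hypothetical normalized schema: one row per item, per tag, per item/tag pair.
SCHEMA = """
CREATE TABLE items (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE item_tags (
    item_id INTEGER NOT NULL REFERENCES items(id),
    tag_id INTEGER NOT NULL REFERENCES tags(id),
    PRIMARY KEY (item_id, tag_id)
);
"""


def add_item(conn, body, tag_names):
    """Store one item and its tags in normalized form; return the item id."""
    item_id = conn.execute(
        "INSERT INTO items (body) VALUES (?)", (body,)
    ).lastrowid
    for name in tag_names:
        # Create the tag only if it does not exist yet, then look up its id.
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO item_tags (item_id, tag_id) VALUES (?, ?)",
            (item_id, tag_id),
        )
    return item_id


def export_index(conn):
    """Periodic export: rebuild the denormalized index from the database."""
    tag_index = defaultdict(set)  # tag name -> ids of items carrying that tag
    bodies = {}                   # item id -> lowercased text for phrase checks
    for item_id, body in conn.execute("SELECT id, body FROM items"):
        bodies[item_id] = body.lower()
    rows = conn.execute(
        "SELECT it.item_id, t.name "
        "FROM item_tags it JOIN tags t ON t.id = it.tag_id"
    )
    for item_id, name in rows:
        tag_index[name].add(item_id)
    return tag_index, bodies


def search(index, tag_names, phrase=None):
    """Return sorted ids of items tagged with ALL tag_names that also
    contain phrase (case-insensitive substring match), if one is given."""
    tag_index, bodies = index
    hits = set(bodies)
    for name in tag_names:
        hits &= tag_index.get(name, set())
    if phrase is not None:
        needle = phrase.lower()
        hits = {i for i in hits if needle in bodies[i]}
    return sorted(hits)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = add_item(conn, "Notes on Lucene index tuning", ["search", "java"])
second = add_item(conn, "Tagging schemas in MySQL", ["search", "sql"])
index = export_index(conn)
print(search(index, ["search"], "lucene"))
```

The database stays the system of record for accounts, quotas and the 
like, and the exported index can be thrown away and rebuilt at any time. 
The same tag intersection without a phrase would be the basis for a 
per-tag-combination RSS feed.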

 

> It's a shame the underpinnings of del.icio.us are obscure, though I  
> can appreciate why that is as well.
>
and of Flickr, Technorati, ..... ( Simpy? :-) )

More often than not the reasons for obscurity are that things have 
grown organically and may seem ugly to an outsider. There are usually 
historical reasons for the ugliness which the creator doesn't have time 
to explain in detail.  Nevertheless I would like to see more knowledge 
sharing on how to build tag backends from the people who build them.
I'd like to see open-sourced schemas and APIs for doing the same, along 
with the various trade-offs that people have made in practice: 
practitioners sharing more real-world experience on how things could be 
done better.

I am initiating a project on SourceForge called Tagomycin ( "cure for 
the common tag" :-) ), a "drop-in tagging framework" for applications 
that need to add tagging, or want to have a tagging framework built in 
from the start.  Schemas, APIs and language implementations will all 
be open sourced.  The first implementation will probably be in Python on 
MySQL, then probably PHP, then additional databases - PostgreSQL, 
SQLite3. I have an initial implementation in Java for a client that I 
cannot open source, but the schema is open source-able and someone else 
can look at my Python implementation and do a Java implementation if 
needed. Volunteers welcome.

Nitin Borwankar










>     Erik
>


