[Tagdb] Multi-Word Tags Vs Single Word Tags
Colin Viebrock
cviebrock at tucows.com
Fri Apr 7 14:29:27 GMT 2006
I understand your point, but in English the phrase is "hot dog".
Making users enter "hot_dog" or "hotdog" isn't user-centric. You're
forcing them to do something counter-intuitive.
What about allowing them to enter all their tags, separated by commas?
So, if on my recent trip to London I found a great hot dog restaurant,
I'd tag it:
hot dog, trip to london, excellent restaurant
You could then break that into the phrases:
hot dog
trip to london
excellent restaurant
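A minimal sketch of that first step, assuming the tags arrive as one
comma-separated string. The function name split_tags is only
illustrative, not part of any existing Tagdb code:

```python
def split_tags(raw):
    """Split a comma-separated tag string into trimmed, non-empty phrases."""
    phrases = []
    for part in raw.split(","):
        # Collapse runs of whitespace and trim both ends.
        phrase = " ".join(part.split())
        if phrase:
            phrases.append(phrase)
    return phrases


print(split_tags("hot dog, trip to london, excellent restaurant"))
```

Empty entries (for example from a trailing comma) are simply dropped.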
This is how the user wants to describe their document, so let them.
But let's try and add some "better" tagging to it. This is the tricky
part. :)
Look for any "stop words" in those phrases, and remove them. If the
stop word is in the middle of a phrase, break that phrase into its
parts. So the "to" in "trip to london" is removed, and the phrases
left are:
hot dog
trip
london
excellent restaurant
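The stop-word step could be sketched roughly like this. The STOP_WORDS
set below is a tiny made-up list for illustration; a real system would
need a proper one:

```python
# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "to", "of", "in", "on", "and"}


def remove_stop_words(phrases, stop_words=STOP_WORDS):
    """Drop stop words, breaking a phrase wherever one is removed."""
    result = []
    for phrase in phrases:
        current = []
        for word in phrase.split():
            if word.lower() in stop_words:
                # A stop word ends the current sub-phrase, if there is one.
                if current:
                    result.append(" ".join(current))
                    current = []
            else:
                current.append(word)
        if current:
            result.append(" ".join(current))
    return result


print(remove_stop_words(["hot dog", "trip to london", "excellent restaurant"]))
```

A stop word at the start or end of a phrase is just dropped, so
"a trip to london" also comes out as "trip" and "london".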
You could then break the phrases on whitespace, except if you are
breaking a compound word. This would require a lookup list of known
compound words that shouldn't be broken ... and therefore might be very
difficult to actually do.
Assuming you could, "hot dog" would stay, but "excellent restaurant"
would be split, leaving you:
hot dog
trip
london
excellent
restaurant
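Assuming such a lookup list exists, the whitespace split might look
something like the sketch below. COMPOUND_WORDS is a made-up one-entry
list; building and maintaining the real list is the hard part:

```python
# Hypothetical list of known compound words that must not be broken.
COMPOUND_WORDS = {"hot dog"}


def split_phrases(phrases, compounds=COMPOUND_WORDS):
    """Split phrases on whitespace, keeping known compound words intact."""
    tags = []
    for phrase in phrases:
        words = phrase.split()
        i = 0
        while i < len(words):
            matched = False
            # Try the longest run of two or more words starting at i first.
            for j in range(len(words), i + 1, -1):
                candidate = " ".join(words[i:j])
                if candidate.lower() in compounds:
                    tags.append(candidate)
                    i = j
                    matched = True
                    break
            if not matched:
                tags.append(words[i])
                i += 1
    return tags


print(split_phrases(["hot dog", "trip", "london", "excellent restaurant"]))
```

Matching the longest run first means a compound inside a longer phrase
(say "great hot dog stand") is still kept together.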
Personally, I'd wonder about the usefulness of "excellent" as a tag
when it's out of context. But this was just a mind exercise for me
anyway. :)
- Colin
On 6-Apr-06, at 11:29 AM, anand wrote:
>>> Yeah, but what about someone who wants to tag a document with "hot
>>> dog"?
> This is exactly what I meant when I wrote the following:
> "I would use 'java_rmi' only in case the two words are inseparable and
> have no meaning independently."
>
> 'hot dog' is an entirely different entity which is formed by combining
> hot and dog. Therefore the user would be inclined to use it as a single
> word of the form 'hotdog' or 'hot_dog'. But 'london' 'trip' is formed by
> two different words having the same semantics when used in a multi-word
> tag or even as single words.
>
>
> On 4/6/06, Colin Viebrock <colin at tucows.com> wrote:
>>
>>> For instance if I have to tag a document with tags java and rmi, I
>>> would rarely go ahead and tag it as 'java_rmi' but rather I would tag
>>> it as 'java' and 'rmi'. I would use 'java_rmi' only in case the two
>>> words are inseparable and have no meaning independently.
>>>
>>> Now with multi-word tags users can use entire phrases like 'a trip to
>>> london' to tag items which they would have tagged as 'london' 'trip'
>>> in case of single word tags. This makes the tagged item difficult to
>>> discover, making the tag and thus the user to tagged item relationship
>>> non-social (quite opposite to what a tag is supposed to do).
>>
>> Yeah, but what about someone who wants to tag a document with "hot
>> dog"?
>> - Colin
>>
>
>
> --
> - Andie