Before I can build a system that automatically classifies text, I need to manually classify a bunch of samples to use as a training / evaluation set. Is there an existing tool that will let me manually tag thousands of items without much trouble? And if not, what is the fastest way to hack something together?
As an example, suppose you have a bunch of Twitter posts. You would like to put them in certain buckets: happy, sad, funny, angry and spam. Some posts belong in several buckets. You could simply dump everything into a file and insert tags with vi, but that is slow and error-prone. More importantly, having a nice interface means you can talk your colleagues into doing a ton of the work. Whether it is web, GUI, or console doesn't matter much, as long as it is fast and easy. Does anything like this exist?
I hope so, although I can't find anything with Google. If I need to build something, is there a good place to start? From poking around, my first impression is that Rails + jQuery + acts_as_taggable_on + jQuery Tokenizing Autocomplete looks fine, but I'm open to other options.
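To make the question concrete, here is a rough sketch of the console version of what I have in mind. It is only an illustration, not an existing tool: the one-letter keys, the function names `parse_keys` and `tag_items`, and the JSON-lines output format are all assumptions I made up for this example. Each item is shown once, you type one letter per bucket (several letters for several buckets), and the result is saved right away.

```python
import json

# Hypothetical key bindings: one letter per bucket from the example above.
BUCKETS = {"h": "happy", "s": "sad", "f": "funny", "a": "angry", "x": "spam"}


def parse_keys(keys):
    """Turn a string of single-letter keys into a sorted list of bucket names.

    Raises ValueError on an unknown key, so a typo is caught immediately
    instead of silently producing a bad label.
    """
    labels = set()
    for key in keys.strip().lower():
        if key.isspace():
            continue  # allow "h f" as well as "hf"
        if key not in BUCKETS:
            raise ValueError(f"unknown key: {key!r}")
        labels.add(BUCKETS[key])
    return sorted(labels)


def tag_items(items, read=input, out=None):
    """Prompt for labels on each item; return a list of {"text", "labels"} dicts.

    `read` is injectable so the loop can be driven by a script or a test.
    If `out` is a writable file object, each record is also written as one
    JSON line and flushed, so an interrupted session keeps finished items.
    """
    prompt = " ".join(f"[{k}]{v}" for k, v in BUCKETS.items()) + " > "
    records = []
    for text in items:
        print(text)
        while True:
            try:
                labels = parse_keys(read(prompt))
                break
            except ValueError as err:
                print(err)  # bad key: show the problem and ask again
        record = {"text": text, "labels": labels}
        records.append(record)
        if out is not None:
            out.write(json.dumps(record) + "\n")
            out.flush()
    return records
```

Something like `with open("labels.jsonl", "a") as f: tag_items(tweets, out=f)` would run it over a list of strings. It handles the multi-bucket case and the typo problem, but not the part I care about most, which is sharing the work with colleagues, so I would still prefer an existing tool or a web-based starting point.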