Automatic categorization of content

I am developing a script that retrieves messages from the message archive of a meetup.com group I belong to: http://www.meetup.com/opencoffee/messages/archive/

The idea is to add them dynamically to a WordPress site and let people search the messages, auto-tag them, and so on.

I have a question about how best to classify these posts automatically. I would welcome any thoughts on how to approach this and on the most efficient way to program it.

Option 1

Find a source of tags for each subject area, such as finance, technology, or business, using the Delicious API, and look up related tags by topic:

http://delicious.com/tag/finance

http://delicious.com/tag/technology

If a message contains these tags, it is assigned to the corresponding category.

I believe this could work, but I am not sure what the most efficient method of scanning messages for these tags is.
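
One possible way to do the scanning, as a minimal Python sketch under stated assumptions: the tag lists below are made-up placeholders standing in for whatever the tag source returns, and `categorize` is a hypothetical helper name. Each message is tokenized once and every word is looked up in an inverted index from tag to category, so the cost per message grows with the message length rather than with the number of tags.

```python
import re
from collections import Counter

# Hypothetical tag lists; in practice these would come from the tag source.
CATEGORY_TAGS = {
    "finance": {"finance", "investment", "funding", "bank"},
    "technology": {"technology", "software", "startup", "web"},
}

# Inverted index: tag -> set of categories that use it.
TAG_INDEX = {}
for category, tags in CATEGORY_TAGS.items():
    for tag in tags:
        TAG_INDEX.setdefault(tag, set()).add(category)

WORD_RE = re.compile(r"[a-z0-9]+")

def categorize(message, min_hits=1):
    """Return categories ordered by number of tag occurrences, highest first."""
    hits = Counter()
    for word in WORD_RE.findall(message.lower()):
        for category in TAG_INDEX.get(word, ()):
            hits[category] += 1
    return [c for c, n in hits.most_common() if n >= min_hits]

print(categorize("Looking for funding from a bank for my software idea"))
```

Raising `min_hits` makes the assignment stricter, which may help with long messages that mention a tag only in passing. Multi-word tags would need a different tokenization than this single-word lookup.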

Option 2

Find sites that represent the categories I need, such as ft.com and The Economist for finance or TechCrunch for technology, then determine which tags people use for these sites and assume by default that those tags relate to the sites and their content.
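
A rough Python sketch of this idea, under stated assumptions: the per-site tag counts are invented sample data (not real Delicious figures), and `build_profiles` and `score` are hypothetical names. The tags of every reference site in a category are merged into one weighted profile, and a message is then scored against each profile.

```python
import re
from collections import Counter

# Assumed sample data: category -> site -> {tag: times users applied it}.
SITE_TAGS = {
    "finance": {
        "ft.com": {"finance": 50, "news": 30, "markets": 20},
        "economist.com": {"economics": 40, "finance": 25, "news": 35},
    },
    "technology": {
        "techcrunch.com": {"technology": 60, "startups": 30, "news": 10},
    },
}

def build_profiles(site_tags):
    """Merge per-site tag counts into one normalized weight map per category."""
    profiles = {}
    for category, sites in site_tags.items():
        totals = Counter()
        for tags in sites.values():
            totals.update(tags)
        grand_total = sum(totals.values())
        profiles[category] = {t: n / grand_total for t, n in totals.items()}
    return profiles

def score(message, profiles):
    """Sum the profile weight of every word in the message, per category."""
    words = re.findall(r"[a-z0-9]+", message.lower())
    return {
        category: sum(weights.get(w, 0.0) for w in words)
        for category, weights in profiles.items()
    }

profiles = build_profiles(SITE_TAGS)
print(score("startups and technology news", profiles))
```

Generic tags such as "news" appear in several profiles and so contribute to every category. Filtering out tags shared by all categories, or down-weighting them, is one option if they blur the results.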

Option 3

Use the URL http://semanticproxy.com/ (Reuters Calais) with the Open Calais API.

Calais API:

http://www.meetup.com/opencoffee/messages/6045615/

http://www.mashinteractive.com/opencoffee/calais.php

. , 1 2.

FYI 1700 , , 10 , 20 30 .

- Wordpress , , . , , , .


Zemanta, ( Wordpress) , , , RDFa, -, .


Source: https://habr.com/ru/post/1707601/

