There are several improvements you can make.
1. Retraining for contextual sentiment analysis: Some features may be classified as positive in the context of a movie review but negative in the context of a product review, so you should retrain the model on data from your own context. Models can be retrained using the following command, using the PTB dataset format:
java -mx8g edu.stanford.nlp.sentiment.SentimentTraining -numHid 25 -trainPath train.txt -devPath dev.txt -train -model model.ser.gz
A good discussion of the training process can be found here.
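For reference, each line of train.txt and dev.txt is a binarized parse tree in PTB format whose nodes carry sentiment labels from 0 (very negative) to 4 (very positive). A minimal, hand-constructed example line (illustrative only, not taken from the actual treebank):

    (3 (2 It) (3 (2 works) (4 great)))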
2. Obtaining contextual training and testing data: Product review data can serve as both a training set and a test suite. Select reviews with extreme polarities (1 star and 5 stars) as your training data; to further improve its quality, restrict it to 1- and 5-star reviews that have been flagged as helpful by the community. From this data, build your PTB dataset with the reviews labeled POSITIVE and NEGATIVE (a NEUTRAL class would be difficult to derive from star ratings, as the 2-, 3- and 4-star reviews can introduce noise).
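A minimal sketch of that selection step in Python, assuming your raw reviews sit in a CSV with hypothetical columns stars, helpful_votes and text (adjust the names to your actual dataset):

    import csv

    # Keep only extreme-polarity reviews (1 and 5 stars) that the
    # community flagged as helpful, and label them for training.
    with open("reviews.csv", newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    labeled = []
    for row in rows:
        stars = int(row["stars"])            # hypothetical column name
        helpful = int(row["helpful_votes"])  # hypothetical column name
        if helpful < 1:
            continue  # skip reviews nobody found useful
        if stars == 1:
            labeled.append(("NEGATIVE", row["text"]))
        elif stars == 5:
            labeled.append(("POSITIVE", row["text"]))
        # 2-4 star reviews are dropped: they would mostly add noise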
3. Use 80% of your data set as a training set and 20% as a test set. 1-star reviews should mostly be classified as NEGATIVE and 5-star reviews mostly as POSITIVE. Once this holds, you can use the trained model to analyze the sentiment of other reviews; your sentiment score (say 0 for very negative up to 4 for very positive, or -1 for very negative and +1 for very positive) should correlate positively with the actual star rating attached to each review. If there is a sentiment mismatch, e.g. the review text comes out positive but carries a 1-star rating, you can log such cases and use them to improve your classification.
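A sketch of that evaluation loop, where reviews (a list of (star_rating, text) pairs) and predict_sentiment (a wrapper around your trained CoreNLP model returning a 0-4 score) are both hypothetical stand-ins for your own data and model:

    import random
    from statistics import correlation  # Pearson's r, Python 3.10+

    def evaluate(reviews, predict_sentiment):
        # 80/20 split into training and test sets
        random.shuffle(reviews)
        split = int(0.8 * len(reviews))
        train, test = reviews[:split], reviews[split:]
        # ... retrain the model on `train` with the command from step 1 ...

        stars, scores, mismatches = [], [], []
        for rating, text in test:
            score = predict_sentiment(text)  # hypothetical model wrapper
            stars.append(rating)
            scores.append(score)
            # Flag sentiment mismatches, e.g. positive text on a 1-star review
            if rating == 1 and score >= 3:
                mismatches.append(text)

        print("correlation with star ratings:", correlation(stars, scores))
        print("mismatched reviews to inspect:", len(mismatches))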
4. Improving results with other data sources and classifiers: VADER sentiment (in Python) is a very good classifier, specially tuned for social media and similar text such as product reviews. You can use it as a comparative classifier (to cross-check or get a second set of predictions alongside your CoreNLP results), and you can also use its Amazon review dataset, described here (a usage sketch follows the dataset description below):
amazonReviewSnippets_GroundTruth.txt FORMAT: tab delimited file with ID, MEAN-SENTIMENT-RATING and TEXT-SNIPPET
DESCRIPTION: includes 3,708 sentence-level snippets from 309 customer reviews on 5 different products. These reviews were originally used by Hu and Liu (2004); we added sentiment intensity ratings. The ID and MEAN-SENTIMENT-RATING correspond to the raw sentiment rating data provided in 'amazonReviewSnippets_anonDataRatings.txt' (described below).
amazonReviewSnippets_anonDataRatings.txt FORMAT: the file is tab delimited with ID, MEAN-SENTIMENT-RATING, STANDARD DEVIATION, and RAW-SENTIMENT-RATINGS
DESCRIPTION: sentiment ratings from a minimum of 20 independent human raters (all pre-screened, trained, and quality checked for optimal inter-rater reliability).
Datasets are available in the tgz file: https://github.com/cjhutto/vaderSentiment/blob/master/additional_resources/hutto_ICWSM_2014.tar.gz
The data follows the format reviewindex_part, polarity, review_snippet:
1_19 -0.65 the button was probably accidentally pushed to cause the black screen in the first place.
1_20 2.85 but, if you're looking for my opinion of the apex dvd player, i love it!
1_21 1.75 it practically plays almost everything you give it.
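To cross-check CoreNLP against VADER as suggested above, here is a minimal sketch using the vaderSentiment package (pip install vaderSentiment) on the ground-truth file; VADER's compound score ranges from -1 (most negative) to +1 (most positive), so compare trends rather than raw values against the human mean ratings:

    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()

    # amazonReviewSnippets_GroundTruth.txt is tab delimited:
    # ID, MEAN-SENTIMENT-RATING, TEXT-SNIPPET
    with open("amazonReviewSnippets_GroundTruth.txt", encoding="utf-8") as f:
        for line in f:
            snippet_id, mean_rating, text = line.rstrip("\n").split("\t", 2)
            compound = analyzer.polarity_scores(text)["compound"]
            print(snippet_id, mean_rating, compound)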