Evaluate sample class with WEKA

I trained a model in Weka using the SMO algorithm, and I am trying to classify a test sample with it; my problem has two classes. I am a little confused about how to evaluate a single sample using the Weka SMO code. I created an empty ARFF file that contains only the file metadata. I compute the sample's features and add the resulting vector to the data loaded from that file. I wrote the following Evaluate function to evaluate a sample. template.arff is the template containing the ARFF metadata, and smo is my saved model.

    public static void Evaluate(ArrayList<Float> temp) throws Exception {
        temp.add(Float.parseFloat("1"));
        System.out.println(temp.size());
        double dt[] = new double[temp.size()];
        for (int index = 0; index < temp.size(); index++) {
            dt[index] = temp.get(index);
        }
        double data[][] = new double[1][];
        data[0] = dt;
        weka.classifiers.Classifier c = loadModel(new File("models/"), "/smo"); // loads smo model
        File tmp = new File("template.arff"); // loads data template
        Instances dataset = new weka.core.converters.ConverterUtils.DataSource(tmp.getAbsolutePath()).getDataSet();
        int numInstances = data.length;
        for (int inst = 0; inst < numInstances; inst++) {
            dataset.add(new Instance(1.0, data[inst]));
        }
        dataset.setClassIndex(dataset.numAttributes() - 1);
        Evaluation eval = new Evaluation(dataset);
        // returned evaluated index
        double a = eval.evaluateModelOnceAndRecordPrediction(c, dataset.instance(0));
        double arr[] = c.distributionForInstance(dataset.instance(0));
        System.out.println(" Confidence Scores");
        for (int idx = 0; idx < arr.length; idx++) {
            System.out.print(arr[idx] + " ");
        }
        System.out.println();
    }

I'm not sure I'm on the right track. I build the sample's instance and then load my model. I wonder whether this code is what I need to classify the sample temp. If the code is fine, how can I extract a confidence score rather than a binary class decision? The structure of the template.arff file:

    @relation Dataset
    @attribute Attribute0 numeric
    @attribute Attribute1 numeric
    @attribute Attribute2 numeric
    ...
    @ATTRIBUTE class {1, 2}
    @data

In addition, the loadModel function is as follows:

    public static SMO loadModel(File path, String name) throws Exception {
        SMO classifier;
        FileInputStream fis = new FileInputStream(path + name + ".model");
        ObjectInputStream ois = new ObjectInputStream(fis);
        classifier = (SMO) ois.readObject();
        ois.close();
        return classifier;
    }

I found this post here, which suggests locating the SMO.java file and changing the last boolean argument from false to true in the following line: smo.buildClassifier(train, cl1, cl2, true, -1, -1); However, when I did this, I still got the same binary output.

My training function:

    public void weka_train(File input, String[] options) throws Exception {
        long start = System.nanoTime();
        File tmp = new File("data.arff");
        TwitterTrendSetters obj = new TwitterTrendSetters();
        Instances data = new weka.core.converters.ConverterUtils.DataSource(
                tmp.getAbsolutePath()).getDataSet();
        data.setClassIndex(data.numAttributes() - 1);
        Classifier c = null;
        String ctype = null;
        boolean newmodel = false;
        ctype = "SMO";
        c = new SMO();
        for (int i = 0; i < options.length; i++) {
            System.out.print(options[i]);
        }
        c.setOptions(options);
        c.buildClassifier(data);
        newmodel = true;
        if (newmodel) {
            obj.saveModel(c, ctype, new File("models"));
        }
    }
2 answers

Basically, you should try the “-M” option for SMO so that logistic models are fitted during training. Check out the proposed solution here. It should work!
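
For intuition about what fitting a logistic model changes: as far as I understand it, the raw SVM output is passed through a fitted sigmoid, so you get a probability instead of a hard 0/1 decision. Below is a minimal sketch of that idea in plain Java. It is not Weka's actual code; the class name and the slope and intercept values are made up for illustration (in practice they would be fitted on the training data).

```java
final class SigmoidCalibration {

    /** Maps a raw SVM decision value to a probability for the positive class. */
    static double probability(double svmOutput, double slope, double intercept) {
        return 1.0 / (1.0 + Math.exp(-(slope * svmOutput + intercept)));
    }

    /** Two-class distribution in the order {negative, positive}. */
    static double[] distribution(double svmOutput, double slope, double intercept) {
        double p = probability(svmOutput, slope, intercept);
        return new double[] {1.0 - p, p};
    }

    public static void main(String[] args) {
        double slope = 2.0;     // hypothetical value, normally fitted from data
        double intercept = 0.0; // hypothetical value, normally fitted from data
        for (double output : new double[] {-1.5, 0.0, 0.3, 1.5}) {
            double[] dist = distribution(output, slope, intercept);
            System.out.printf("output=%5.2f -> [%.3f, %.3f]%n", output, dist[0], dist[1]);
        }
    }
}
```

Outputs close to the decision boundary land near 0.5, and outputs far from it approach 0 or 1, which is the graded confidence the question asks for.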


I have some suggestions, but I have no idea if they will work. Let me know if this works for you.

First, work with the SMO class directly rather than the parent Classifier type. As an example, I created a new loadModelSMO method.

    public static SMO loadModelSMO(File path, String name) throws Exception {
        SMO classifier;
        FileInputStream fis = new FileInputStream(path + name + ".model");
        ObjectInputStream ois = new ObjectInputStream(fis);
        classifier = (SMO) ois.readObject();
        ois.close();
        return classifier;
    }

and then

 SMO c = loadModelSMO(new File("models/"), "/smo"); ... 
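
A side note on the loading code: the stream is only closed if readObject succeeds. Here is a sketch of the same pattern with try-with-resources, which closes the stream either way. It uses a String as a stand-in for the model so that it runs without Weka; ModelIo, save and load are names I made up for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ModelIo {

    /** Serializes any Serializable object to the given file. */
    static void save(Serializable model, File file) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream(file))) {
            oos.writeObject(model);
        }
    }

    /** Reads an object back and checks that it has the expected type. */
    static <T> T load(File file, Class<T> type) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(file))) {
            // cast() throws ClassCastException if the file holds a different type
            return type.cast(ois.readObject());
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("demo", ".model");
        file.deleteOnExit();
        save("stand-in for a trained model", file);
        String restored = load(file, String.class);
        System.out.println(restored);
    }
}
```

With Weka on the classpath, the same load method should work for the real model by passing SMO.class as the type, assuming the file was written with standard Java serialization as in the question.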

I found a mailing-list thread that might help you, titled "I used SMO with logistic regression, but I always get confidence in 1.0".

It suggests using -M to fit logistic models, which you can pass in through the method

 setOptions(java.lang.String[] options) 

Also, you may need to set the build-logistic-models flag to true (see Confidence in SMO):

 c.setBuildLogisticModels(true); 
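
Once the distribution is no longer just 0 and 1, the array returned by distributionForInstance can be read as one confidence per class, in the order the labels are declared in the ARFF header ({1, 2} in the question). Here is a small sketch of picking the predicted label and its confidence from such an array. It is plain Java with no Weka dependency, and the class name, method names and sample numbers are made up for illustration.

```java
final class ConfidenceReader {

    /** Index of the largest entry; ties go to the earlier index. */
    static int argMax(double[] distribution) {
        if (distribution.length == 0) {
            throw new IllegalArgumentException("empty distribution");
        }
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Pairs the most likely label with its confidence score. */
    static String describe(double[] distribution, String[] labels) {
        if (distribution.length != labels.length) {
            throw new IllegalArgumentException("one label per distribution entry expected");
        }
        int best = argMax(distribution);
        return "class " + labels[best] + " with confidence " + distribution[best];
    }

    public static void main(String[] args) {
        String[] labels = {"1", "2"};          // label order from the ARFF header
        double[] distribution = {0.27, 0.73};  // made-up example scores
        System.out.println(describe(distribution, labels));
    }
}
```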

Let me know if this helped at all.


Source: https://habr.com/ru/post/1210392/

