How to load an .arff file in MATLAB

Is there a package for loading .arff files into MATLAB? The .arff format is used by Weka to run machine learning algorithms.

+6
7 answers

Yes, there are several MATLAB interfaces for WEKA files on the MATLAB File Exchange. I usually use this one: http://www.mathworks.com/matlabcentral/fileexchange/21204-matlab-weka-interface , which provides saveARFF() and loadARFF().

+4

Since Weka is a Java library, you can directly use the API it provides to read ARFF files:

    %## paths
    WEKA_HOME = 'C:\Program Files\Weka-3-7';
    javaaddpath([WEKA_HOME '\weka.jar']);
    fName = [WEKA_HOME '\data\iris.arff'];

    %## read file
    loader = weka.core.converters.ArffLoader();
    loader.setFile( java.io.File(fName) );
    D = loader.getDataSet();
    D.setClassIndex( D.numAttributes()-1 );

    %## dataset
    relationName = char(D.relationName);
    numAttr = D.numAttributes;
    numInst = D.numInstances;

    %## attributes
    %# attribute names
    attributeNames = arrayfun(@(k) char(D.attribute(k).name), 0:numAttr-1, 'Uni',false);

    %# attribute types
    types = {'numeric' 'nominal' 'string' 'date' 'relational'};
    attributeTypes = arrayfun(@(k) D.attribute(k-1).type, 1:numAttr);
    attributeTypes = types(attributeTypes+1);

    %# nominal attribute values
    nominalValues = cell(numAttr,1);
    for i=1:numAttr
        if strcmpi(attributeTypes{i},'nominal')
            nominalValues{i} = arrayfun(@(k) char(D.attribute(i-1).value(k-1)), 1:D.attribute(i-1).numValues, 'Uni',false);
        end
    end

    %## instances
    data = zeros(numInst,numAttr);
    for i=1:numAttr
        data(:,i) = D.attributeToDoubleArray(i-1);
    end

    %## visualize data
    parallelcoords(data(:,1:end-1), ...
        'Group',nominalValues{end}(data(:,end)+1), ...
        'Labels',attributeNames(1:end-1))
    title(relationName)

[Figure: parallel coordinates plot produced by the code above]

You can even use Weka's functionality directly from MATLAB. Example:

    %## classification
    classifier = weka.classifiers.trees.J48();
    classifier.buildClassifier( D );

    fprintf('Classifier: %s %s\n%s', ...
        char(classifier.getClass().getName()), ...
        char(weka.core.Utils.joinOptions(classifier.getOptions())), ...
        char(classifier.toString()) )

The resulting C4.5 decision tree:

    Classifier: weka.classifiers.trees.J48 -C 0.25 -M 2
    J48 pruned tree
    ------------------

    petalwidth <= 0.6: Iris-setosa (50.0)
    petalwidth > 0.6
    |   petalwidth <= 1.7
    |   |   petallength <= 4.9: Iris-versicolor (48.0/1.0)
    |   |   petallength > 4.9
    |   |   |   petalwidth <= 1.5: Iris-virginica (3.0)
    |   |   |   petalwidth > 1.5: Iris-versicolor (3.0/1.0)
    |   petalwidth > 1.7: Iris-virginica (46.0/1.0)

    Number of Leaves : 5

    Size of the tree : 9

+8

If you only want to load a file stored in the ARFF format into MATLAB and do not need any other functionality from Weka, simply delete the header part of the file (the attribute definitions) and save the rest as a CSV file, replacing any nominal class values with numeric equivalents. Then use MATLAB's built-in csvread function. This way there is no need to look for a third-party package.
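The manual header-stripping step can also be done in code. The following is a minimal sketch, not a general ARFF parser: it assumes a dense ARFF file with purely numeric attributes and no missing values ('?'). The temporary demo file and all variable names are made up for illustration.

```matlab
% Write a tiny numeric ARFF file so the sketch is self-contained
% (the file contents are hypothetical demo data).
fName = [tempname '.arff'];
fid = fopen(fName, 'w');
fprintf(fid, '@relation demo\n@attribute x numeric\n@attribute y numeric\n@data\n1.5,2\n3,4.25\n');
fclose(fid);

% Skip the header: everything up to and including the @data line.
fid = fopen(fName, 'r');
txt = fgetl(fid);
while ischar(txt) && ~strncmpi(strtrim(txt), '@data', 5)
    txt = fgetl(fid);
end

% Parse the remaining comma-separated rows (numeric values only).
rows = {};
txt = fgetl(fid);
while ischar(txt)
    txt = strtrim(txt);
    if ~isempty(txt) && txt(1) ~= '%'          % skip blank lines and ARFF comments
        rows{end+1} = sscanf(txt, '%f,').';    %#ok<AGROW> one row vector per line
    end
    txt = fgetl(fid);
end
fclose(fid);
delete(fName);

data = vertcat(rows{:});   % numInst-by-numAttr matrix
```

Nominal attributes would make sscanf stop at the first non-numeric field, so files containing class labels still need the labels replaced by numbers first.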

+2
 M = importdata('filename.arff'); 

This is very slow for large files, but it works (tested in MATLAB R2010b).

+2

Searching the MATLAB Central File Exchange shows some possibilities. In particular, the submissions by Durga Lal Shrest and Gerald Augusto Corzo Perez look promising, although I have not tried them.

0

If the above methods do not work and header information is required, load the ARFF file into Weka, then choose the "Save as" option and save the data in CSV format.
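Once the data is in a CSV file with a header row, it can be read on the MATLAB side roughly as follows. This is a sketch under assumptions: it relies on readtable (available from R2013b), and the demo file with its column names petalwidth and label is hypothetical, not the exact layout Weka writes.

```matlab
% Write a small CSV with a header row so the sketch is self-contained
% (column names and values are hypothetical demo data).
fName = [tempname '.csv'];
fid = fopen(fName, 'w');
fprintf(fid, 'petalwidth,label\n0.2,Iris-setosa\n1.3,Iris-versicolor\n0.3,Iris-setosa\n');
fclose(fid);

% The header row becomes the table's variable names.
T = readtable(fName);
delete(fName);

% Numeric column as a plain double vector.
X = T.petalwidth;

% Map the nominal labels to 1-based integer codes;
% classNames{classIdx(i)} recovers the original label of row i.
[classNames, ~, classIdx] = unique(T.label);
```

Keeping classNames alongside classIdx preserves the mapping that is lost when labels are replaced by numbers by hand.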

0

Just delete all the lines starting with "@" at the beginning of the .arff file, then save it in .txt format. After that, all you have to do is drag it into the workspace or import it. It works for me every time.

0

Source: https://habr.com/ru/post/894356/

