Extract specific data from a text file

I have a txt file that looks like this in Notepad++:

/a/apple 1
/b/bat 10
/c/cat 22
/d/dog 33
/h/human/female 34

Now I want to extract everything after the second slash and before the number at the end. So I want:

out = {'apple'; 'bat'; 'cat'; 'dog'; 'human/female'}

I wrote this code:

file= fopen('file.txt');
out=  textscan(file,'%s','Delimiter','\n');
fclose(file);

It gives:

out =
   {365×1 cell}

out{1} = 

    '/a/apple 1'
    '/b/bat 10'
    '/c/cat 22'
    '/d/dog 33'
    '/h/human/female 34'

How can I get the required output from the text file (directly, if possible)? Or is there a regular expression I could use if getting the desired result directly is not possible?

+4
3 answers

You can get the desired result directly from textscan without further processing:

file = fopen('file.txt');
out = textscan(file, '/%c/%s %d');
fclose(file);
out = out{2}

out =

  5×1 cell array

    'apple'
    'bat'
    'cat'
    'dog'
    'human/female'

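The format specifier matches the two literal slashes and reads the single character between them (%c), then reads the string up to the next whitespace (%s), and finally reads the trailing integer (%d). The strings end up in the second cell of the output, which is why the last line takes out{2}.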
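To see what each conversion specifier captures, the same format can be applied to a single line passed as a character vector instead of a file identifier. This is only a minimal sketch; the variable names sample, parts, letter, name, and number are illustrative and not part of the original answer.

```matlab
% One sample line from the question, passed to textscan as text
sample = '/h/human/female 34';
parts = textscan(sample, '/%c/%s %d');

letter = parts{1};     % character read by %c
name   = parts{2}{1};  % %s returns a cell array, so unwrap its first element
number = parts{3};     % integer read by %d
```

Since %s stops at whitespace rather than at a slash, the name keeps any slashes it contains.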

+4

If you would rather use a regular expression, you can use regexp in MATLAB like this:

% Your code
file= fopen('file.txt');
out =  textscan(file,'%s','Delimiter','\n');
fclose(file);

% Proposed changes
out = regexp(out{1}, '/\w*/(.+)\s', 'tokens', 'once');
out = [out{:}].';

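After textscan has read the lines into a cell array, pass that array to regexp. The pattern breaks down as follows:

  • / - matches the first slash.

  • \w*/ - matches zero or more word characters (letters, digits, or underscores) followed by the second slash.

  • (.+) - a capture group that matches one or more of any character (. matches any character). This is the part to extract.

  • \s - matches a whitespace character.

Because (.+) is greedy, it runs up to the last whitespace character on the line, so slashes inside the name are kept and the trailing number is left out.

The third argument, 'tokens', tells regexp to return the text captured by the parentheses instead of the whole match, and 'once' stops after the first match in each string. Each element of the result is itself a one-element cell, so the last line concatenates them and transposes the result into a column cell array.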

With this, the result is:

>> out

out =

  5×1 cell array

    'apple'
    'bat'
    'cat'
    'dog'
    'human/female'

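regexp accepts a cell array of character vectors and applies the pattern to each element, so there is no need to wrap it in cellfun.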
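The nesting that 'tokens' produces is easiest to see on a single string first. The following is a minimal sketch using lines from the question; the variable names str, tok, name, lines, toks, and names are illustrative only.

```matlab
str = '/h/human/female 34';

% With 'tokens' and 'once', regexp returns a cell array with one entry per capture group
tok = regexp(str, '/\w*/(.+)\s', 'tokens', 'once');
name = tok{1};   % text captured by (.+)

% With a cell array input, each element of the result is such a token cell
lines = {'/a/apple 1'; '/b/bat 10'};
toks = regexp(lines, '/\w*/(.+)\s', 'tokens', 'once');
names = [toks{:}].';   % flatten into a 2x1 cell array of character vectors
```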

+3

Another option is to read the lines first and then parse each one separately with textscan:

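file = fopen('file.txt');
% Read the file line by line into a cell array
out = textscan(file, '%s', 'Delimiter', '\n');
fclose(file);
% Parse each line on its own with the same format specifier
parsed = cellfun(@(x) textscan(x, '/%c/%s %d'), out{1}, 'uniformoutput', false);
% Keep the %s field; it is a one-element cell, so unwrap it
parsed = cellfun(@(x) x{2}{1}, parsed, 'uniformoutput', false);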
+1

Source: https://habr.com/ru/post/1683773/

