You can easily use regexp here - it works directly on cell arrays of strings:
    matching_lines = s(~cellfun('isempty', regexp(s, '^id: GO')));
    matching_lines{:}
    ans =
    id: GO:0008150
    ans =
    id: GO:0016740
This retrieves all lines starting with id: GO. Calling only ~cellfun('isempty', regexp(...)) gives you a logical 0/1 vector, where 1 means the corresponding string in s matches your query.
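To make the intermediate mask visible, here is a minimal sketch on a made-up cell array. The contents of s below are an assumption, since the actual input is not shown in this answer; the variable names hits and mask are only illustrative.

```matlab
% Hypothetical sample input (assumed); the real s comes from your own file.
s = {'[Term]'; ...
     'id: GO:0008150'; ...
     'name: biological_process'; ...
     'id: GO:0016740'; ...
     'is_a: GO:0003824'};

% On a cell array, regexp returns one cell per string:
% the start index of the match, or [] when there is no match.
hits = regexp(s, '^id: GO');

% Logical mask: true where the string matched the pattern.
mask = ~cellfun('isempty', hits);

% Parentheses keep the result as a cell array of the matching strings.
matching_lines = s(mask);
```

Indexing with parentheses, s(mask), returns a cell array you can store in one variable, while braces, s{mask}, expand to a comma-separated list of the individual strings.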
A similar line, with the pattern changed, finds the lines that contain 'is_a: GO:'.
Cutting unwanted characters from strings can also be done with regexp.
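As a rough sketch of both ideas, the snippet below filters the is_a lines and then strips everything except the identifier. The sample strings are assumed, not taken from the original data, and the cutting step uses regexprep (the replace counterpart of regexp), which is one possible way to do it rather than something this answer prescribes.

```matlab
% Hypothetical sample input (assumed, not from the original post).
s = {'id: GO:0016740'; ...
     'is_a: GO:0003824 ! catalytic activity'; ...
     'name: transferase activity'};

% Same masking idea as before, with the pattern changed to match is_a lines.
is_a_lines = s(~cellfun('isempty', regexp(s, 'is_a: GO:')));

% Cut the unwanted prefix and trailing text: the whole line is replaced
% by the first captured group ($1), i.e. just the GO identifier.
ids = regexprep(is_a_lines, '^is_a: (GO:\d+).*$', '$1');
```

With this input, ids holds only the identifier from the single is_a line.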
Extracting parts of the strings can be done using the 'tokens' regexp parameter:
    tok = regexp(s, '^id: (GO.*)', 'tokens');
    idx = ~cellfun('isempty', tok);
    v = cellfun(@(x)x{1}, {tok{idx}});
    sprintf('%s ', v{:})
    ans =
    GO:0008150 GO:0016740