How to parse a buffer correctly in Elisp?

What is the proper way to parse a buffer so that its contents can be stored and reused?

Let's say I have this buffer:

 always|five|words|by|line
 not|always|the|same|words
 sometimes|no|lines|at|all
 but|only|five|no|more/less

What would be the best approach to building a list of the words found on each line (and failing gracefully if none are found)?

The buffer exists, I can visit it, and I can get its contents this way:

 (message "Buffer content : %s" (buffer-substring (point-min) (point-max))) 

and kill it cleanly afterwards, but for some reason I can't build an object (a list of "lines", each a list of "words") that would allow me to do this:

 (list-length lines) ==> 4
 (car (nthcdr 3 lines)) ==> sometimes

Could a kind soul point me toward the light? Thank you for your patience, Lisp Elders.

+4
3 answers

You can also use the built-in split-string function, similar to split in Perl and other languages:

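 (defun buffer-to-list-of-lists (buf)
   "Return the lines of BUF as a list of lists of `|'-separated words."
   (with-current-buffer buf
     (save-excursion
       (goto-char (point-min))
       (let ((lines '()))
         (while (not (eobp))
           ;; Split the current line on literal `|' and collect the words.
           (push (split-string
                  (buffer-substring (point) (line-end-position))
                  "|")
                 lines)
           ;; Move to the beginning of the next line.
           (beginning-of-line 2))
         ;; `push' builds the list backwards, so restore buffer order.
         (nreverse lines)))))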

Then, with your sample text in a buffer named temp, (buffer-to-list-of-lists "temp") returns the value

 (("always" "five" "words" "by" "line")
  ("not" "always" "the" "same" "words")
  ("sometimes" "no" "lines" "at" "all")
  ("but" "only" "five" "no" "more/less"))

This works with lines containing any number of |-separated words, which may or may not suit your application better. Change buffer-substring to buffer-substring-no-properties if you do not want the strings in the list of lists to carry the font information and other text properties they had in the original buffer.
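If you do want to insist on exactly five words per line and get an error otherwise, a stricter variant could look like the sketch below. The function name my-parse-five-word-lines and the wording of the error message are hypothetical, not part of the answer above. It uses buffer-substring-no-properties so that the returned strings carry no text properties.

```elisp
;; Hypothetical helper, not from the original answer: parse a buffer into
;; a list of lists and signal an error on any malformed line.
(defun my-parse-five-word-lines (&optional buffer)
  "Return BUFFER's lines as lists of exactly five `|'-separated words.
BUFFER defaults to the current buffer.  Signal an error for any
line that does not contain exactly five fields."
  (with-current-buffer (or buffer (current-buffer))
    (save-excursion
      (goto-char (point-min))
      (let ((lines '()))
        (while (not (eobp))
          (let* ((text (buffer-substring-no-properties
                        (line-beginning-position) (line-end-position)))
                 ;; "|" is a literal bar in an Emacs regexp.
                 (words (split-string text "|")))
            (unless (= (length words) 5)
              (error "Line %d has %d fields, expected 5"
                     (line-number-at-pos) (length words)))
            (push words lines))
          (forward-line 1))
        ;; `push' collected the lines in reverse order.
        (nreverse lines)))))
```

Wrapping the call in condition-case lets the caller decide how to report a malformed buffer instead of aborting.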

Once this works the way you want, you will also need to change the usage example from (list-length '(lines)) to (list-length lines). The quoted form asks for the length of a constant one-element list containing only the symbol lines.
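To make the access pattern concrete, here is a small sketch that works on the value shown above. The variable name my-sample-lines is hypothetical and only stands in for whatever the parser returned.

```elisp
;; Hypothetical variable holding the parsed sample buffer.
(defvar my-sample-lines
  '(("always" "five" "words" "by" "line")
    ("not" "always" "the" "same" "words")
    ("sometimes" "no" "lines" "at" "all")
    ("but" "only" "five" "no" "more/less"))
  "Sample list of lines, each a list of words.")

(length my-sample-lines)          ; => 4
(nth 2 my-sample-lines)           ; => ("sometimes" "no" "lines" "at" "all")
(car (nth 2 my-sample-lines))     ; => "sometimes"
(car (nthcdr 3 my-sample-lines))  ; => ("but" "only" "five" "no" "more/less")
```

Indices start at 0, so the line beginning with "sometimes" is (nth 2 ...), and (car (nthcdr 3 ...)) returns the whole fourth line rather than a single word.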

+7

Here is a simple regexp parser that may be a useful starting point for what you want:

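 (let (lines)
   ;; Parsing starts at the current line and runs to the end of the buffer.
   (beginning-of-line)
   (while (not (eobp))
     ;; Match five non-empty `|'-separated fields at the start of the line.
     (push (if (looking-at "\\([^|\n]+\\)|\\([^|\n]+\\)|\\([^|\n]+\\)|\\([^|\n]+\\)|\\([^|\n]+\\)")
               (list (match-string-no-properties 1)
                     (match-string-no-properties 2)
                     (match-string-no-properties 3)
                     (match-string-no-properties 4)
                     (match-string-no-properties 5))
             ;; Lines that do not match are recorded as the symbol `no-match'.
             'no-match)
           lines)
     (forward-line 1))
   (setq lines (nreverse lines))
   (print lines))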
+2

Suppose the variable text contains the contents of your buffer as a string, as in Jon O's answer. Then use the list API from dash.el and the string functions from s.el:

 (--map (s-split "|" it) (s-lines text)) 

--map is the anaphoric version of -map: it exposes the current element as the temporary variable it, so you do not need to pass an anonymous function. s-split is a simple wrapper around split-string, and s-lines splits a string into lines.
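If you would rather not depend on external packages, roughly the same result can be sketched with built-in functions only. The function name my-text-to-list-of-lists is hypothetical, and dropping empty lines (the t argument) is a choice made for this sketch, not something the one-liner above is known to do.

```elisp
;; Hypothetical helper; uses only built-in functions.
(defun my-text-to-list-of-lists (text)
  "Split TEXT into lines, then split each line on `|'."
  (mapcar (lambda (line) (split-string line "|"))
          ;; The final t omits empty strings, e.g. after a trailing newline.
          (split-string text "\n" t)))

(my-text-to-list-of-lists
 "always|five|words|by|line\nnot|always|the|same|words\n")
;; => (("always" "five" "words" "by" "line")
;;     ("not" "always" "the" "same" "words"))
```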

+2

Source: https://habr.com/ru/post/1403453/

