Removing special characters with Ruby but not spaces

I searched here for a while and did not find quite what I needed. I am learning Ruby (1.9) and trying to do something basic with a text file. I am trying to use a regex to strip non-letters, and to strip spaces ONLY at the beginning of a line, leaving the spaces between tokens alone (I am trying to count the words in a file, so I want the spaces between words to remain).

Example:

555 r6ub6y i7s e7a0sy... w1o2w4. 

To change to:

 ruby is easy... wow. 

What I have so far, which I test from the command line with ruby rubyfile.rb < test.txt:

    $stdin.each do |line|
      line = line.chomp.downcase   # non-bang: chomp! returns nil when there is nothing to remove
      line.gsub!(/[^a-zA-Z]/, "")  # this takes away my spaces!
      puts line
    end
2 answers
 [^a-zA-Z. ] 

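Add a space inside the character class so that spaces are no longer deleted (the period keeps the dots in your example).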
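A minimal sketch of that character class applied to the example line from the question. The helper name `clean_line` is hypothetical and only used for illustration.

```ruby
# Hypothetical helper: keep letters, periods, and spaces; delete everything else.
def clean_line(line)
  line.chomp.downcase.gsub(/[^a-zA-Z. ]/, "")
end

# The digits are gone, but the space that followed "555" is still there.
puts clean_line("555 r6ub6y i7s e7a0sy... w1o2w4.\n").inspect
```

Note that the leading space survives; call `strip` (or `lstrip`) on the result if it should go too.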


Since your example only requires removing digits, this works as a one-liner.

 "555 r6ub6y i7s e7a0sy... w1o2w4.".gsub(/\d/,'').strip #=>"ruby is easy... wow." 

It basically says: remove all digits, then strip leading and trailing whitespace.

Right now your regular expression says to delete everything except uppercase and lowercase letters. You did not say what other kinds of characters you want to delete, but something like this may work for you if you want to keep only upper- and lowercase letters, whitespace, and periods.

 "555 r6ub6y i7s e7a0sy... w1o2w4.".gsub(/[^a-zA-Z\s.]/,'').strip #=>"ruby is easy... wow." 
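Since the stated goal is counting words, here is a rough sketch of how the cleaned lines could feed a word count. The helper names `words_in` and `count_words` are made up for this example, and treating any run of whitespace as a word separator is an assumption.

```ruby
# Hypothetical helper: clean one line, then split it on runs of whitespace.
def words_in(line)
  line.downcase.gsub(/[^a-z\s.]/, "").strip.split(/\s+/)
end

# Hypothetical helper: total number of words across all lines.
def count_words(lines)
  lines.inject(0) { |total, line| total + words_in(line).size }
end

puts count_words(["555 r6ub6y i7s e7a0sy... w1o2w4.\n", "   \n"])
```

Periods stay attached to the neighbouring word ("easy..." counts as one word), and a blank line contributes nothing. To run it on a file as in the question, something like `count_words($stdin.each_line)` should do.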

Also, when including spaces in a regex, I always prefer \s over a literal space such as [ ], because I feel it is more readable: [a-zA-Z ] could be a typo that was never meant to contain a space, but [a-zA-Z\s] says very clearly that I want whitespace. Keep in mind that \s also matches tabs and newlines, not just the space character.
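To illustrate the difference between the two spellings, here is a small sketch on a tab-separated sample. The method names are hypothetical.

```ruby
# Hypothetical helpers: same negated class, once with a literal space, once with \s.
def keep_literal_space(text)
  text.gsub(/[^a-zA-Z ]/, "")
end

def keep_whitespace(text)
  text.gsub(/[^a-zA-Z\s]/, "")
end

sample = "r6ub6y\ti7s\te7a0sy"          # words separated by tabs, not spaces
puts keep_literal_space(sample).inspect # tabs are deleted, so the words run together
puts keep_whitespace(sample).inspect    # tabs are kept as separators
```

With a literal space the tabs are removed along with the digits, so which form is right depends on whether the input can contain other whitespace.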

If you want to learn more about regex, check out Rubular; it is a regular expression editor for Ruby, and I use it all the time. The only thing it doesn't really cover is greedy versus non-greedy matching, but I have a feeling that you don't need to worry about that right now.


Source: https://habr.com/ru/post/1202088/

