Remove all non-alphabetic, non-numeric characters from a string?

If I wanted to remove characters such as . ! , ' " ^ - # from an array of strings, how would I do this while preserving all the alphanumeric characters?

Valid alphabetic characters must also include letters with diacritics, such as à or ç.

+6
5 answers

You should use a regular expression with the appropriate character property. In this case, you can negate the Alnum property (alphabetic and numeric characters):

 "◊¡ Marc-André !◊".gsub(/\p{^Alnum}/, '') # => "MarcAndré" 

For more complex cases, say you also wanted to keep punctuation, you can build a set of valid characters, for example:

 "◊¡ Marc-André !◊".gsub(/[^\p{Alnum}\p{Punct}]/, '') # => "¡Marc-André!" 

For the full list of character properties, refer to Ruby's Regexp documentation.
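
As a rough sketch of the "set of valid characters" idea, the allowed set can be built from `\p{Alnum}` plus a few extra literal characters. The helper name `keep_alnum` and its `extra:` keyword are made up for illustration and are not part of Ruby:

```ruby
# Hypothetical helper (not a Ruby built-in): strip everything except
# alphanumerics and any characters listed in `extra`.
def keep_alnum(str, extra: "")
  # Regexp.escape keeps characters such as "-" or "]" literal inside the class.
  str.gsub(/[^\p{Alnum}#{Regexp.escape(extra)}]/, "")
end

keep_alnum("Marc-Andr\u00E9 42!")              # => "MarcAndré42"
keep_alnum("Marc-Andr\u00E9 42!", extra: "- ") # => "Marc-André 42"
```

Escaping the extra characters is a deliberate choice here: without it, a hyphen in the middle of `extra` could be read as a range inside the character class.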

+17

 string.gsub(/[^[:alnum:]]/, "") 
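
To see why the POSIX bracket class suits the question's diacritics requirement, here is a small comparison against two ASCII-only alternatives. The sample string and variable names are assumptions chosen for illustration:

```ruby
s = "\u00E0\u00E7-1_!"  # "àç-1_!"

posix = s.gsub(/[^[:alnum:]]/, "")  # => "àç1" (Unicode letters are kept)
ascii = s.gsub(/[^a-zA-Z0-9]/, "")  # => "1"   (accented letters are stripped)
word  = s.gsub(/\W/, "")            # => "1_"  (\w is ASCII-only and includes "_")
```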
+3

The following will work for an array:

 z = ['asfdå', 'b12398!', 'c98347']
 z.each { |s| s.gsub!(/[^[:alnum:]]/, '') }
 puts z.inspect

I borrowed the regex from Jeremy's suggestion.

+3

You might consider using a regular expression.

http://www.regular-expressions.info/ruby.html

I assume you are using Ruby, since you noted this in your post. You can iterate over the array, test each string against a regular expression, and delete or keep it depending on whether it matches.

The regular expression you can use might look something like this:

 [^.!,^#-] 

This matches any character that is not one of those inside the brackets. The hyphen is placed last so that it is read literally rather than as a range. However, I suggest you read up on regular expressions; you may find a better solution once you know their syntax and usage.
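
To make the explicit-list idea concrete, here is a small sketch that removes only the characters named in the question. The class is not negated here, so `gsub` deletes exactly the listed characters; the constant and variable names are illustrative assumptions:

```ruby
# Characters to remove, taken from the question: . ! , ' " ^ - #
# The hyphen is placed last so it is literal rather than a range operator.
UNWANTED = /[.!,'"^#-]/

words = ["b12398!", "it's", "y\u00F6w-za#"]
cleaned = words.map { |s| s.gsub(UNWANTED, "") }
# => ["b12398", "its", "yöwza"]
```

Anything not on the list (other symbols, whitespace) is left alone, which is the main limitation of this approach compared with a negated alphanumeric class.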

+1

If you really have an array (as you state) and it is an array of strings (as I assume), for example:

 foo = [ "hello", "42 cats!", "yöwza" ] 

then I imagine that you either want to update each string in the array with a new value, or you want a modified array that contains only certain strings.

If the former (you want to "clean" each string in the array), you can do one of the following:

 foo.each { |s| s.gsub!(/\p{^Alnum}/, '') }     # Change every string in place…
 bar = foo.map { |s| s.gsub(/\p{^Alnum}/, '') } # …or make an array of new strings
 #=> [ "hello", "42cats", "yöwza" ]

If the latter (you want to select the subset of strings that meet your criterion of containing only alphanumeric characters), you can use one of these:

 # Select only those strings that contain ONLY alphanumerics
 bar = foo.select { |s| s =~ /\A\p{Alnum}+\z/ }
 #=> [ "hello", "yöwza" ]

 # Shorthand method for the same thing
 bar = foo.grep(/\A\p{Alnum}+\z/)
 #=> [ "hello", "yöwza" ]

In Ruby, regular expressions of the form /\A………\z/ require matching the entire string, since \A anchors the regular expression to the beginning of the string and \z anchors it to the end.
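
A quick illustration of why the string anchors matter: in Ruby, `^` and `$` match at line boundaries, so a multi-line string can pass a line-anchored test even when it contains other characters. The sample string and variable names are assumptions for the example:

```ruby
s = "hello\n!!!"

line_anchored   = s.match?(/^\p{Alnum}+$/)    # => true  (the first line alone satisfies ^...$)
string_anchored = s.match?(/\A\p{Alnum}+\z/)  # => false (the whole string must be alphanumeric)
```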

+1

Source: https://habr.com/ru/post/908240/

