Easy URL cleaning

I am trying to do basic URL cleaning so that

 www.google.com
 www.google.com/
 http://google.com
 http://google.com/
 https://google.com
 https://google.com/

are replaced by http://www.google.com (or https://www.google.com when https:// is at the beginning).

Basically, I would like to check for http/https at the beginning and / at the end in a single regex.

I tried something like this:

 "https://google.com".match(/^(http:\/\/|https:\/\/)(.*)(\/)*$/)

In this case I get:

 => #<MatchData "https://google.com" 1:"https://" 2:"google.com" 3:nil>

which is good.

But for:

 "https://google.com/".match(/^(http:\/\/|https:\/\/)(.*)(\/)*$/)

I get:

 => #<MatchData "https://google.com/" 1:"https://" 2:"google.com/" 3:nil>

and would like to have 2:"google.com" 3:"/" instead.

Any idea how to do this?

1 answer

It's obvious once you spot the mistake ;)

You tried:

 ^(http:\/\/|https:\/\/)(.*)(\/)*$ 

The answer is to use:

 ^(http:\/\/|https:\/\/)(.*?)(\/)*$ 

This makes the `.*` operator "non-greedy" (lazy), so the trailing slash is not swallowed up by the `.`.
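To see the difference, here is a minimal sketch that runs both patterns against the URL from the question. The variable names (`greedy`, `lazy`, `url`) are only for illustration and are not part of the original question.

```ruby
# Same pattern twice; the only difference is (.*) versus (.*?).
greedy = /^(http:\/\/|https:\/\/)(.*)(\/)*$/
lazy   = /^(http:\/\/|https:\/\/)(.*?)(\/)*$/

url = "https://google.com/"

# Greedy: (.*) consumes the trailing slash, so the third group stays nil.
greedy_groups = url.match(greedy).captures
p greedy_groups # => ["https://", "google.com/", nil]

# Lazy: (.*?) stops as early as it can, leaving the slash for the third group.
lazy_groups = url.match(lazy).captures
p lazy_groups # => ["https://", "google.com", "/"]
```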

EDIT:

In fact, you really should use:

 ^(http:\/\/|https:\/\/)?(www\.)?(.*?)(\/)*$ 

That way, you will also match your first two examples, which do not have "http(s)://" in them. You also capture the "www." part separately, so you can tell whether it was present. In action: http://www.rubular.com/r/VUoIUqCzzX
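As a rough illustration of what the four groups hold, the sketch below runs that pattern over a few inputs. The `pattern` variable and the sample list are assumptions made for the example.

```ruby
# Groups: 1 = scheme (optional), 2 = "www." (optional), 3 = host, 4 = trailing slash.
pattern = /^(http:\/\/|https:\/\/)?(www\.)?(.*?)(\/)*$/

samples = ["www.google.com/", "https://www.google.com", "google.com"]

# Collect the captures for each sample; a missing optional part shows up as nil.
results = samples.map { |s| s.match(pattern).captures }

results.each { |captures| p captures }
# => [nil, "www.", "google.com", "/"]
# => ["https://", "www.", "google.com", nil]
# => [nil, nil, "google.com", nil]
```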

EDIT2:

I was bored and wanted to perfect this :P

Here you go:

 ^(https?:\/\/)?(?:www\.)?(.*?)\/?$ 

Now all you have to do is rebuild your URL from the first match (or "http://" if it is nil), then "www.", then the second match.
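That replacement step could look something like the sketch below. The method name `clean_url` is hypothetical (it does not appear in the original answer), and the code assumes single-line input.

```ruby
# Hypothetical helper: rebuild a URL as scheme + "www." + host,
# using the pattern from EDIT2.
def clean_url(url)
  match = url.match(/^(https?:\/\/)?(?:www\.)?(.*?)\/?$/)
  return url unless match # defensive; the pattern matches any single-line string

  scheme = match[1] || "http://" # default to http:// when no scheme was given
  "#{scheme}www.#{match[2]}"
end

inputs = %w[
  www.google.com www.google.com/
  http://google.com http://google.com/
  https://google.com https://google.com/
]

cleaned = inputs.map { |u| clean_url(u) }
p cleaned
```

Because the pattern ends in `\/?`, only a single trailing slash is stripped; an input such as "google.com//" would keep one slash in the result.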

In action: http://www.rubular.com/r/YLeO5cXcck

(18 months later) EDIT:

Check out my awesome Ruby gem to help solve problems like this!

https://github.com/tom-lord/regexp-examples

 /(https?:\/\/)?(?:www\.)?google\.com\/?/.examples
 # => ["google.com", "google.com/", "www.google.com", "www.google.com/",
 #     "http://google.com", "http://google.com/", "http://www.google.com", "http://www.google.com/",
 #     "https://google.com", "https://google.com/", "https://www.google.com", "https://www.google.com/"]

 /(https?:\/\/)?(?:www\.)?google\.com\/?/.examples.map(&:subgroups)
 # => [[], [], [], [],
 #     ["http://"], ["http://"], ["http://"], ["http://"],
 #     ["https://"], ["https://"], ["https://"], ["https://"]]

Source: https://habr.com/ru/post/1488447/

