Extract a string between two strings in pandas

I have a text column that looks like this:

http://start.blabla.com/landing/fb603?&mkw...

I want to extract "start.blabla.com" which is always between:

http://

and:

/landing/

namely:

start.blabla.com

I do:

df.col.str.extract('http://*?\/landing')

But that does not work. What am I doing wrong?

+4
2 answers

To answer a question you did not ask: if you want to extract several parts of a string into separate columns, use named groups like this:

df.col.str.extract('http://(?P<Site>.*?)/landing/(?P<RestUrl>.*)')

You will get something like this:

               Site        RestUrl
0  start.blabla.com  fb603?&mkw...
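The named-group approach above can be tried out with a minimal sketch like the following. The DataFrame here is made up for illustration: the first row is the sample URL from the question, and the second row is a hypothetical non-matching value added to show what happens when the pattern does not match.

```python
import pandas as pd

# Illustrative data: the question's sample URL plus a hypothetical non-matching row.
df = pd.DataFrame({"col": [
    "http://start.blabla.com/landing/fb603?&mkw...",
    "no match here",
]})

# Each named group (?P<name>...) becomes a column with that name.
parts = df["col"].str.extract(r"http://(?P<Site>.*?)/landing/(?P<RestUrl>.*)")
print(parts)
```

Rows where the pattern does not match come back as missing values in every extracted column.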

When in doubt about a regex, try it in a tester such as regex101.

+1

Your regex matches http:/, then 0 or more / characters (as few as possible), and then /landing.
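This behaviour can be checked outside pandas with Python's standard `re` module. The sketch below uses the question's pattern unchanged; the second test string is a made-up input chosen only to show what the pattern does accept.

```python
import re

url = "http://start.blabla.com/landing/fb603?&mkw..."
broken = re.compile(r"http://*?\/landing")

# After "http:/" the pattern only allows "/" characters before "/landing",
# so the host name in between can never be consumed.
on_url = broken.search(url)

# A hypothetical input with nothing but slashes in between does match.
on_slashes = broken.search("http:///landing")

print(on_url)
print(on_slashes)
```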

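You need a capturing group (extract requires one) that matches what follows http:// up to the next /, that is, 1 or more characters other than /: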

http://([^/]+)/landing
       ^^^^^^^

[^/]+ is a negated character class that matches 1 or more characters other than /.
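Applied to the column from the question, the corrected pattern might be used as in this sketch. The one-row DataFrame is an assumption built from the sample URL, and `expand=False` is an optional choice made here to get a Series back instead of a one-column DataFrame.

```python
import pandas as pd

# Illustrative one-row frame built from the URL in the question.
df = pd.DataFrame({"col": ["http://start.blabla.com/landing/fb603?&mkw..."]})

# The parentheses form the capturing group that str.extract returns.
site = df["col"].str.extract(r"http://([^/]+)/landing", expand=False)
print(site)
```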


+7

Source: https://habr.com/ru/post/1663740/

