Python: split a string on quotes

Question

I'm a Python student. If I have lines of text in a file that look like this:

"Y:\DATA\00001\SERVER\DATA.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"

Is it possible to split the lines on the quotation marks? The only constant is their position relative to the data itself. The data can vary from 10 to 100+ characters (they will be nested network folders). I can't see any other marker to split on, but my lack of Python knowledge is making this difficult. I tried

optfile=line.split("")

and other variations, but I keep getting ValueError: empty separator. I understand why it says that; I just don't know how to change it. Any help is, as always, greatly appreciated.

Thank you very much

+6

python python-2.7

user2377057 May 17 '13 at 7:04

9 answers

You need to escape the quote character:

 input.split("\"")

results in

 ['\n', 'Y:\\DATA\x0001\\SERVER\\DATA.TXT', ' ', 'V:\\DATA2\x0002\\SERVER2\\DATA2.TXT', '\n']

To remove the empty and whitespace-only strings:

 [line for line in [line.strip() for line in input.split("\"")] if line]

results in

 ['Y:\\DATA\x0001\\SERVER\\DATA.TXT', 'V:\\DATA2\x0002\\SERVER2\\DATA2.TXT']
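
The same approach can be sketched on a line as it would arrive from a file. This is only an illustration, not part of the original answer: the raw string stands in for file content (an assumption), so the backslashes stay literal and no \x00 shows up, and the function name split_quoted is hypothetical.

```python
def split_quoted(line):
    """Split on double quotes and drop pieces that are empty or whitespace."""
    pieces = [piece.strip() for piece in line.split('"')]
    return [piece for piece in pieces if piece]

# Raw string plus a newline: stands in for one line read from a file.
line = r'"Y:\DATA\00001\SERVER\DATA.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"' + "\n"
paths = split_quoted(line)
print(paths)
```

Note that this also drops quoted items that contain only whitespace, which may or may not be what you want.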

+8

user1907906 May 17, '13 at 7:09

No regex, no split; just use csv.reader:

 import csv
 sample_line = '10.0.0.1 foo "24/Sep/2015:01:08:16 +0800" www.google.com "GET /" -'
 def main():
     # csv.reader takes an iterable of lines; quoted fields may contain spaces
     for l in csv.reader([sample_line], delimiter=' ', quotechar='"'):
         print l
 main()

Output:

 ['10.0.0.1', 'foo', '24/Sep/2015:01:08:16 +0800', 'www.google.com', 'GET /', '-']
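
Applied to the line from the question, the same call might look like the sketch below. The raw string is an assumption standing in for a line read from the file, and the variable names are illustrative only.

```python
import csv

# Raw string: stands in for one line read from the file.
line = r'"Y:\DATA\00001\SERVER\DATA.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"'

# csv.reader accepts any iterable of lines, so a one-element list works.
row = next(csv.reader([line], delimiter=' ', quotechar='"'))
print(row)
```

One thing to keep in mind with this approach: every single space outside quotes counts as a delimiter, so a run of several spaces between items yields empty strings in the result.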

+4

Mckelvin Sep 25 '15 at 4:02

I'll just add that if you are dealing with lines that look like they could be command-line parameters, you could use the shlex module:

 import shlex
 with open('somefile') as fin:
     for line in fin:
         print shlex.split(line)

Would give:

 ['Y:\\DATA\\00001\\SERVER\\DATA.TXT', 'V:\\DATA2\\00002\\SERVER2\\DATA2.TXT']
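
A caveat worth sketching: in its default POSIX mode, shlex treats a backslash inside double quotes as an escape when it precedes another backslash or a quote, so a UNC-style path loses one of its leading backslashes. The path below is a made-up example, and passing posix=False and then stripping the quotes is just one possible workaround, not something the answer above proposes.

```python
import shlex

# Hypothetical line with a UNC-style path (raw string, as if read from a file).
unc_line = r'"\\SERVER\SHARE\DATA.TXT" "V:\DATA2\DATA2.TXT"'

# Default POSIX mode: "\\" inside double quotes collapses to one backslash.
posix_tokens = shlex.split(unc_line)

# Non-POSIX mode leaves backslashes untouched but keeps the quotes on each
# token, so they are stripped afterwards.
literal_tokens = [token.strip('"') for token in shlex.split(unc_line, posix=False)]

print(posix_tokens)
print(literal_tokens)
```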

+3

Jon clements Sep 09 '13 at 22:26

shlex can help you.

 import shlex
 my_string = '"Y:\DATA\00001\SERVER\DATA.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"'
 shlex.split(my_string)

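This returns the following (the \x00 appears because the literal is not a raw string, so Python reads \000 as an octal escape):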

 ['Y:\\DATA\x0001\\SERVER\\DATA.TXT', 'V:\\DATA2\x0002\\SERVER2\\DATA2.TXT']

Link: https://docs.python.org/2/library/shlex.html
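
The effect of the string literal itself can be seen in a small sketch. The shortened strings below are illustrative, not taken from the answer; text read from a file behaves like the raw variant, because escape sequences are only interpreted in source-code literals.

```python
import shlex

plain = '"DATA\00001"'   # "\000" is read as an octal escape for the NUL character
raw = r'"DATA\00001"'    # raw literal: every backslash is kept as typed

plain_tokens = shlex.split(plain)
raw_tokens = shlex.split(raw)

print(repr(plain_tokens[0]))
print(repr(raw_tokens[0]))
```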

+1

Kashif siddiqui Jun 18 '16 at 10:18

I think you want to extract the file paths, which are separated by spaces. That is, you want to split the string into the items enclosed in quotes. I.e., with a line

 "FILE PATH" "FILE PATH 2"

you want

 ["FILE PATH","FILE PATH 2"]

In this case:

 import re
 with open('file.txt') as f:
     for line in f:
         print(re.split(r'(?<=")\s(?=")',line))

With file.txt containing:

 "Y:\DATA\00001\SERVER\DATA MINER.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"

Outputs:

 >>> ['"Y:\\DATA\\00001\\SERVER\\DATA MINER.TXT"', '"V:\\DATA2\\00002\\SERVER2\\DATA2.TXT"']
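
If the surrounding quotes and the trailing newline are not wanted, the split can be followed by a strip. The sketch below assumes the same sample line (given here as a raw string in place of the file) and uses illustrative variable names.

```python
import re

# Raw string plus newline: stands in for one line read from file.txt.
line = r'"Y:\DATA\00001\SERVER\DATA MINER.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"' + "\n"

# Split only on whitespace that sits between a closing and an opening quote,
# so the space inside "DATA MINER.TXT" is left alone.
quoted = re.split(r'(?<=")\s(?=")', line.strip())

# Remove the surrounding quotes from each piece.
paths = [item.strip('"') for item in quoted]
print(paths)
```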

0

Hennyh May 17 '13 at 7:11

This was my solution. It parses most normal input the same way as if it had been passed directly on the command line.

 import re
 def simpleParse(input_):
     def reduce_(quotes):
         return '' if quotes.group(0) == '"' else '"'
     rex = r'("[^"]*"(?:\s|$)|[^\s]+)'
     return [re.sub(r'"{1,2}',reduce_,z.strip()) for z in re.findall(rex,input_)]

Use case: collecting a bunch of single-shot scripts into a launcher utility without having to redo the command input.

Edit: I got obsessive about the awkward way the command line handles badly formed quotes and wrote the version below:

 import re
 # `trial` holds the input string to tokenize (it is not defined in this snippet)
 tokens = list()
 reading = False
 qc = 0
 lq = 0
 begin = 0
 for z in range(len(trial)):
     char = trial[z]
     if re.match(r'[^\s]', char):
         if not reading:
             reading = True
             begin = z
             if re.match(r'"', char):
                 begin = z
                 qc = 1
             else:
                 begin = z - 1
                 qc = 0
             lc = begin
         else:
             if re.match(r'"', char):
                 qc = qc + 1
                 lq = z
     elif reading and qc % 2 == 0:
         reading = False
         if lq == z - 1:
             tokens.append(trial[begin + 1: z - 1])
         else:
             tokens.append(trial[begin + 1: z])
 if reading:
     tokens.append(trial[begin + 1: len(trial) ])
 tokens = [re.sub(r'"{1,2}',lambda y:'' if y.group(0) == '"' else '"', z) for z in tokens]

0

Redsplinter Sep 09 '13 at 17:59

I know this was answered a million years ago, but this also works:

 input = '"Y:\DATA\00001\SERVER\DATA.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"'
 input = input.replace('" "','"').split('"')[1:-1]

This should give a list containing:

 ['Y:\\DATA\x0001\\SERVER\\DATA.TXT', 'V:\\DATA2\x0002\\SERVER2\\DATA2.TXT']

0

D'arcy Sep 15 '16 at 10:18

My question "Python - Error caused by space in argv argument" has been flagged as a duplicate of this one. We have a number of Python books going back to Python 2.3. The oldest mentioned using a list for argv but gave no example, so I changed things to:

 repoCmd = ['Purchaser.py', 'task', repoTask, LastDataPath]
 SWCore.main(repoCmd)

and in SWCore:

 sys.argv = args

The shlex module works, but I prefer this.
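
The idea of handing over a ready-made argument list can be sketched as follows. Everything here is illustrative: the main function is a hypothetical stand-in for SWCore.main, and the task name and path are made-up values. Because each argument is already its own list element, a path containing spaces needs no quoting or splitting.

```python
import sys

def main(args):
    """Hypothetical stand-in for SWCore.main: install args as sys.argv."""
    sys.argv = list(args)
    # Everything after the script name is a parameter.
    return sys.argv[1:]

saved = sys.argv  # keep the real argv so it can be restored afterwards
try:
    params = main(['Purchaser.py', 'task', 'sync', r'C:\My Data\repo'])
finally:
    sys.argv = saved

print(params)
```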

-1

Oldsteve Sep 20 '17 at 13:59

Thomas jung · Accepted Answer · 2013-05-17T07:09:33+0000

Finding all matches of a regular expression will do it:

 import re
 input = r'"Y:\DATA\00001\SERVER\DATA.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"'
 re.findall('".+?"', input)  # or use the pattern '"[^"]+"'

This returns a list of the file names, with the quotes still attached:

 ['"Y:\\DATA\\00001\\SERVER\\DATA.TXT"', '"V:\\DATA2\\00002\\SERVER2\\DATA2.TXT"']

To get the file name without quotes, use:

 [f[1:-1] for f in re.findall('".+?"', input)]

or use re.finditer :

 [f.group(1) for f in re.finditer('"(.+?)"', input)]
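
A slightly shorter variant, offered as a sketch rather than as part of the accepted answer: when the pattern contains exactly one capturing group, re.findall returns the group contents directly, so neither slicing nor finditer is needed. The variable names are illustrative.

```python
import re

# Raw string: stands in for one line read from the file.
line = r'"Y:\DATA\00001\SERVER\DATA.TXT" "V:\DATA2\00002\SERVER2\DATA2.TXT"'

# With exactly one capturing group, findall returns what the group matched,
# i.e. the text between each pair of quotes.
paths = re.findall(r'"([^"]+)"', line)
print(paths)
```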
