Iterating over a dictionary in Python and stripping whitespace

I am working with the Scrapy web scraping framework and I am a bit of a noob when it comes to Python. So I was wondering how I can iterate over all the scraped items, which seem to be in a dictionary, and strip the whitespace from each one.

Here is the code I played with in my pipeline.

for info in item: info[info].lstrip() 

But this code does not work, because I can not select elements separately. So I tried to do this:

 for key, value in item.items(): value[1].lstrip() 

This second method works to some extent, but the problem is that I have no idea how to loop over all of the values.

I know this is probably such an easy solution, but I cannot find it. Any help would be greatly appreciated. :)

+4
6 answers

This is not a direct answer to the question, but I would suggest you look at Item Loaders and input/output processors. That is where you can take care of your cleaning.

An example that strips every entry would be:

 class ItemLoader(ItemLoader):
     # on Python 3, use str.strip instead of unicode.strip
     default_output_processor = MapCompose(unicode.strip)
+1

With a dictionary comprehension (available in Python >= 2.7):

 clean_d = {k: v.strip() for k, v in d.iteritems()}  # on Python 3, use d.items()
+14

What you should note is that lstrip() returns a copy of the string rather than modifying the object in place. To actually update the dictionary, you need to assign the stripped value back to its key.

For instance:

 for k, v in your_dict.iteritems():
     your_dict[k] = v.lstrip()

Note the use of .iteritems(), which returns an iterator rather than a list of key-value pairs. This makes it somewhat more efficient.

I should add that in Python 3, .items() was changed to return "views", so .iteritems() is no longer needed (and no longer exists).
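
Applied to the question's Scrapy pipeline, the assign-back idea might look roughly like the sketch below. It is written for Python 3, so it uses .items(). The class name StripWhitespacePipeline is made up for illustration, and the item is assumed to behave like a dict whose values may or may not be strings. It uses strip(); swap in lstrip() if only leading whitespace should go.

```python
class StripWhitespacePipeline:
    """Hypothetical pipeline: strips whitespace from every string field."""

    def process_item(self, item, spider):
        # Snapshot the pairs first; only values change, never the set of keys.
        for key, value in list(item.items()):
            if isinstance(value, str):
                # strip() returns a new string, so store it back under the key.
                item[key] = value.strip()
        return item


# Quick check with a plain dict standing in for a scraped item.
item = {"title": "  Hello ", "price": 10}
cleaned = StripWhitespacePipeline().process_item(item, spider=None)
print(cleaned)
```

The isinstance check is there so that numeric or missing fields do not raise an AttributeError.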

+2

Try

 for k, v in item.items():
     item[k] = v.replace(' ', '')

or as a comprehension, as proposed by monkut:

 newDic = {k: v.replace(' ', '') for k, v in item.items()}
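
One thing worth checking before choosing this approach: replace(' ', '') removes every space, not only the ones at the ends. A small comparison on made-up sample data (the keys and values here are invented for illustration) shows the difference from strip():

```python
item = {"name": "  John Smith  ", "city": " New York"}

# replace(' ', '') deletes every space character, including those inside the value.
no_spaces = {k: v.replace(" ", "") for k, v in item.items()}

# strip() only trims whitespace at the two ends.
trimmed = {k: v.strip() for k, v in item.items()}

print(no_spaces)
print(trimmed)
```

If the scraped values can contain several words, strip() is probably the safer choice.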
+1

Although @zquare had the better answer to this question, I feel I should chime in with a Pythonic approach that also accounts for dictionary values that are not strings. Mind you, it is not recursive: it only works on flat, one-level dictionaries.

 d.update({k: v.lstrip() for k, v in d.items() if isinstance(v, str) and v.startswith(' ')}) 

This updates the original dictionary value if that value is a string and starts with a space.

UPDATE: If you want to use a regular expression and avoid calling startswith and endswith, you can use this:

 import re
 rex = re.compile(r'^\s|\s$')
 d.update({k: v.strip() for k, v in d.items() if isinstance(v, str) and rex.search(v)})

This version strips the value if it has a leading or trailing whitespace character.
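
To see which values the filter actually touches, here is the same pattern run on a made-up dictionary (the sample keys and values are assumptions for illustration only):

```python
import re

# Matches a whitespace character at the very start or at the end of the value.
rex = re.compile(r"^\s|\s$")

d = {"a": " padded ", "b": "clean", "c": 42, "d": "tab\t"}

# The comprehension is built completely before update() runs,
# so d is not modified while it is being iterated.
d.update({k: v.strip() for k, v in d.items()
          if isinstance(v, str) and rex.search(v)})
print(d)
```

Only "a" and "d" are rewritten; the already-clean string and the integer are left alone.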

0

I am using the following. You can pass any object as an argument, including a string, list, or dictionary.

 # strip any type of object
 def strip_all(x):
     if isinstance(x, str):  # on Python 2, replace str with basestring to include unicode
         x = x.strip()
     elif isinstance(x, list):
         x = [strip_all(v) for v in x]
     elif isinstance(x, dict):
         for k, v in list(x.items()):  # iterate over a copy, since the dict is modified below
             x.pop(k)  # also strip keys
             x[strip_all(k)] = strip_all(v)
     return x
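
As an alternative sketch (not the answer's original code), the same recursion can build new objects instead of modifying the dictionary in place, which sidesteps changing a dict while looping over it. This variant assumes Python 3 and reuses the name strip_all only for continuity; the sample data is invented.

```python
def strip_all(x):
    """Return a copy of x with whitespace stripped from every string inside it."""
    if isinstance(x, str):
        return x.strip()
    if isinstance(x, list):
        return [strip_all(v) for v in x]
    if isinstance(x, dict):
        # Keys are stripped too; a new dict avoids mutating x while looping over it.
        return {strip_all(k): strip_all(v) for k, v in x.items()}
    return x  # numbers, None, etc. pass through unchanged


data = {" name ": [" a", "b "], "meta": {"n": 1, "note": " hi "}}
result = strip_all(data)
print(result)
```

The trade-off is that the caller has to use the returned value, since the input is left untouched.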
0

Source: https://habr.com/ru/post/1391605/

