true

TypeError: object "NoneType" not called by Python with BeautifulSoup XML

I have the following XML file:

<user-login-permission>true</user-login-permission> <total-matched-record-number>15000</total-matched-record-number> <total-returned-record-number>15000</total-returned-record-number> <active-user-records> <active-user-record> <active-user-name>username</active-user-name> <authentication-realm>realm</authentication-realm> <user-roles>Role</user-roles> <user-sign-in-time>date</user-sign-in-time> <events>0</events> <agent-type>text</agent-type> <login-node>node</login-node> </active-user-record> 

There are many entries. I am trying to get the values ​​from tags and save them in another text file using the following code:

 soup = BeautifulSoup(open("path/to/xmlfile"), features="xml") with open('path/to/outputfile', 'a') as f: for i in range(len(soup.findall('active-user-name'))): f.write ('%s\t%s\t%s\t%s\n' % (soup.findall('active-user-name')[i].text, soup.findall('authentication-realm')[i].text, soup.findall('user-roles')[i].text, soup.findall('login-node')[i].text)) 

I get a TypeError: error object "NoneType" not called by Python with BeautifulSoup XML for the string: for I'm in the range (len (soup.findall ("active-user-name")):

Any idea what could be causing this?

Thanks!

+4
source share
3 answers

There are a number of problems that need to be solved with this: firstly, the XML file you provided is invalid XML - the root element is required.

Try something like XML:

 <root> <user-login-permission>true</user-login-permission> <total-matched-record-number>15000</total-matched-record-number> <total-returned-record-number>15000</total-returned-record-number> <active-user-records> <active-user-record> <active-user-name>username</active-user-name> <authentication-realm>realm</authentication-realm> <user-roles>Role</user-roles> <user-sign-in-time>date</user-sign-in-time> <events>0</events> <agent-type>text</agent-type> <login-node>node</login-node> </active-user-record> </active-user-records> </root> 

Now on to python. Firstly, there is no findall method, it is either findall or find_all . findall and find_all equivalent as described here

Further, I would suggest changing the code so that you do not use the find_all method quite often - using find instead will improve efficiency, especially for large XML files. In addition, the code below is easier to read and debug:

 from bs4 import BeautifulSoup xml_file = open('./path_to_file.xml', 'r') soup = BeautifulSoup(xml_file, "xml") with open('./path_to_output_f.txt', 'a') as f: for s in soup.findAll('active-user-record'): username = s.find('active-user-name').text auth = s.find('authentication-realm').text role = s.find('user-roles').text node = s.find('login-node').text f.write("{}\t{}\t{}\t{}\n".format(username, auth, role, node)) 

Hope this helps. Let me know if you need more help!

+5
source

The solution is simple: do not use the findall method - use find_all .

Why? Since the findall method findall generally absent, there are findall and find_all that are equivalent. See docs for more details.

Although, I agree, the error message is getting confused.

Hope this helps.

+1
source

The fix for my version of this problem is to force the BeautifulSoup instance to a type string. You do the following: https://groups.google.com/forum/#!topic/comp.lang.python/ymrea29fMFI

you are using the following pythonic: From the python manual

str ([object])

Returns a string containing a beautifully printed representation of the object. For strings, this returns a string. The difference with the view (object) is that str (object) does not always try to return a string acceptable to eval (); its purpose is to return the print string. If no argument is given, returns an empty string,

0
source

Source: https://habr.com/ru/post/1494658/


All Articles