How to reinstall lxml?

I am using Python 2.7.5 on Mac OS X 10.7.5 with BeautifulSoup 4.2.1. I want to parse an XML page using the lxml library, as described in the BeautifulSoup tutorial. However, when I run my code, it shows:

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml,xml. Do you need to install a parser library? 

I am sure that I have already installed lxml, in every way I could find: easy_install, pip, port, etc. I added a line to my code to check whether lxml is installed:

    import lxml

Python executes this line without complaint, and then the same error message appears again, on the same line as before.

So lxml is evidently installed, but not installed correctly. I decided to uninstall lxml and then reinstall it using the "correct" method. But when I type

    easy_install -m lxml

it shows:

    Searching for lxml
    Best match: lxml 3.2.1
    Processing lxml-3.2.1-py2.7-macosx-10.6-intel.egg

    Using /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/lxml-3.2.1-py2.7-macosx-10.6-intel.egg

    Because this distribution was installed --multi-version, before you can
    import modules from this package in an application, you will need to
    'import pkg_resources' and then use a 'require()' call similar to one of
    these examples, in order to select the desired version:

        pkg_resources.require("lxml")  # latest installed version
        pkg_resources.require("lxml==3.2.1")  # this exact version
        pkg_resources.require("lxml>=3.2.1")  # this version or higher

    Processing dependencies for lxml
    Finished processing dependencies for lxml

So I do not know how to go on with the uninstall.

I looked through a lot of posts about this issue on Google, but still cannot find any useful information.

Here is my code:

    import sys  # needed for sys.argv below

    import mechanize
    from bs4 import BeautifulSoup
    import lxml

    class count:
        def __init__(self, protein):
            self.proteinCode = protein
            self.br = mechanize.Browser()

        def first_search(self):
            # Test 0
            soup = BeautifulSoup(self.br.open("http://www.ncbi.nlm.nih.gov/protein/21225921?report=genbank&log$=prottop&blast_rank=1&RID=YGJHMSET015"), ['lxml', 'xml'])
            return

    if __name__ == '__main__':
        proteinCode = sys.argv[1]
        gogogo = count(proteinCode)

I want to know:

  • How to remove lxml?
  • How to install lxml "correctly"? How to know that it is installed correctly?
+13
python easy-install lxml beautifulsoup
Jul 20 '13 at 21:11
4 answers

I am using BeautifulSoup 4.3.2 on OS X 10.6.8. I also had a problem with an improperly installed lxml. Here are some things I learned:

First of all, check out this related question: Removed MacPorts, now Python is broken

Now, to check which builders are installed for BeautifulSoup 4, try

    >>> import bs4
    >>> bs4.builder.builder_registry.builders

If you do not see your favorite builder, it is not installed, and you will get the error shown above ("Couldn't find a tree builder ...").
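
The registry check above can also be scripted. The following is a minimal sketch, not part of bs4 itself: `registered_builders` and `has_builder_for` are hypothetical helper names, and the code assumes that `builder_registry` exposes a `builders` list and a `lookup()` method that returns `None` when no builder matches.

```python
def registered_builders():
    """Return (class name, sorted features) pairs, or None if bs4 is missing."""
    try:
        from bs4.builder import builder_registry
    except ImportError:
        return None
    # Each registered builder is a class carrying a list of feature strings.
    return [(b.__name__, sorted(b.features)) for b in builder_registry.builders]


def has_builder_for(*features):
    """Return True if some registered builder offers all requested features."""
    try:
        from bs4.builder import builder_registry
    except ImportError:
        return False
    # lookup() is assumed to return None when nothing matches.
    return builder_registry.lookup(*features) is not None


if __name__ == "__main__":
    builders = registered_builders()
    if builders is None:
        print("bs4 is not importable from this interpreter")
    else:
        for name, features in builders:
            print("%s: %s" % (name, ", ".join(features)))
        print("lxml + xml available: %s" % has_builder_for("lxml", "xml"))
```

If the last line prints `False`, the features requested in the question (`lxml`, `xml`) have no registered builder, which is the situation the error message describes.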

Also, just because you can import lxml does not mean that everything is perfect.

Try

    >>> import lxml
    >>> import lxml.etree

To understand what is going on, go to the bs4 installation and unpack the egg (tar -xvzf). Look at the modules in bs4.builder. There you will see files such as _lxml.py and _html5lib.py, so you can also try

    >>> import bs4.builder._htmlparser
    >>> import bs4.builder._lxml
    >>> import bs4.builder._html5lib

If there is a problem, you will see why the builder module cannot be loaded. Notice how, at the end of builder/__init__.py, bs4 loads all of these modules and silently ignores any that fail to import:

    # Builders are registered in reverse order of priority, so that custom
    # builder registrations will take precedence. In general, we want lxml
    # to take precedence over html5lib, because it's faster. And we only
    # want to use HTMLParser as a last result.
    from . import _htmlparser
    register_treebuilders_from(_htmlparser)
    try:
        from . import _html5lib
        register_treebuilders_from(_html5lib)
    except ImportError:
        # They don't have html5lib installed.
        pass
    try:
        from . import _lxml
        register_treebuilders_from(_lxml)
    except ImportError:
        # They don't have lxml installed.
        pass
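
Because those ImportErrors are swallowed, it can help to import each builder module yourself and print whatever goes wrong. This is only a sketch: `import_report` is a hypothetical helper, and the module names are the ones listed in this answer.

```python
import importlib

# Builder modules named in this answer.
BUILDER_MODULES = (
    "bs4.builder._htmlparser",
    "bs4.builder._html5lib",
    "bs4.builder._lxml",
)


def import_report(module_names):
    """Map each module name to None on success, or to the error text on failure."""
    report = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            report[name] = None
        except Exception as exc:
            # ImportError is the usual case, but a broken C extension
            # can fail in other ways, so catch broadly and report.
            report[name] = "%s: %s" % (type(exc).__name__, exc)
    return report


if __name__ == "__main__":
    for name, error in import_report(BUILDER_MODULES).items():
        print("%s -> %s" % (name, error or "OK"))
```

A line other than `OK` for `bs4.builder._lxml` shows the underlying reason that lxml never reached the builder registry.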
+11
Oct 27 '13 at 1:12

If you are using Python 2.7 on Ubuntu/Debian, this worked for me:

    $ sudo apt-get build-dep python-lxml
    $ sudo pip install lxml

Test it like this:

    mona@pascal:~/computer_vision/image_retrieval$ python
    Python 2.7.6 (default, Jun 22 2015, 17:58:13)
    [GCC 4.8.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import lxml
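
A bare `import lxml` can succeed even when the compiled part of lxml is broken, so a slightly stronger test is to parse a tiny document. This is a rough sketch with hypothetical helper names (`lxml_smoke_test`, `bs4_xml_smoke_test`); it assumes `FeatureNotFound` is importable from `bs4`, as the error message in the question suggests.

```python
def lxml_smoke_test():
    """Return 'ok' if lxml.etree parses a tiny document, else a reason string."""
    try:
        from lxml import etree
    except ImportError as exc:
        return "import failed: %s" % exc
    root = etree.fromstring(b"<root><item>1</item></root>")
    return "ok" if root.findtext("item") == "1" else "unexpected parse result"


def bs4_xml_smoke_test():
    """Return 'ok' if BeautifulSoup can build a tree with the 'xml' feature."""
    try:
        from bs4 import BeautifulSoup, FeatureNotFound
    except ImportError as exc:
        return "import failed: %s" % exc
    try:
        soup = BeautifulSoup("<root><item>1</item></root>", "xml")
    except FeatureNotFound as exc:
        return "no builder: %s" % exc
    return "ok" if soup.find("item").get_text() == "1" else "unexpected parse result"


if __name__ == "__main__":
    print("lxml: %s" % lxml_smoke_test())
    print("bs4 + xml: %s" % bs4_xml_smoke_test())
```

If the first line reports `ok` and the second reports `no builder`, lxml itself works but BeautifulSoup did not register it.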
+3
Oct 11 '16 at 21:21

FWIW, I encountered a similar problem (Python 3.6, OS X 10.12.6) and was able to solve it simply by running the following (the first command just reflects that I was working in a virtual environment):

    $ source activate ml-general
    $ pip uninstall lxml
    $ pip install lxml

At first I tried more complicated things, because BeautifulSoup worked correctly with the identical command through Jupyter + IPython, but not through the PyCharm terminal in the same virtual environment. Just reinstalling lxml as described above solved the problem.
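
When the same command behaves differently in two shells, it may be worth checking which interpreter and which copy of each package is actually picked up. A minimal sketch, assuming Python 3.4 or later (for `importlib.util.find_spec`); `locate` is a hypothetical helper name.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would be loaded from, or None if absent."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Run this in both environments and compare the output.
    print("interpreter:", sys.executable)
    for name in ("lxml", "bs4"):
        print(name, "->", locate(name))
```

Differing paths between the Jupyter kernel and the PyCharm terminal would indicate that the two are not using the same installation.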

+1
Aug 14 '17 at 6:33

On Debian/Ubuntu, use apt-get:

    sudo apt-get install python3-lxml

For Mac OS X, a MacPorts package of lxml is available. Try something like

    sudo port install py27-lxml

http://lxml.de/installation.html may be helpful.

0
Jun 09 '16 at 8:15


