In addition to the command line option already mentioned, you can install NLTK data programmatically from a Python script by passing an argument to the download() function.
See the help(nltk.download) text, in particular:
Individual packages can be downloaded by calling the ``download()`` function with a single argument, giving the package identifier for the package that should be downloaded:

    >>> download('treebank')
I can confirm that this works for downloading one package at a time, and also when passing a list or tuple of package identifiers.
    >>> import nltk
    >>> nltk.download('wordnet')
    [nltk_data] Downloading package 'wordnet' to
    [nltk_data]     C:\Users\_my-username_\AppData\Roaming\nltk_data...
    [nltk_data]   Unzipping corpora\wordnet.zip.
    True
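The session above downloads a single package. For the list or tuple case, a minimal sketch might look like the following. The helper name `download_packages` and its `download` parameter are hypothetical, added only for illustration; the one NLTK call used is `nltk.download()`, and the sketch assumes it accepts a list as described in this answer.

```python
def download_packages(package_ids, download=None):
    """Download several NLTK packages with a single download() call.

    `download_packages` and the `download` parameter are hypothetical names
    for this sketch; by default the real nltk.download is used.
    """
    if download is None:
        import nltk  # imported lazily; requires NLTK to be installed
        download = nltk.download
    # Assumption taken from the answer: download() accepts a list or tuple
    # of package identifiers, not just a single identifier.
    return download(list(package_ids))
```

Calling `download_packages(('wordnet', 'treebank'))` would then fetch both packages in one go, given network access.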
You can also try to download a package that has already been downloaded without any problems:
    >>> nltk.download('wordnet')
    [nltk_data] Downloading package 'wordnet' to
    [nltk_data]     C:\Users\_my-username_\AppData\Roaming\nltk_data...
    [nltk_data]   Package wordnet is already up-to-date!
    True
The function also appears to return a boolean value that you can use to check whether the download succeeded:
    >>> nltk.download('not-a-real-name')
    [nltk_data] Error loading not-a-real-name: Package 'not-a-real-name'
    [nltk_data]     not found in index
    False
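Building on the boolean return value shown above, here is a rough sketch of how a script might check that the data it needs is available before continuing. The helper name `find_failed_downloads` and its `download` parameter are hypothetical; it relies only on the True/False behaviour seen in the sessions above.

```python
def find_failed_downloads(package_ids, download=None):
    """Return the package identifiers whose download did not succeed.

    `find_failed_downloads` and the `download` parameter are hypothetical
    names for this sketch; by default the real nltk.download is used.
    """
    if download is None:
        import nltk  # imported lazily; requires NLTK to be installed
        download = nltk.download
    # Each call returns True on success (including "already up-to-date")
    # and False when the package cannot be loaded, as in the sessions above.
    return [pid for pid in package_ids if not download(pid)]
```

A script could call `find_failed_downloads(['wordnet', 'treebank'])` at startup and exit with an error message if the returned list is not empty.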
Wesley Baugh, Feb 15 '13 at 0:37