I am training a Hidden Markov Model on a set of surgical data, like this:
import nltk

nltkTrainer = nltk.tag.hmm.HiddenMarkovModelTrainer(range(15), range(90))
model = nltkTrainer.train_unsupervised(data, max_iterations=3)
In case it is useful, "model" prints as "HiddenMarkovModelTagger 15 states and 90 output symbols".
However, training takes almost an hour on my machine, so I want to serialize the trained NLTK model and save and load it between sessions. From what I have read, everyone seems to use Python's built-in pickle module, which works great for standard data types. I can even pickle my trained model variable using this code:
import pickle

# The with block closes the file even if dump() raises.
with open('my_classifier.pickle', 'wb') as f:
    pickle.dump(model, f)
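For comparison, a round trip with built-in types behaves as expected. This is a minimal sketch using only the standard library; the `params` dictionary is made up for illustration and is not part of my model:

```python
import pickle

# Hypothetical stand-in data made of built-in types only.
params = {"states": list(range(15)), "symbols": list(range(90))}

# Serialize to bytes in memory, then restore from those bytes.
blob = pickle.dumps(params)
restored = pickle.loads(blob)

# The restored object compares equal to the original.
print(restored == params)
```

Using `pickle.dumps`/`pickle.loads` keeps the sketch in memory; with a file, `pickle.load` on a handle opened in 'rb' mode plays the same role.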
But when I try to load the pickled file, I get an error message:
/usr/local/lib/python2.7/dist-packages/nltk/probability.pyc in __init__(self, probdist_dict)
   1971         """
   1972         defaultdict.__init__(self, DictionaryProbDist)
-> 1973         self.update(probdist_dict)
   1974
   1975     ##//////////////////////////////////////////////////////

TypeError: 'type' object is not iterable
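My tentative reading of the traceback, which I have not confirmed against the NLTK source: the failing class subclasses `defaultdict` but takes a table as its only `__init__` argument. When a `defaultdict` is rebuilt, the class is called with the default factory as its argument, so the factory type would land in `probdist_dict`. The sketch below tries to reproduce that without NLTK. `TableLike` is a hypothetical stand-in, and `list` is used as the factory instead of `DictionaryProbDist`:

```python
from collections import defaultdict


class TableLike(defaultdict):
    """Hypothetical stand-in that mirrors the __init__ shown in the traceback."""

    def __init__(self, table):
        # Same pattern as the traceback: fixed default factory, then update.
        defaultdict.__init__(self, list)
        self.update(table)


original = TableLike({"a": [1, 2]})

# pickle rebuilds an object from the recipe returned by __reduce__().
# For a defaultdict, that recipe starts with (class, (default_factory,)).
rebuild, args = original.__reduce__()[:2]

try:
    rebuild(*args)  # effectively TableLike(list): the type lands in `table`
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If this reading is right, dumping succeeds because nothing is constructed at that point, and the failure only shows up when the object is rebuilt on load.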
Has anyone found a way around this? Is this a problem with NLTK?