Fast Treebank Part-of-Speech Tagger for Python NLTK
Overview
This download is a pre-trained model for a Bayesian classifier. If you do not have experience with Python NLTK, this data set is unlikely to be useful to you.
The model is a part-of-speech tagger trained on the treebank corpus, with a stated accuracy of 99.3%. It is many times faster than the default NLTK tagger and a fraction of its size, which means less loading time and lower memory requirements. It requires Python and NLTK 2.0, and is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License: http://creativecommons.org/licenses/by-nc-sa/3.0/.
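As a rough illustration of how a pre-trained tagger like this is typically used, the sketch below assumes the download is a pickled NLTK tagger object. That is an assumption, not something this page states. The file name `treebank_tagger.pickle` and the helper names `load_tagger` and `tag_sentence` are hypothetical; the only NLTK behavior relied on is that tagger objects expose a `tag(tokens)` method returning (token, tag) pairs.

```python
import pickle


def load_tagger(path):
    """Unpickle a tagger object from disk (only open files you trust)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def tag_sentence(tagger, sentence):
    """Split on whitespace and return the tagger's (token, tag) pairs."""
    tokens = sentence.split()
    return tagger.tag(tokens)


# Hypothetical usage; "treebank_tagger.pickle" is an assumed file name,
# and NLTK must be installed for the unpickling to succeed:
# tagger = load_tagger("treebank_tagger.pickle")
# tag_sentence(tagger, "The quick brown fox jumps")
```

Whitespace splitting is the simplest possible tokenization; a real pipeline would probably use a proper tokenizer so that punctuation is separated from words before tagging.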
Stats
| Added by: | japerk |
|---|---|
| Created: | 9 months ago |
| Updated: | 20 days ago |
