Chinese Part-of-Speech Tagger for Python NLTK
Overview
This download is a pre-trained model for a Bayesian classifier, intended for use with Python NLTK. If you have no experience with NLTK, this data set is unlikely to be useful to you.
The model is a Chinese part-of-speech tagger trained on the sinica_treebank corpus, with a reported accuracy of 98.3%. It requires Python and NLTK 2.0, and is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License: http://creativecommons.org/licenses/by-nc-sa/2.5/
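The listing does not say how the model is packaged. As a minimal sketch, assuming the download is a pickled NLTK tagger object (a common way to distribute trained NLTK models), loading and using it might look like the following. The helper names `load_tagger` and `tag_words` and the file name in the usage comment are hypothetical, and a pickle written under NLTK 2.0 may not load under a different NLTK or Python version.

```python
import pickle


def load_tagger(path):
    """Load a pickled object (assumed here to be a trained tagger) from disk.

    Only unpickle files from a source you trust: pickle can execute
    arbitrary code while loading.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def tag_words(tagger, words):
    """Tag a list of words that has already been segmented.

    Assumes the object follows NLTK's tagger interface, where
    tag(tokens) returns (token, tag) pairs.
    """
    return list(tagger.tag(words))


# Usage (hypothetical file name; substitute the path of your download):
#   tagger = load_tagger("sinica_treebank_tagger.pickle")
#   print(tag_words(tagger, ["我", "喜歡", "音樂"]))
```

Because `tag()` takes a list of tokens rather than a raw string, Chinese text needs to be segmented into words before it is passed in.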
Stats
| Added by: | japerk |
|---|---|
| Created: | 9 months ago |
| Updated: | 27 days ago |
