Resources
-
Part-of-speech taggers
- Adwait Ratnaparkhi's MXPOST (adwaittools), MXPOST extras for English (mxpost-extras)
- Libin Shen's bidirectional POS tagger (bpos)
- Stanford POS tagger (stanford-tagger)
-
Parsers
- Dan Bikel's implementation of Mike Collins' parsing model (dbparser)
- Libin Shen's incremental LTAG-Spinal parser (spinc)
- Libin Shen's bidirectional LTAG-Spinal parser (binc)
- Ryan McDonald's MSTParser (mstparser)
- Stanford Parser (stanford-parser)
- Dan Bikel's Switchboard parallelization framework (sb)
- XTAG Tools (xtag)
-
Information Extraction
- BioTagger (Biotagger)
- Stanford Named Entity Recognizer (stanford-ner)
-
Tree Search and Manipulation
- Tregex and TSurgeon (stanford-tregex)
-
Segmentation
- Stanford Chinese Word Segmenter (stanford-chinese-segmenter)
-
Machine Learning
- Structlearn (structlearn)
- MALLET (mallet)
-
Discourse
- Discourse Connectives Tagger (addDiscourse)
-
Corpora
- Penn Treebank (wsj, brown)
- LTAG-Spinal Treebank (../../tools/move_to_corpora/ltagtb)
- Penn Discourse Treebank
- Penn Arabic Treebank
- Penn Chinese Treebank (ctb)
- Prague Dependency Treebank (Czech)
- Prague Czech-English Treebank
- Prague Arabic Dependency Treebank
- CHILDES
- New York Times Annotated Corpus
- LTAG-Spinal Java API (spinalapi.jar)
- XTAG Grammar (tools/xtag/english)
- cmudict pronunciation dictionary (cmudict.0.6)
- Image spam (spam_images)
- Multidomain Sentiment (sentiment)