Natural Language Processing Resources

From AGIRI.org


Natural language processing is currently understood to require several components. These include dictionaries (or NLP databases or networks), parsers, and ontologies. The last of these, ontologies, provide a semantic foundation for disambiguating ambiguous sentences. The distinction between a "dictionary" used in parsing and an ontology can be blurry.
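As a rough illustration of how these components can fit together, the sketch below uses a toy dictionary, a trivial tokenizer standing in for a parser, and a tiny hand-written ontology to choose a sense for an ambiguous word. Everything in it (the names LEXICON, ONTOLOGY, tokenize, and disambiguate, and all of the data) is hypothetical and invented for this example. It is not how any of the resources listed on this page work internally.

```python
import re

# Toy dictionary: each word maps to its candidate senses (hypothetical data).
LEXICON = {
    "bank": ["bank/finance", "bank/river"],
    "money": ["money"],
    "deposit": ["deposit"],
    "water": ["water"],
    "fishing": ["fishing"],
}

# Toy ontology: each sense maps to the broader concepts it belongs to.
ONTOLOGY = {
    "bank/finance": {"institution", "finance"},
    "bank/river": {"landform", "nature"},
    "money": {"finance"},
    "deposit": {"finance"},
    "water": {"nature"},
    "fishing": {"nature", "activity"},
}


def tokenize(sentence):
    """Stand-in for a real parser: lowercase the text and split it into words."""
    return re.findall(r"[a-z]+", sentence.lower())


def disambiguate(word, sentence):
    """Pick the sense of `word` whose ontology concepts overlap most
    with the concepts of the other known words in the sentence."""
    context = set()
    for token in tokenize(sentence):
        if token == word:
            continue
        for sense in LEXICON.get(token, []):
            context |= ONTOLOGY.get(sense, set())
    senses = LEXICON.get(word, [])
    if not senses:
        return None
    # max() keeps the first sense on ties, so listing order acts as the default.
    return max(senses, key=lambda s: len(ONTOLOGY.get(s, set()) & context))


if __name__ == "__main__":
    print(disambiguate("bank", "She went to the bank to deposit money"))
    print(disambiguate("bank", "They went fishing on the bank near the water"))
```

The blurry boundary mentioned above shows up even here: LEXICON already encodes sense distinctions, which is arguably semantic information, while ONTOLOGY is little more than a second lookup table.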


NLP databases

Some important NLP databases are:

  • WordNet
  • FrameNet
  • Preposition WordNet, constructed at Novamente LLC; to use it, contact ben@goertzel.org

Open source parsers

  • NLTK, the Natural Language Toolkit. Documentation includes a book and multiple articles. Features integration with WordNet. Written in Python.
  • Link Grammar Parser, from Carnegie Mellon. A parser for the English language, based on "link grammar", a novel theory of natural language syntax. Written in C, with a BSD-like license that is compatible with the GPL.
  • GATE, General Architecture for Text Engineering. Mature, large, and well documented, with many features and components, including dialog processing and natural language generation. Written in Java, LGPL license.

Commercial and closed source parsers

Commercial or closed-source research parsers include:

  • Cyc NL subsystem, from Cycorp.
  • MiniPar, a parser created by Dekang Lin. Closed-source, free for non-commercial use.

Mind Ontology Links
