Corpus linguistics

Corpus linguistics is the study of language as expressed in samples (corpora) of "real world" text. The approach runs counter to Noam Chomsky's view that real language is riddled with performance-related errors and therefore has to be studied through careful analysis of small speech samples obtained in a highly controlled laboratory setting. Corpus linguistics does away with Chomsky's competence/performance split, holding that language can be analysed reliably only when the researcher does not interfere with it.

Corpus linguistics overlaps in some areas with computational linguistics, as the latter moves towards language-processing applications. Such applications have to deal with real input data, for which descriptions based on a linguist's intuition are not usually helpful.

The COBUILD dictionaries, designed for users learning English as a foreign language, are based on corpus linguistics; definitions are based on how words are used rather than on historical definitions of their meaning.
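As a rough illustration of what "how words are used" can mean in practice, the sketch below lists each occurrence of a word together with its neighbouring words (a keyword-in-context, or concordance, listing). It is a minimal sketch, not a description of how the COBUILD dictionaries were compiled: the function name `kwic`, the `width` parameter, the simple tokeniser and the three-sentence sample text are all assumptions made for this example.

```python
import re


def kwic(text, keyword, width=3):
    """Return one line per occurrence of `keyword`, with up to `width`
    tokens of context on each side. Hypothetical helper for illustration."""
    # Naive tokeniser (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Invented sample text standing in for a real corpus.
sample = (
    "She sat on the bank of the river. "
    "The bank approved the loan. "
    "He walked along the river bank at dusk."
)

hits = kwic(sample, "bank")
for line in hits:
    print(line)
```

Reading the context lines side by side shows the different senses in which the word occurs in the sample. The sketch ignores sentence boundaries, so the context of one occurrence can spill into a neighbouring sentence; a real corpus tool would handle this more carefully.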

Some links:

  • The Centre for Corpus Linguistics at Birmingham University:

All Wikipedia text is available under the terms of the GNU Free Documentation License
