Linguistics Resources
James Madison University
There are a number of corpora for linguistic analysis and quite a bit of information about linguistics available online (not to mention all the resources physically in our library :) ). Here are some links to get you started:
- Corpora and Related Sources (Note that some of these will require that you register, and you may need to spend some time figuring out how the data are organized. In some cases, you may need to download and learn to use related software tools as well. Make sure to follow all required usage guidelines and restrictions and to give proper citations of your use of any of these databases.):
- The Breeze, in a searchable format available to the JMU community through The JMU Scholarly Commons.
- Corpus of Contemporary American English: a large database for word searches, synonyms, collocations, etc., in both written American English in several broad genres and spoken American English (mostly television transcripts).
- Corpus of Historical American English: a large database for searching for changing patterns of usage of words and expressions in American English.
- Appalachian English: a site maintained at the University of South Carolina. It includes recordings and transcripts as well as other information.
- The Freiburg Corpus of English Dialects (FRED): covers a number of major dialects in the British Isles and includes maps. Links to a manual and information appear on this page. Small voice and text samples of the full corpus can be found by following the "Projekt-FRED" link near the bottom of this page, and from there you can also link to their 'Sampler' (FRED-S), which has many more examples (audio and transcripts) from five regions available for free download. The original project page has a link to the manual for this sampler corpus as well.
- The Google Books Ngram Corpus, which includes a collection of "ngrams" that occur more than 40 times in millions of books written over five centuries (roughly 6% of all books ever published, in multiple languages). You can also graph a pattern of usage over time for a phrase in the corpus HERE. Additional search options for this corpus are also available online through BYU, the source for the American English search engines above.
- The Santa Barbara Corpus Of Spoken American English
- TALKBANK, which includes a variety of browsable or downloadable databases, including child language, conversational language, aphasia and head trauma related data, bilingual and SLA data, etc. (You may need to download the necessary software.)
- The MacArthur-Bates Communicative Development Inventories, which contain data collected from parents about infant lexical comprehension and usage and toddler lexical usage in American English and several other languages. Look under the Lexical Norms link on the bottom right for access to their Cross-Linguistic Lexical Norms (CLEX) data and also to Wordbank, their open database on children's vocabulary growth in many different languages.
- The British National Corpus, covering late 20th-century British English, written and spoken.
- WordNet, a database of English words with conceptual-semantic and lexical relations.
- There are quite a few websites with literary resources, some of which are searchable, like the .pdf files available at The Pulp Magazine Project and the online editions at Project Gutenberg. Many book readers, like Kindle, also allow you to search and bookmark keywords and phrases in e-books.
- The Linguistic Data Consortium Online. JMU does not currently have a membership in LDC, but their online resources, including searches on The Brown Corpus of Standard American English and on Switchboard (American English telephone conversation transcripts), are available online at no cost to individuals who register with them.
- The Ohio State University Computational Linguistics and Language Technology Portal, which includes other corpora links as well as links to tools for linguistic analysis, such as part-of-speech taggers, parsers, etc.
- The Conceptual Metaphor Home Page
- Research Databases:
- Linguistics and Language Behavior Abstracts (LLBA), also available by finding the LLBA link under research databases on the Carrier Library website.
- MLA International Bibliography, also available by finding the MLA link under research databases on the Carrier Library website.
- LINGUIST List
- Linguistic Society of America
- Association for Computational Linguistics
- Cognitive Science Society
- Online Phonetics Resources. Note that phonetic transcriptions and the like on various sites may not be completely consistent with the ones in your textbook and therefore may not serve as useful study materials for course exams.
- Language Log
- Chomskybot
- Simulated Conversations (ELIZA)
- Rose (a much more recent "chatterbot" example)
JMU Writing Center Online Writing Tips and Resources