Lund Corpora

The Barack Obama Corpus
The Barack Obama Corpus (BOC) consists of 6,215,948 words (tokens) drawn from nearly 3,500 texts dating from January 2009 to January 2016. The texts, all taken from the White House Archives, comprise all speeches given by Barack Obama in his official capacity as 44th President of the United States of America. The earliest speech in the BOC is President Obama’s inauguration speech and the last is his final State of the Union speech (January 2016). In total, the corpus includes 34,967 word types, which gives a type/token ratio of about 0.0056 (0.56%). The files, which keep the original titles given to them by the White House, have been tagged for genre, audience type, date and location of delivery, and principal topics. The genres include remarks, addresses, statements, press conferences, debates and question-answer sessions, while the audience types are divided into three: general public, specialized audience and press. The locations distinguish between the United States and abroad (Germany, UK, Indonesia, etc.). Topics follow a six-way distinction: political issues (e.g. fiscal household), social issues (e.g. health care), humanitarian issues, environment, representation (e.g. ceremonial duties) and campaign speeches.
How to cite this resource: Riesner, Katherina (2017). The Barack Obama Corpus [Data set].
the_barack_obama_corpus_information.txt
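As a quick check on the figures reported for the BOC, the type/token ratio can be recomputed from the two counts given in the description. The following is a minimal sketch, not part of the corpus distribution; the function name `type_token_ratio` and the constants are illustrative assumptions. With the reported counts, the ratio comes out at roughly 0.0056, which is 0.56 when expressed as a percentage.

```python
def type_token_ratio(types: int, tokens: int) -> float:
    """Return distinct word forms (types) divided by running words (tokens)."""
    if tokens <= 0:
        raise ValueError("tokens must be a positive integer")
    return types / tokens


# Counts as reported in the corpus description (hypothetical constant names).
BOC_TYPES = 34_967
BOC_TOKENS = 6_215_948

boc_ttr = type_token_ratio(BOC_TYPES, BOC_TOKENS)

# Show the ratio both as a proportion and as a percentage.
print(f"TTR = {boc_ttr:.4f} ({boc_ttr:.2%})")  # prints "TTR = 0.0056 (0.56%)"
```

Because the type/token ratio falls as a corpus grows, a value this low is expected for a corpus of several million tokens and should not be compared directly with ratios from much smaller samples.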
Eline Visser
Corpora of Papuan and Austronesian languages in eastern Indonesia: Geser-Gorom (ISO639-3:ges), Kalamang (ISO639-3:kgv), Uruangnirin (ISO639-3:urn) and Yamdena (ISO639-3:jmd).
LACOLA: Language, Cognition, and Landscape
References to Environs are Coordinated to be Heard and Seen. The REaCHeS project examines indicating strategies in speakers of Eastern Chatino (EC), a lesser-studied and typologically unusual language used in Oaxaca, Mexico.
The data were recorded on a computer with other students present.
This is a research corpus as well as a public corpus of Swedish dialects. It was recorded 1998-2001 by the Swedia 2000 project and described in IMDI within the ECHO project.
Tactile Reading
Participants read a short text from Pippi Longstocking and feel a tactile image of a face. Their finger movements are recorded by an automated finger tracker developed by Björn Breidegård.