Dataset Open Access

Nganasan Spoken Language Corpus (NSLC)

Wagner-Nagy, Beáta; Brykina, Maria; Gusev, Valentin; Szeverényi, Sándor

Corpus Citation

Brykina, Maria - Valentin Gusev - Sándor Szeverényi - Beáta Wagner-Nagy 2018: “Nganasan Spoken Language Corpus (NSLC).” Archived in Hamburger Zentrum für Sprachkorpora. Version 0.2. Publication date 2018-06-12. http://hdl.handle.net/11022/0000-0007-C6F2-8

Corpus Description

The Nganasan Spoken Language Corpus, Version 0.2, was created as part of the project Corpus based grammatical studies on Nganasan (supported by the German Research Grant; WA3153/2-1). The project's primary goal is to build a digital, machine-searchable corpus of spoken Nganasan and, based on this corpus, to prepare a reference grammar of the language.

This project fills basic gaps in existing research on Nganasan descriptive grammar by creating new and more widely accessible materials and information on this lesser-known and severely endangered Uralic language.

Version 0.2, the second version of the corpus, is a subcorpus comprising 177 communications from 57 speakers, 136 of which include an aligned audio recording. The transcripts are glossed (Toolbox/FLEx) and annotated (EXMARaLDA). All texts have been translated into Russian and English, and some also into German. The corpus also contains rich metadata on the communications and speakers.

Files (2.8 GB)
Name                  Size        Checksum
coma-overview.html    456.1 kB    md5:810fd8be480a6dba15a0adf26112f17b
mp3-zip.zip           2.7 GB      md5:15ccb2a22ede8edd3e72d9edb5533a1c
nslc.coma             825.5 kB    md5:5e78ebda432a040e1d7537d6701a982d
nslc.zip              76.6 MB     md5:c1e6d197576808a6a1c3d3470928e55f
NSLCGuidelines.pdf    3.9 MB      md5:d364b25a9dc08bdebb822877ae683b3f
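
The MD5 checksums in the file listing above can be used to confirm that a download completed intact. The following is a minimal sketch, not an official tool of the corpus: it assumes the files were saved under their listed names in a single local directory, and the helper names `md5_of` and `verify_downloads` are hypothetical.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "coma-overview.html": "810fd8be480a6dba15a0adf26112f17b",
    "mp3-zip.zip": "15ccb2a22ede8edd3e72d9edb5533a1c",
    "nslc.coma": "5e78ebda432a040e1d7537d6701a982d",
    "nslc.zip": "c1e6d197576808a6a1c3d3470928e55f",
    "NSLCGuidelines.pdf": "d364b25a9dc08bdebb822877ae683b3f",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Return {filename: status}; status is 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name  # assumed location of the download
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    for name, status in verify_downloads().items():
        print(f"{name}: {status}")
```

Reading in chunks keeps memory use low for the 2.7 GB audio archive. A "mismatch" status usually indicates an interrupted or corrupted download, in which case the file should be fetched again.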

Cite record as