Faculty of Language, Literature and Humanities - Corpus Linguistics and Morphology

Metadata

BeMaTaC – A deeply annotated multimodal map-task corpus of spoken learner and native German

Each speaker is listed only once. In dialogue pairs, the first dialogue's instructor serves as the second dialogue's instructee and vice versa.

All variables and their possible values are described in detail at the bottom of this page. The last line indicates totals (dialogues and speakers), ratios (female:male and 1st:2nd in sequence), the number of occurrences (acquaintance, smoker, braces, piercing, language disorder), and averages (all other values).
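
As a rough illustration of how such a summary line can be derived, the following sketch recomputes three figures for the L2 subcorpus from the speaker values listed in its table. The record layout and the function name `summarize` are assumptions made for this example and are not part of the corpus tooling. Rounding averages to the nearest integer is likewise an assumption.

```python
from statistics import mean

# Hypothetical records: (sex, age, smoker) for the six L2 speakers,
# copied from the L2 subcorpus table.
speakers = [
    ("m", 23, False),
    ("m", 31, False),
    ("f", 27, False),
    ("f", 35, False),
    ("f", 30, False),
    ("f", 51, False),
]

def summarize(rows):
    """Return the female:male ratio, the smoker count, and the rounded mean age."""
    females = sum(1 for sex, _, _ in rows if sex == "f")
    males = sum(1 for sex, _, _ in rows if sex == "m")
    smokers = sum(1 for _, _, smoker in rows if smoker)
    avg_age = round(mean(age for _, age, _ in rows))
    return f"{females}:{males}", smokers, avg_age

print(summarize(speakers))  # ('4:2', 0, 33)
```

The result agrees with the corresponding entries of the L2 summary line (4:2, 0 smokers, average age 33).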


L1 subcorpus


| document | acquainted | sequence | role | name | sex | age | height | weight | smoker | braces | piercing | language_disorder | education | primary_school | l1 | l2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2011-12-14-A | false | 1 | instructor | Benni | m | 22 | 178 | 85 | true | false | false | false | B.A. Sonderpädagogik | Sittensen | deu | eng, pol |
| 2011-12-14-B | | 2 | instructor | Paul | m | 26 | 171 | 75 | true | false | false | false | B.A. Politikwissenschaft | Hamburg | deu | eng, spa, fra |
| 2012-01-19-A | false | 1 | instructor | Mirko | m | 33 | 184 | 85 | true | false | false | false | Diplom-Designer | Frankenberg/Sa. | deu | |
| | | | instructee | Tim | m | 28 | 186 | 80 | false | false | lower lip | false | B.A. Frühpädagogik | Stuttgart | deu | eng, pol |
| 2012-10-30-B | false | 2 | instructor | Maja | f | 21 | 164 | 68 | false | false | false | false | Hochschulreife | Berlin | deu | eng, fra |
| | | | instructee | Tom | m | 25 | 180 | 100 | false | false | false | false | B.A. Mathematik | Basdorf | deu | eng, fra |
| 2012-10-31-A | false | 1 | instructor | Rosenrot | f | 21 | 170 | 70 | false | false | false | slight stutter | Hochschulreife | Berlin | deu | lat, eng, fra |
| 2012-10-31-B | | 2 | instructor | Schneeweißchen | f | 50 | 178 | 72 | false | false | false | false | Medizin | Berlin | deu | eng, fra |
| 2012-10-31-C | true | 1 | instructor | Mark | m | 20 | 178 | 60 | true | false | false | false | Hochschulreife | Rostock | deu | eng, fra |
| 2012-10-31-D | | 2 | instructor | Mia | f | 20 | 174 | 60 | false | false | false | false | Mittlere Reife | Nürnberg | deu | eng |
| 2012-11-01-A | false | 1 | instructor | Pinguin | f | 21 | 162 | 60 | true | false | false | false | Hochschulreife | Helmstedt | deu | eng, fra |
| | | | instructee | Angela | f | 35 | 164 | 55 | true | false | true | sibilant S | Hochschulreife | Heilbronn | deu | ita, eng, spa |
| 2012-11-02-B | true | 2 | instructor | Karl | m | 22 | 175 | 70 | false | false | false | false | Hochschulreife | Gießen | deu | eng |
| | | | instructee | Mary-Jane | f | 23 | 165 | 57 | true | false | false | false | Hochschulreife | Berlin | deu | eng, ara, fra |
| 2012-11-08-A | false | 1 | instructor | Sandra | f | 25 | 175 | 65 | false | false | false | false | B.A. | Saarbrücken | deu | fra, eng, spa |
| 2012-11-08-B | | 2 | instructor | Anna | f | 24 | 175 | 49 | false | false | false | false | B.A. Fine Arts | Rheinland-Pfalz | deu | eng |
| 12 dialogues | 2 | 6:6 | 16 speakers | | 9:7 | 26 | 174 | 69 | 7 | 0 | 2 | 2 | | | | |

L2 subcorpus


| document | acquainted | sequence | role | name | sex | age | height | weight | smoker | braces | piercing | language_disorder | education | primary_school | target_language_school | target_language_duration | target_country_duration | main_language | target_language_c_test | l1 | l2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2013-04-18-C | false | 1 | instructor | Torsten | m | 23 | 181 | 73 | false | false | false | false | Bachelor Rechtswissenschaft | Louth, Lincolnshire, United Kingdom | university, Reading, Berkshire, United Kingdom | | 1;6 | deu | | eng | deu, fra, spa |
| 2013-04-18-D | | 2 | instructor | Rudi | m | 31 | 181 | 74 | false | false | false | false | M.Sc. Statistik | Sydney, New South Wales, Australia | language school, Bochum, Nordrhein-Westfalen, Deutschland | 10;0 | 7;1 | deu | 129 | eng | deu, nld |
| 2013-04-19-A | false | 1 | instructor | Lisa | f | 27 | 163 | 55 | false | false | false | false | B.A. German Studies | Oakland, California, United States of America | Portland, Oregon, United States of America | 9;0 | 5;0 | deu | | eng | deu, spa |
| 2013-04-19-B | | 2 | instructor | Laura | f | 35 | 173 | 65 | false | false | false | false | M.A. Ethnologie | San Luis Obispo, California, United States of America | Berlin, Berlin, Deutschland | 7;0 | 7;3 | deu, eng | 154 | eng | deu, fra, pol |
| 2013-05-02-A | true | 1 | instructor | Ginny | f | 30 | 160 | 48 | false | false | false | false | Ph.D. Sprachwissenschaft | St Andrews, Fife, United Kingdom | secondary education, St Andrews, Fife, United Kingdom | 18;6 | 0;9 | eng | 139 | eng | deu |
| | | | instructee | Hermine | f | 51 | 165 | 65 | false | false | false | false | Ph.D. Sprachwissenschaft | Los Altos, California, United States of America | language school, Berlin, Berlin, Deutschland | 4;5 | 4;5 | eng | 154 | eng | deu, spa |
| 5 dialogues | 1 | 3:2 | 6 speakers | | 4:2 | 33 | 171 | 63 | 0 | 0 | 0 | 0 | | | | 9;9 | 4;4 | | 144 | | |

Dialogue metadata


  • Dialogue ID (document)

    Description: unique dialogue identifier based on the recording date

    Values: yyyy-mm-dd-X, where X is a single letter of the Latin script

  • Acquaintance (acquainted)

    Description: indicates whether instructor and instructee knew each other before the recording

    Values: false/true

  • Recording date (recording_date)

    Description: date of the original recording

    Values: yyyy-mm-dd

  • Dialogue sequence (sequence)

    Description: position of the dialogue within a pair; each pair of speakers completes two map tasks, and after the first task both the roles and the maps are swapped

    Values: 1/2
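
The dialogue ID format described above can be checked mechanically. The following is a minimal sketch, assuming that the date part of the ID is a valid calendar date and that the trailing letter may be upper or lower case. The pattern and the function name `parse_dialogue_id` are illustrative and not part of the corpus distribution.

```python
import re
from datetime import date

# Pattern for IDs of the form yyyy-mm-dd-X (X = one Latin letter).
DIALOGUE_ID = re.compile(r"(\d{4})-(\d{2})-(\d{2})-([A-Za-z])")

def parse_dialogue_id(doc_id):
    """Split a dialogue ID into its date part and its letter suffix."""
    match = DIALOGUE_ID.fullmatch(doc_id)
    if match is None:
        raise ValueError(f"not a valid dialogue ID: {doc_id!r}")
    year, month, day, letter = match.groups()
    # date() itself raises ValueError for impossible dates such as 2012-13-40.
    return date(int(year), int(month), int(day)), letter

recorded, letter = parse_dialogue_id("2012-10-31-C")
print(recorded.isoformat(), letter)  # 2012-10-31 C
```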


General speaker metadata


  • Pseudonym (name)

    Description: unique pseudonym for the speaker

  • Sex (sex)

    Values: f/m

  • Age (age)

    Values: integer years

  • Height (height)

    Values: integer centimeters

  • Weight (weight)

    Values: integer kilograms

  • Smoker (smoker)

    Values: false/true

  • Dental braces (braces)

    Values: false/true

  • Piercing within vocal tract (piercing)

    Values: false/location of the piercing in English

  • Speech or language disorder (language_disorder)

    Values: false/description of the disorder in English

  • Highest degree of education (education)

    Values: degree and field in the original language

  • Location of primary school (primary_school)

    Values (L1 subcorpus): municipality

    Values (L2 subcorpus): municipality, state/region, country

  • Native language(s) (l1)

    Values: ISO 639-3 codes separated by commas

  • Foreign language(s) (l2)

    Values: ISO 639-3 codes separated by commas

  • Language(s) used in the dialogue (languages-used)

    Value: deu
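
When the speaker metadata is processed automatically, two value conventions from the list above need some care: language fields hold comma-separated ISO 639-3 codes, and fields such as piercing and language_disorder hold either false or a free-text description. The following sketch shows one possible way to read them. Both function names are hypothetical, and mapping false to `None` is a design choice of this example only.

```python
def parse_codes(field):
    """Split a comma-separated list of language codes; an empty field gives []."""
    return [code.strip() for code in field.split(",") if code.strip()]

def parse_flag_or_text(field):
    """Map 'false' to None and keep any other value as free text.

    Intended for fields such as piercing and language_disorder, whose value
    is either false or an English description.
    """
    value = field.strip()
    return None if value == "false" else value

print(parse_codes("eng, ara, fra"))     # ['eng', 'ara', 'fra']
print(parse_flag_or_text("lower lip"))  # lower lip
print(parse_flag_or_text("false"))      # None
```

A value that is neither false nor a description, such as a bare true, is passed through unchanged as text.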


Additional learner metadata in the L2 subcorpus


  • School where target language was mainly acquired (target_language_school)

    Values: type of school in English, municipality, state/region, country

  • Duration of target language study (target_language_duration)

    Values: integer years;integer months (e.g. 7;3 for 7 years and 3 months)

  • Duration of stay within target country (target_country_duration)

    Values: integer years;integer months (e.g. 7;3 for 7 years and 3 months)

  • Main language(s) used in everyday life (main_language)

    Values: ISO 639-3 codes separated by commas

  • Language assessment: onDaF C-Test (target_language_c_test)

    Values: 0–160
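
Durations in the years;months notation cannot be averaged directly, so a conversion to months is a natural intermediate step. The sketch below assumes that reading of the notation and rounds the mean to whole months. The helper names `to_months` and `to_duration` are hypothetical.

```python
def to_months(duration):
    """Convert a 'years;months' string such as '7;3' to total months."""
    years, months = (int(part) for part in duration.split(";"))
    return years * 12 + months

def to_duration(total_months):
    """Convert total months back to the 'years;months' notation."""
    years, months = divmod(total_months, 12)
    return f"{years};{months}"

# target_language_duration values from the L2 table (one speaker has no value).
durations = ["10;0", "9;0", "7;0", "18;6", "4;5"]
average = round(sum(to_months(d) for d in durations) / len(durations))
print(to_duration(average))  # 9;9
```

The result matches the average given for target_language_duration in the summary line of the L2 table.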



Last update: 01 April 2014