Predicting grammatical gender in Nakh languages: Three methods compared - LLACAN - Langage, Langues et Cultures d’Afrique Noire (UMR 8135)
Journal article. Linguistic Typology at the Crossroads. Year: 2022

Predicting grammatical gender in Nakh languages: Three methods compared

Abstract

The Nakh languages Chechen and Tsova-Tush each have a five-valued gender system: masculine, feminine, and three "neuter" genders named after their singular agreement forms: B, D and J. Gender assignment is generally analysed as depending on both form and semantics (e.g. Corbett 1991), with semantics typically prevailing over form (e.g. Bellamy & Wichers Schreur 2022, Allassonnière-Tang et al. 2021). Most previous studies have considered only binary or tripartite gender systems with at most one masculine, one feminine, and one neuter value. The five-valued system of Nakh thus offers a more complex and more informative case study for analysing gender assignment. In this paper we build on the existing qualitative linguistic analyses of gender assignment in Tsova-Tush (Wichers Schreur 2021) and apply three machine-learning methods to investigate the relative weight of form and semantics in predicting grammatical gender in Chechen and Tsova-Tush. Our main aim is to show how three different computational classifiers perform on a novel set of non-Indo-European data. The results show that while both form and semantics help to predict grammatical gender in Nakh, semantics is dominant. This supports findings from the existing literature (Allassonnière-Tang et al. 2021) and confirms the utility of these computational methods. However, the results also show that the coded semantic information could be made more fine-grained to improve the accuracy of the predictions (see also Plaster et al. 2013). In addition, we discuss the implications of the output for our understanding of language-internal and family-internal processes of language change, including how loanwords are integrated from Russian, a three-gender language.

Domains

Linguistics
Main file
14545-Article Text-62051-3-10-20221222.pdf (778.16 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-03911228, version 1 (22-12-2022)

Cite

Jesse Wichers Schreur, Marc Allassonnière-Tang, Kate Bellamy, Neige Rochant. Predicting grammatical gender in Nakh languages: Three methods compared. Linguistic Typology at the Crossroads, 2022, 2 (2), pp.93-126. ⟨10.6092/issn.2785-0943/14545⟩. ⟨hal-03911228⟩