11/03/2021
The first part of SIGMORPHON’s sixth installment of its inflection generation shared task will focus on typologically diverse languages. In this shared task, participants will design a model that learns to generate morphological inflections from a lemma and a set of morphosyntactic features of the target form. Each language in the task has its own training, development, and test splits. Training and development splits contain triples, each consisting of a lemma, a target form, and a set of morphological features, provided in the UniMorph format. Test splits only provide lemmas and morphological tags: your model will need to predict the missing target form.
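As a rough illustration of the data layout described above, here is a minimal sketch of a reader for the splits. It assumes tab-separated fields with semicolon-joined feature tags (lemma, form, features for training and development; lemma, features for test). That field order is an assumption, so check it against the released files. `Entry` and `parse_line` are hypothetical names, and the sample rows are illustrative English, not task data.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    lemma: str
    form: Optional[str]   # None for test rows, where the form must be predicted
    features: frozenset   # UniMorph-style tags, e.g. {"V", "PST"}


def parse_line(line: str) -> Entry:
    """Parse one row; assumes TAB-separated fields and ';'-separated tags."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 3:      # assumed train/dev layout: lemma, form, features
        lemma, form, tags = fields
    elif len(fields) == 2:    # assumed test layout: lemma, features
        lemma, tags = fields
        form = None
    else:
        raise ValueError(f"unexpected number of fields: {len(fields)}")
    return Entry(lemma, form, frozenset(tags.split(";")))


# Illustrative rows only (not taken from the released data).
train_row = parse_line("walk\twalked\tV;PST\n")
test_row = parse_line("walk\tV;PST\n")
```

A model for the task would then map `(entry.lemma, entry.features)` to a predicted form for every test row.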
The model should be general enough to work for natural languages of any typological patterning. For example, Tagalog verbs exhibit circumfixation; thus, a model with a strong inductive bias towards suffixing will likely not work well for Tagalog.
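To make the point about inductive bias concrete, here is a toy sketch of a suffix-only rule learner. It is not a task baseline; `learn_suffix_rule` and `apply_suffix_rule` are hypothetical helpers. The circumfix example uses the German participle pattern ge-...-t purely as a familiar stand-in, not Tagalog data from the task.

```python
def learn_suffix_rule(lemma: str, form: str) -> tuple:
    """Strip the longest common prefix and keep the differing tails as a rule."""
    i = 0
    while i < min(len(lemma), len(form)) and lemma[i] == form[i]:
        i += 1
    return lemma[i:], form[i:]


def apply_suffix_rule(rule: tuple, lemma: str):
    """Rewrite the end of the lemma; return None if the rule does not match."""
    old, new = rule
    if lemma.endswith(old):
        return lemma[: len(lemma) - len(old)] + new
    return None


# Suffixing pattern: the learned rule generalizes to an unseen lemma.
suffix_rule = learn_suffix_rule("walk", "walked")
suffixed = apply_suffix_rule(suffix_rule, "jump")

# Circumfixing pattern: lemma and form share no prefix, so the "rule"
# memorizes the whole word pair and fails on any other lemma.
circumfix_rule = learn_suffix_rule("sagen", "gesagt")
circumfixed = apply_suffix_rule(circumfix_rule, "machen")
```

Here `suffixed` is `"jumped"`, while `circumfixed` is `None`: once material is added at the front of the word, the suffix-only learner has nothing to generalize from.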
Through the sustained effort of 80+ people, our group was able to add about 50 new languages to the UniMorph resources, as well as curate and canonicalize existing resources. For the first phase of the task, we have released data for 35 languages; the rest will follow in about a month.
You can register for the task here:
https://forms.gle/tu4tX648F9kA9eps7
For more information, please visit the task website:
https://github.com/sigmorphon/2021Task0 -1-generalization-across-typologically-diverse-languages