LABORATORY ON LINGUISTIC DATA ANALYSIS
Enrollment year
2018/2019
Academic year
2018/2019
Regulations
DM270
Academic discipline
L-LIN/01 (GLOTTOLOGY AND LINGUISTICS)
Department
DEPARTMENT OF HUMANITIES
Course
THEORETICAL AND APPLIED LINGUISTICS; LINGUISTICS AND MODERN LANGUAGES
Curriculum
COMMON CURRICULUM
Year of study
Period
(25/02/2019 - 05/06/2019)
ECTS
6
Lesson hours
36 lesson hours
Language
Italian
Activity type
ORAL TEST
Teacher
JEZEK ELISABETTA (course holder) - 6 ECTS
Prerequisites
Familiarity with basic notions of general linguistics, particularly morphology, syntax, semantics and pragmatics, as taught in three-year Bachelor's degree programmes in the Humanities.
Learning outcomes
The aim of the course is to provide students with the knowledge and skills needed to collect and examine linguistic data from a variety of perspectives, and to acquaint them with digital resources such as corpora, lexicons, concordance tools, databases, knowledge bases, datasets, and ontologies. At the end of the course, students will be able to design and carry out a linguistic analysis autonomously, using methodologies based primarily on manual or semi-automatic annotation of data, with the goal of extracting or verifying linguistic generalizations for theoretical or applied purposes.
Course contents
DATA AND MODELS FOR MULTILINGUAL RESOURCES

This year's course introduces students to the variety of data available for linguistic analysis (digital corpora, acceptability judgments, elicited data, experimental data, etc.), with a focus on multilingual resources.

With the help of selected readings, we examine how these resources are created, annotated and structured, and we use them in the lab for linguistic analysis and for applications in natural language processing.
Teaching methods
Face-to-face interactive lectures
Slides
Seminars with group presentations of the readings and discussion
Lab
Recommended or required readings
Textbook

Jezek, Elisabetta. 2016. The Lexicon: An Introduction, Oxford: Oxford University Press.

Readings

Abzianidze Lasha, Bjerva Johannes, Evang Kilian, Haagsma Hessel, van Noord Rik, Ludmann Pierre, Nguyen Duc-Duy, Bos Johan. 2017. "The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations". In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 242–247, Valencia, Spain.

Bender, Emily M. 2016. "Linguistic Typology in Natural Language Processing". Linguistic Typology 20(3):645-660.

Baisa Vít, Može Sara and Irene Renau. 2016. "Multilingual CPA: Linking Verb Patterns across Languages." In: Margalitadze, Tinatin and George Meladze (eds) Proceedings of the XVII Euralex International Congress: Lexicography and Linguistic Diversity, pp. 410-417.

Boas, Hans C. 2005. "Semantic frames as interlingual representations for multilingual lexical databases." International Journal of Lexicography 18, no. 4, pp. 445-478.

Fellbaum Christiane, and Piek Vossen. 2012. "Challenges for a multilingual wordnet." Language Resources and Evaluation 46, no. 2, pp. 313-326.

Havasi Catherine, Speer Robert, and Jason Alonso. 2007. "ConceptNet 3: a flexible, multilingual semantic network for common sense knowledge." In: Recent Advances in Natural Language Processing, pp. 27-29. Philadelphia, PA: John Benjamins.

Navigli, Roberto, and Simone Paolo Ponzetto. 2012. "BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network." Artificial Intelligence 193, pp. 217-250.

Pianta Emanuele, Bentivogli Luisa and Christian Girardi. 2002. "MultiWordNet: Developing an aligned multilingual database." In: Proceedings of the 1st International Global WordNet Conference, pp. 21–25.
Assessment methods
Final oral exam covering the material from the entire course.
Final assignment (5 pages) reporting the goal, research question, methodology and results of an in-depth corpus-based analysis of a linguistic phenomenon agreed on in advance during office hours. The text, in PDF format, must be sent to jezek@unipv.it 7 days before the exam.
Further information
Course materials, including the updated list of readings, the lecture slides, links to linguistic resources, and instructions for the final assignment, are available on the KIRO platform (access with personal username and password).