ECTS credits: 5
ECTS hours: Student's work: 85; Tutorials: 5; Expository classes: 15; Interactive classes: 20; Total: 125
Languages of instruction: German, English
Type: Ordinary subject Master’s Degree RD 1393/2007 - 822/2021
Departments: External department linked to the degrees, Philosophy and Anthropology
Areas: External area of the Erasmus Mundus European Master in Lexicography, Logic and Philosophy of Science
Center: Faculty of Philology
Call: First Semester
Teaching: No teaching (discontinued)
Enrolment: Not open for enrolment
- Training students to work with computer tools for linguistic data processing.
- Giving students skills to design and implement basic tools to automatically extract lexicographic information from texts.
This course introduces basic programming methods in scripting languages (e.g. R or Python) for creating lexicographic resources. More precisely, it focuses on the automatic extraction of collocations and lexical relations.
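As a rough illustration of what automatic collocation extraction can look like, the sketch below ranks adjacent word pairs by pointwise mutual information (PMI). It is only one of several association measures, and the function names (`tokenize`, `pmi_bigrams`), the `min_count` threshold, and the sample text are assumptions made for this example rather than course material. It is written in Python, one of the scripting languages mentioned above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters only)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def pmi_bigrams(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) )
    Pairs seen fewer than `min_count` times are skipped, because PMI
    tends to overrate rare events. Sentence boundaries are ignored.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_bigrams = max(n_tokens - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    # Highest-scoring candidate collocations first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical sample text, invented for this sketch.
sample = (
    "She made a strong tea and he made a strong coffee. "
    "Strong tea is common, but powerful computers are rare. "
    "They drank strong tea while the powerful computers ran."
)
ranked = pmi_bigrams(tokenize(sample))
for pair, score in ranked:
    print(pair, round(score, 2))
```

On a real corpus the frequency threshold would normally be higher, and other measures such as log-likelihood are often compared with PMI.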
1. Introduction to computational lexicography with R
1.1. Basic tasks in natural language processing (word frequency data, token distribution analysis, etc.)
1.2. Measures of lexical variety
1.3. Functions for producing Keyword in Context (KWIC) concordances
2. Quantitative-empirical methods in lexicography
2.1. Introduction: Empirical research methods
2.2. Methodologies: Advantages & Shortcomings
3. Data visualisation and analysis
3.1. Introduction to visualization in R
3.2. Descriptive & inferential statistics
3.3. Data visualization
4. Collaborative lexicography
4.1. Basics of collaborative work
4.2. Crowdsourced collaborative lexicography: the Wiktionary project
4.3. Some tools for collaborative lexicography
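The basic tasks named in the first unit of the outline (word frequency data, a measure of lexical variety, and a Keyword in Context view) can be sketched in a few lines. The course itself announces R for these topics, so the Python version below is only an illustrative stand-in; the helper names (`tokenize`, `type_token_ratio`, `kwic`), the `window` parameter, and the sample text are assumptions made for this example. The type-token ratio is used here as one simple measure of lexical variety.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters only)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(tokens):
    """Number of distinct word forms divided by the number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def kwic(tokens, keyword, window=3):
    """Return (left context, keyword, right context) for every hit."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Hypothetical sample text, invented for this sketch.
text = (
    "A dictionary lists words. A good dictionary also lists collocations, "
    "and a school dictionary explains them."
)
tokens = tokenize(text)
frequencies = Counter(tokens)

print(frequencies.most_common(3))
print(round(type_token_ratio(tokens), 2))
for left, key, right in kwic(tokens, "dictionary"):
    print(f"{left:>25} | {key} | {right}")
```

The type-token ratio depends heavily on text length, so comparisons between texts usually rely on samples of equal size or on length-corrected measures.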
Abel, Andrea & Meyer, Christian M. (2013). “The dynamics outside the paper: user contributions to online dictionaries”, in Iztok Kosem / Jelena Kallas / Polona Gantar / Simon Krek / Margit Langemets / Maria Tuulik, eds., Electronic lexicography in the 21st century: thinking outside the paper: proceedings of the eLex 2013 conference, 17–19 October 2013, Tallinn, Estonia. Ljubljana / Tallinn: Institute for Applied Slovene Studies / Institute of the Estonian Language, pp. 179–194. Available at: <http://eki.ee/elex2013/proceedings/eLex2013_13_Abel+Meyer.pdf>
Evert, Stefan (2008). “Corpora and collocations”. In A. Lüdeling and M. Kytö (eds.), Corpus Linguistics. An International Handbook, article 58, pages 1212-1248. Mouton de Gruyter, Berlin.
Grefenstette, Gregory (1994). Explorations in Automatic Thesaurus Discovery. Kluwer Academic Publishers, Norwell, MA, USA.
Thalken, Rosamond & Jockers, Matthew L. (2020). Text analysis with R: for students of literature, Cham: Springer.
Mel’chuk, Igor (1998). “Collocations and Lexical Functions”. In A.P. Cowie (ed.): Phraseology. Theory, Analysis, and Applications, Oxford: Clarendon Press, 23-53.
Meyer, Christian M. / Gurevych, Iryna (2012a): “Wiktionary: a new rival for expert-built lexicons? Exploring the possibilities of collaborative lexicography”, in Sylviane Granger / Magali Paquot, eds., Electronic Lexicography. Oxford: Oxford University Press, pp. 259–291.
Müller-Spitzer, Carolin / Wolfer, Sascha / Koplenig, Alexander (2015): “Observing online dictionary users: studies using Wiktionary log files”, International Journal of Lexicography, 28/1, pp. 1–26.
Padó, Sebastian & Lapata, Mirella (2007). “Dependency-based construction of semantic space models”. Computational Linguistics. 33 (2): 161–199.
Sahlgren, Magnus (2008). “The Distributional Hypothesis”. Rivista di Linguistica. 20(1): 33–53.
Sweigart, Al (2015). Automate the Boring Stuff with Python: Practical Programming for Total Beginners, No Starch Press.
Wolfer, Sascha / Müller-Spitzer, Carolin (2016). “How Many People Constitute a Crowd and What Do They Do? Quantitative Analyses of Revisions in the English and German Wiktionary Editions”. Lexikos. 26: 347-371.
Wu, Winston / Yarowsky, David (2020). “Wiktionary normalization of translations and morphological information”. In Donia Scott / Nuria Bel / Chengqing Zong, eds., Proceedings of the 28th International Conference on Computational Linguistics, Barcelona: International Committee on Computational Linguistics, pp. 4683-4692.
(Additional references may be suggested during the module.)
Students will be able to:
- Use theoretical and methodological tools with applications in the lexicographical field.
- Use new methodologies and techniques in scientific study.
- Recognize the need for interdisciplinary study.
- Consider terminological, typological and methodological phenomena (among others) from an applied perspective.
General competencies: CG1
Basic competencies: CB6, CB7, CB8, CB9, CB10
Transferable competencies: CT2, CT4
Specific competencies: CE3, CE4, CE7, CE8, CE9
- Lectures guided by the professors, conveying knowledge to students and open to discussion.
- Lab sessions inside and outside the classroom, following a collaborative methodology.
- Tasks previously assigned as individual work outside the classroom will be analysed and discussed in class.
1. First opportunity: completion and submission of the tasks for each module, plus active participation: 100%.
2. Second opportunity: the same criteria as in the first opportunity will be applied.
Students who have been granted special permission by the Faculty authorities not to attend classes regularly must submit a final paper, which will constitute 100% of the final grade.
Academic misconduct (cheating, plagiarism in exercises or tests) will be penalized according to the University regulations on student assessment (“Normativa de avaliación do rendemento académico dos estudantes e de revisión de cualificacións”).
The number of hours for attendance in person is 35, to which we must add the individual work of students.
- Students are advised to take this subject building on the basic skills acquired in Introduction to Computer Science and Natural Language Processing.
- Students are expected to prepare before class and to review the material afterwards.
Carlos Valcarcel Riveiro
- Department: External department linked to the degrees
- Area: External area of the Erasmus Mundus European Master in Lexicography
- E-mail: carlos.valcarcel [at] rai.usc.es
- Category: External area professor
Martin Pereira Fariña
- Department: Philosophy and Anthropology
- Area: Logic and Philosophy of Science
- Phone: 881812525
- E-mail: martin.pereira [at] usc.es
- Category: Temporary PhD professor
Sascha Wolfer
- Department: External department linked to the degrees
- Area: External area of the Erasmus Mundus European Master in Lexicography
- E-mail: sascha.wolfer [at] rai.usc.gal
- Category: External area professor
Day | Time | Group | Languages | Classroom |
---|---|---|---|---|
Tuesday | 18:00-20:00 | Group /CLE_01 | English, German | B06 |
Wednesday | 18:00-20:00 | Group /CLE_01 | English, German | B06 |
Thursday | 18:00-20:00 | Group /CLE_01 | German, English | B06 |

Date | Time | Group | Classroom |
---|---|---|---|
12.20.2022 | 09:30-13:30 | Group /CLIS_01 | B05 |
12.20.2022 | 09:30-13:30 | Group /CLE_01 | B05 |
01.24.2023 | 09:30-13:30 | Group /CLIS_01 | B05 |
01.24.2023 | 09:30-13:30 | Group /CLE_01 | B05 |