Francis Bond, 2010, 2011, 2012, 2013, 2015.
Traditionally, linguistic analysis was done largely by hand, but computer-based methods and tools are increasingly used in contemporary research. This course provides an introduction to skills and resources that can assist the linguist in performing fast, flexible and accurate quantitative analyses. Students will learn a scripting language (Python) and use it, together with the Natural Language Toolkit (NLTK), to analyse linguistic phenomena. No previous programming experience is required: we will teach you the basics of programming, with an emphasis on useful techniques for processing languages.
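To give a flavour of the kind of quantitative analysis meant here, the following is a minimal sketch that counts word and bigram frequencies in a short text. It uses only the Python standard library so that it runs without any corpus downloads; the function names (`tokenize`, `word_frequencies`, `bigram_frequencies`) and the sample sentence are illustrative assumptions, not course material. NLTK offers comparable tools, such as `nltk.FreqDist` and `nltk.bigrams`.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens.

    A deliberately crude tokenizer (an assumption for this sketch):
    it keeps runs of the letters a-z and drops everything else.
    """
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(text):
    """Count how often each word token occurs in the text."""
    return Counter(tokenize(text))


def bigram_frequencies(text):
    """Count how often each pair of adjacent tokens occurs in the text."""
    tokens = tokenize(text)
    # zip pairs each token with its successor: (t0, t1), (t1, t2), ...
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical sample text, made up for illustration only.
sample = "The cat sat on the mat. The dog sat on the log."

freqs = word_frequencies(sample)
bigrams = bigram_frequencies(sample)

print(freqs["the"])             # how many times "the" occurs
print(bigrams[("on", "the")])   # how many times "on the" occurs
print(freqs.most_common(3))     # the three most frequent words
```

The tokenizer is intentionally simplistic: it ignores digits, apostrophes and non-ASCII letters, which a real analysis would need to handle.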
Tue 12:30–15:30; HSS COMLAB 5 (HSS-01-10)
|Week||Date||Content||Readings (NLTK, DiP)||Projects|
|1||08-11||Why do NLP? Why Python? Getting Started with NLTK||1.1; 1.5|
|2||08-18||Language Processing and Python: Lists and Word Frequencies||1.2; 1.3; DiP3 2.4|
|3||08-25||Language Processing and Python: Strings and Control||1.4; 1.6; DiP3 4.4|
|4||09-01||NLTK Text Corpora and Conditional Frequencies||2.1; 2.2; 2.3; DiP3 2.7|
|5||09-08||Lexical Resources and WordNet||2.4; 2.5|
|6||09-15||Processing Raw Text: Some Interesting Data||3.1; 3.2; 3.3|
|7||09-22||Regular Expressions||3.4; 3.5; 3.6; 3.7; 3.8; DiP3 5|
|8||10-06||Mid-review: Writing Structured Programs||4.1; 4.2; 4.3; 4.4; DiP3 2.5, 3.4|
|9||10-13||Bi-grams, n-grams and collocations||5.1; 5.2; 5.3||Project 1 Due on Friday the 16th|
|10||10-20||Part of Speech Tagging||5.4; 5.5; 5.6; 5.7|
|11||10-27||Classification||6.1; 6.3; 6.4; 6.7; 6.8|
|Handy Summary of Python and NLP Concepts|
|12||11-03||Final In-Class On-Line Open-Book All-Day Programming Challenge (group i)|
|13||11-17||Final In-Class On-Line Open-Book All-Day Programming Challenge (group ii)||9:30–15:30||Project 2 Due on Friday the 20th|
Assessment problems are generally open-ended. Students are not expected to solve them fully: the goal is to see how they approach and understand the problem.
Course materials are heavily inspired by clt231: Introduction to Natural Language Processing at the University of Helsinki. Thanks to Graham Wilcock for letting us use them.
I will try not to make things too hard (cartoon from Abstruse Goose).
Instead, this class should be like this (cartoon from xkcd).