HG2051: Language and The Computer

Michael Wayne Goodman 2019.

Francis Bond, 2010, 2011, 2012, 2013, 2015, 2017.

Traditionally, linguistic analysis was done largely by hand, but computer-based methods and tools are increasingly common in contemporary research. This course introduces skills and resources that help the linguist perform fast, flexible and accurate quantitative analyses. Students will learn a scripting language (Python) and use it, together with the Natural Language Toolkit (NLTK), to analyse linguistic phenomena. No previous programming experience is required: we will teach you the basics of programming, with an emphasis on useful techniques for processing languages.

Tue 11:30–14:30; TR+ 41 (Hive LHS-B1-07)

Course Outline

| Week | Date | Content | Data Structures | Readings (NLTK, DiP) | Work Assigned | Work Due |
|------|------|---------|-----------------|----------------------|---------------|----------|
| 1 | 13 Aug 2019 | Why do NLP? Why Python? Getting Started with NLTK | | 1.1; 1.5 | | |
| 2 | 20 Aug 2019 | Language Processing and Python: Lists and Word Frequencies | Lists, Sets | 1.2; 1.3; DiP3 2.4 | Assignment 1 | |
| 3 | 27 Aug 2019 | Language Processing and Python: Strings and Control | Strings, Tuples | 1.4; 1.6; DiP3 4.4 | Assignment 2 (review) | Assignment 1 |
| 4 | 03 Sep 2019 | NLTK Text Corpora and Conditional Frequencies | Dictionaries (and Functions) | 2.1; 2.2; 2.3; DiP3 2.7 (notebook) (Notes: inspecting, printing, and returning values) | Assignment 3 (review) | Assignment 2 |
| - | 10 Sep 2019 | Students' Union Day | | | | |
| 5 | 17 Sep 2019 | Lexical Resources and WordNet | | 2.4; 2.5 (notebook) | Assignment 4 | Assignment 3 |
| 6 | 24 Sep 2019 | Processing Raw Text | | 3.1; 3.2; 3.3; 3.9 (notebook) | Assignment 5; Group Project | Assignment 4 |
| - | 1 Oct 2019 | Recess | | | | |
| 7 | 8 Oct 2019 | Regular Expressions | | 3.4; 3.5; 3.6; 3.7; 3.8; DiP3 5 (notebook) | Assignment 6 | Assignment 5 |
| 8 | 15 Oct 2019 | Mid-review: Writing Structured Programs | | 4.1; 4.2; 4.3; 4.4; DiP3 2.5, 3.4 (notebook) | Assignment 7 | Assignment 6 |
| 9 | 22 Oct 2019 | Bi-grams, n-grams and collocations | | 5.1; 5.2; 5.3 (notebook) | | Assignment 7 |
| 10 | 29 Oct 2019 | Part of Speech Tagging | | 5.4; 5.5; 5.6; 5.7 (notebook) | Assignment 8 | |
| 11 | 5 Nov 2019 | Classification | | 6.1; 6.3; 6.4; 6.7; 6.8 (notebook) | Assignment 9 | Assignment 8 |
| 12 | 12 Nov 2019 | Final Review (notebook); Handy Summary of Python and NLP Concepts; Sample written exam | | | | Assignment 9 (solution) |
| 13 | 19 Nov 2019 | Final In-Class On-Line Open-Book (09:30–14:30, LHS-TR+47 (Hive)); Notebook for the group task | | | | |
| 13 | 20 Nov 2019 | Final In-Class On-Line Open-Book (09:30–14:30, LHS-TR+47 (Hive)); Notebook for the group task | | | | |
| 14 | 25 Nov 2019 | Group Project due | | | | |

Textbooks and Tools

Assessment and Solutions to Problems

Evaluation Criteria for Assignments

Evaluation Criteria for the Group Project

Assessment problems are generally open-ended: students are not expected to solve them fully. The goal is to see how they approach and understand the problem.


Course materials are heavily inspired by clt231: Introduction to Natural Language Processing at the University of Helsinki. Thanks to Graham Wilcock for letting us use them.

I will try not to make things too hard (cartoon from Abstruse Goose)

Instead, this class should be like this (cartoon from XKCD)