Lab 2: Describing A Corpus
Present in Week 4.
In this lab you will get to know one particular corpus in detail.
Prepare a 5-minute presentation (5-6 slides) introducing it to the class.
Pick a corpus that interests you (one that I haven't already described
and that isn't described in the week 4 slides). The corpus has to have at
least two papers about it: one describing its construction, and
another describing research using it. You will be expected to read
both papers and be able to ask questions about them. The slides
should be a very concise summary.
- Slide 1: General Corpus Description (your name as footnote)
- Slide 2~3: Overview of Annotation/Creation
- Slide 3~4: Use of the corpus in some research
- Slide 4/5: References (must have at least the following)
- one paper about corpus construction
- one paper about corpus use
- details of how to access the corpus (URL)
- Slide 5/6: ISLRN
Metadata for the corpus, presented as a human-readable table.
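If you keep the metadata as field/value pairs, a few lines of Python can lay them out as an aligned two-column table to paste into the slide. This is only a minimal sketch: the function name `format_metadata_table` is made up for illustration, and the field names and values below are placeholders, not the official ISLRN field list. Use the fields shown in the ISLRN record for your corpus.

```python
def format_metadata_table(metadata):
    """Return field/value pairs as an aligned two-column plain-text table."""
    # Pad every field name to the width of the longest one.
    width = max(len(field) for field in metadata)
    rows = [f"{field.ljust(width)}  {value}" for field, value in metadata.items()]
    return "\n".join(rows)


# Hypothetical placeholder record: replace the fields and values with
# those from the actual ISLRN entry for your corpus.
example = {
    "Title": "<corpus name>",
    "Resource type": "<type of resource>",
    "Language(s)": "<language name>",
    "Size": "<number of words or hours>",
    "Access": "<URL>",
}

print(format_metadata_table(example))
```

Use a monospaced font on the slide so that the columns stay aligned.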
Example Slides: The Hong Kong Cantonese Corpus
- description of/presentation about the corpus (2 marks)
- description of/presentation about the corpus creation/annotation (2 marks)
- description of/presentation about the corpus use (2 marks)
- references, formatted correctly (-1 mark if not done)
- asked interesting question(s) (2 marks)
- answered question well (2 marks)
- email me with your choice of corpus (before class in week 3)
- email the slides as a PDF to email@example.com (due one day after class in
week 4, to give you a chance to make revisions)
Unless you specifically ask me not to, I will put your slides
up on the course home page so that other people can also see them.
Note on formatting slides (and other things)
Read and follow the instructions carefully.
- Lose 1 mark if:
- the slides are not submitted as a PDF (submit PDF, not PowerPoint)
- the PDF has fewer than 5 or more than 6 pages
- your name is not given as a footnote on page one
- 0 marks if submitted late
- Why so strict?
- For paper/grant submissions
--- your paper will be rejected without review if it is badly formatted
- In the workplace
--- you will get shouted at and made to do it again
- Why is correct formatting so important?
- The reviewer/boss has to read/process many, many submissions
- Anything that distracts them/takes extra time is bad
- Even if you forget everything about corpora, remember this lesson
HG3051 (Corpus Linguistics) main page.
Computational Linguistics Lab
Division of Linguistics and Multilingual Studies
Nanyang Technological University
Level 3, Room 55, 14 Nanyang Drive, Singapore 637332
Tel: (+65) 6592 1568; Fax: (+65) 6794 6303