HG3051: Lab 1: Searching the BNC

This lab will give you practice searching through a part-of-speech-tagged corpus, the BNC, using BYU's online interface.

Upload the final paper here as a PDF.
It should be called hg3051-lab1-name-misc.pdf or hg7032-lab1-name-misc.pdf

Replace the ??? with your answers.

Q1: (3)

Here's a list of some wordforms that can be nouns, verbs, or
adjectives. For each, determine the frequency of the wordform
used in each grammatical category. [Only look at these wordforms, not
inflected/derived variants, and don’t worry about other,
e.g. prepositional, uses.]

Word form	N freq	V freq	A freq
green		???	???	???
top		???	???	???
fly		???	???	???
meet		???	???	???

Do these agree with your intuitions?
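
If you want to check your counting logic offline, the tally for Q1 can be sketched in Python. This is only an illustration under stated assumptions: the (wordform, tag) pairs below are a hypothetical hand-made sample, not BNC data, and the tags follow the CLAWS5 convention (NN* noun, V* verb, AJ* adjective). The real frequencies must still come from the online interface.

```python
from collections import Counter

# Hypothetical hand-made sample of (wordform, CLAWS5-style tag) pairs.
# These are NOT real BNC counts; they only exercise the tally below.
TAGGED = [
    ("green", "AJ0"), ("green", "NN1"), ("green", "AJ0"),
    ("top", "NN1"), ("top", "AJ0"), ("top", "VVB"),
    ("fly", "VVI"), ("fly", "NN1"), ("fly", "VVB"),
    ("meet", "VVI"), ("meet", "VVB"), ("meets", "VVZ"),
]

def coarse_category(tag):
    """Map a CLAWS5-style tag to 'N', 'V', 'A', or None for anything else."""
    if tag.startswith("NN"):
        return "N"
    if tag.startswith("V"):
        return "V"
    if tag.startswith("AJ"):
        return "A"
    return None

def category_counts(tagged, wordforms):
    """Count N/V/A uses of exactly the given wordforms (no inflected variants)."""
    counts = {w: Counter() for w in wordforms}
    for word, tag in tagged:
        cat = coarse_category(tag)
        if word in counts and cat is not None:
            counts[word][cat] += 1
    return counts

counts = category_counts(TAGGED, ["green", "top", "fly", "meet"])
print("Word form", "N freq", "V freq", "A freq", sep="\t")
for word, c in counts.items():
    print(word, c["N"], c["V"], c["A"], sep="\t")
```

Note that "meets" is skipped because only the exact wordforms are counted, which matches the restriction in the question.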


Q2: (4)

List some examples of idioms of the type "the Xer the Yer", e.g. "the
more the merrier", "the bigger the better", and show the quer(y|ies) 
you used to find them.
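
To see what such a pattern is matching, here is a rough Python sketch that searches plain text for "the X the Y" where both slots look like comparatives. It is an assumption-laden heuristic, not the interface's query language: the function names, the small irregular list, and the sample text are all made up for illustration. A surface test like "ends in -er" also lets in words such as "other", which is one reason a query over part-of-speech tags is more reliable.

```python
import re

# Irregular comparatives that do not end in -er (illustrative, not exhaustive).
IRREGULAR = {"more", "less", "worse"}

def looks_comparative(word):
    """Crude surface test: irregular form, or a longer word ending in -er."""
    return word in IRREGULAR or (len(word) > 3 and word.endswith("er"))

# "the", one word, "the", one word; the slots are checked afterwards.
PATTERN = re.compile(r"\bthe\s+([a-z]+)\s+the\s+([a-z]+)\b")

def find_idioms(text):
    """Return (X, Y) pairs for each 'the X the Y' where both look comparative."""
    hits = []
    for m in PATTERN.finditer(text.lower()):
        x, y = m.group(1), m.group(2)
        if looks_comparative(x) and looks_comparative(y):
            hits.append((x, y))
    return hits

# Hypothetical sample text, not taken from the BNC.
SAMPLE = ("The more the merrier. She fed the cat the fish. "
          "The bigger the better, and the sooner the better.")

hits = find_idioms(SAMPLE)
for x, y in hits:
    print("the", x, "the", y)
```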


What kind of words appear in the first slot (X)?


What kind of words appear in the second slot (Y)?


Do they fall into neat semantic classes? 


Are the classes independent?


Q3: (3)

It is often claimed that "less" is used with uncountable nouns and
"fewer" with countable nouns. 

Design some queries to test this claim and show the queries and results:
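
One way to think about the design is to count which noun tags directly follow each quantifier. The sketch below does this in Python over a hypothetical hand-made token list (not BNC data) for quantifiers such as "less" and "fewer", using CLAWS5-style tags. It assumes that a plural tag (NN2) is a usable proxy for a countable noun; a singular tag (NN1) does not by itself show that a noun is uncountable, so treat the output as a starting point only.

```python
from collections import Counter

# Hypothetical tagged tokens with CLAWS5-style tags; NOT real BNC data.
TOKENS = [
    ("less", "DT0"), ("water", "NN1"), ("and", "CJC"),
    ("fewer", "DT0"), ("cars", "NN2"), (",", "PUN"),
    ("less", "DT0"), ("people", "NN0"), ("but", "CJC"),
    ("less", "DT0"), ("items", "NN2"),
]

def following_noun_tags(tokens, quantifiers):
    """Count (quantifier, noun tag) for nouns immediately after a quantifier."""
    counts = Counter()
    for (word, _tag), (_next_word, next_tag) in zip(tokens, tokens[1:]):
        if word in quantifiers and next_tag.startswith("NN"):
            counts[(word, next_tag)] += 1
    return counts

pair_counts = following_noun_tags(TOKENS, {"less", "fewer"})
for (quantifier, tag), n in sorted(pair_counts.items()):
    print(quantifier, tag, n, sep="\t")
```

A count for "less" followed by NN2 would be a direct counterexample to the claim, so that is the cell worth inspecting first in the real results.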


How accurate is the claim, according to the BNC?


What would be a more accurate claim?


HG3051 (Corpus Linguistics) main page.

Francis Bond <bond@ieee.org>
Computational Linguistics Lab
Division of Linguistics and Multilingual Studies
Nanyang Technological University
Level 3, Room 55, 14 Nanyang Drive, Singapore 637332
Tel: (+65) 6592 1568; Fax: (+65) 6794 6303