10. Part of Speech Tagging.
- Cross-validation:
K-fold cross-validation is a common evaluation technique in NLP:
"In K-fold cross-validation, the original sample is randomly
partitioned into K subsamples. Of the K subsamples, a single subsample
is retained as the validation data for testing the model, and the
remaining K − 1 subsamples are used as training data. The
cross-validation process is then repeated K times (the folds), with
each of the K subsamples used exactly once as the validation data. The
K results from the folds then can be averaged (or otherwise combined)"
E.g. 5-fold: split the data into A, B, C, D, E
- Fold 1: Train on A, B, C, D --- test on E
- Fold 2: Train on A, B, C, E --- test on D
- Fold 3: Train on A, B, D, E --- test on C
- Fold 4: Train on A, C, D, E --- test on B
- Fold 5: Train on B, C, D, E --- test on A
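The 5-fold scheme above can be sketched in plain Python (a minimal illustration; the five one-letter items stand in for the five subsamples and are not real data):

```python
# Minimal sketch of K-fold cross-validation splitting.
def k_fold_splits(data, k):
    """Yield (train, test) pairs; each fold is the test set exactly once."""
    fold_size = len(data) // k
    for i in range(k):
        test = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        yield train, test

data = list("ABCDE")  # five items standing in for the five subsamples
for train, test in k_fold_splits(data, 5):
    print("Train on", ", ".join(train), "--- test on", ", ".join(test))
```

The per-fold scores (e.g. tagger accuracies) would then be averaged as the quote describes.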
- 5.6 Transformation-Based Tagging
Before Class (code)
- Train a bigram tagger with no backoff tagger, and run it on some
of the training data. Next, run it on some new data. What happens to
the performance of the tagger? Why?
TRAIN (input is tagged sentences):
bigram_tagger = nltk.BigramTagger(train_sents)
TAG (input is an untagged sentence):
bigram_tagger.tag(sent)
EVALUATE (input is tagged sentences):
bigram_tagger.accuracy(test_sents)
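To see why performance collapses on new data, here is a toy bigram tagger (a simplified sketch, not NLTK's implementation): any unseen (previous-tag, word) context gets tag None, and every tag after a None also fails, so sparseness cascades.

```python
# Toy bigram tagger: tag each word from its (previous tag, word) context.
from collections import Counter, defaultdict

def train_bigram(tagged_sents):
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        prev = "<s>"
        for word, tag in sent:
            counts[(prev, word)][tag] += 1
            prev = tag
    # keep the most frequent tag for each context
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}

def tag(model, sent):
    tags, prev = [], "<s>"
    for word in sent:
        t = model.get((prev, word))  # None for any unseen context
        tags.append(t)
        prev = t                     # a None here poisons the next context
    return list(zip(sent, tags))

train = [[("the", "DT"), ("dog", "NN"), ("runs", "VBZ")]]
model = train_bigram(train)
print(tag(model, ["the", "dog", "runs"]))  # seen data: fully tagged
print(tag(model, ["the", "cat", "runs"]))  # one unseen word: None cascades
```

On its own training data the tagger scores highly; on new data, every sentence containing an unseen context degrades from that point on, which is why a backoff tagger helps so much.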
- Now add a backoff tagger (your choice) and try it again (build then evaluate). What happens to
the performance of the tagger this time and why?
backoff_tagger = ???
bigram_tagger2 = nltk.BigramTagger(train_sents, backoff=backoff_tagger)
- Think about how we could deal with sparseness by defining an 'unknown' word.
What would you have to do to the training and test data?
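One possible answer (an assumption about the exercise, not the official solution): replace rare words in the training data with an explicit 'UNK' token, and map any test word never kept in training to 'UNK' as well, so both data sets share the same vocabulary.

```python
# Sketch: transform tagged corpora to use an explicit unknown word.
from collections import Counter

def unk_transform(train_sents, test_sents, min_count=2):
    """Replace words seen fewer than min_count times in training with 'UNK'."""
    freq = Counter(w for sent in train_sents for w, _tag in sent)
    keep = {w for w, c in freq.items() if c >= min_count}
    new_train = [[(w if w in keep else "UNK", t) for w, t in sent]
                 for sent in train_sents]
    new_test = [[(w if w in keep else "UNK", t) for w, t in sent]
                for sent in test_sents]
    return new_train, new_test
```

The transformed corpora can then be passed to nltk.BigramTagger exactly as before, and the tagger learns a tag distribution for 'UNK' instead of failing on unseen words.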
Practical work (code)
- Train and evaluate a bigram tagger with a data set transformed to use explicit unknown words.
How well does it perform?
- Consider the regular expression tagger described in
Section 5.4. Evaluate the tagger using its accuracy() method, and try to
come up with ways to improve its performance. Discuss your
findings. How does objective evaluation help in the development process?
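A minimal regular-expression tagger in the spirit of the book's Section 5.4 example, with a simple accuracy computation (the patterns here are illustrative, not the book's exact list):

```python
# Suffix-based regular-expression tagger with an accuracy helper.
import re

patterns = [
    (r".*ing$", "VBG"),                 # gerunds
    (r".*ed$", "VBD"),                  # simple past
    (r"^-?[0-9]+(\.[0-9]+)?$", "CD"),   # cardinal numbers
    (r".*", "NN"),                      # default: noun
]

def regexp_tag(word):
    for pat, tag in patterns:
        if re.match(pat, word):
            return tag

def accuracy(tagged_sents):
    """Fraction of gold-standard (word, tag) pairs the tagger gets right."""
    pairs = [(w, t) for sent in tagged_sents for w, t in sent]
    correct = sum(regexp_tag(w) == t for w, t in pairs)
    return correct / len(pairs)

gold = [[("dogs", "NNS"), ("running", "VBG"), ("fast", "RB")]]
print(accuracy(gold))
```

Improvements can then be measured objectively: add or reorder patterns (e.g. plural nouns before the default), rerun accuracy() on the same gold data, and keep only changes that raise the score.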
- Give an example of POS assignment ambiguity in a language other
than English, and explain what context could be used to resolve it.
Note: computers find ambiguity where people typically will not.
HG2051: Language and the Computer, Francis Bond.