Lab 7

Preliminaries

These instructions might get edited a bit over the next couple of days. I'll try to flag changes.

As usual, check the write-up instructions first.

Requirements for this assignment


Run a baseline test suite

Before making any changes to your grammar for this lab, run a baseline test suite instance. If you decide to add items to your test suite for the material covered here, consider doing so before modifying your grammar so that your baseline can include those examples. (Alternatively, if you add examples in the course of working on your grammar and want to make the snapshot later, you can do so using the grammar you turned in for Lab 6.)


Background

The goal of this lab is to be able to parse the two sentences "I can eat glass. It doesn't hurt me.", assign them appropriate semantics, and generate back. You have already done some of the work: from previous labs, your grammar should already handle pronouns, case (if applicable), and transitive verbs. Negation may already be working from the customization system and/or previous work you've done. You may need to add some vocabulary and possibly some verb forms. In addition, depending on how the sentences translate in your language, you might need to consider a new valence pattern for verbs and a new type of noun (mass nouns).

Semantic representations

Your semantic representations for the two sentences should look approximately like this, modulo the relations showing up in a different order, the variables (e's, x's, and h's) showing up with different numbers, and the SEMSORT information showing up in different places. Also, if your language tends to use pro-drop rather than overt pronouns, you might end up without any representation of the pronouns in these sentences. Finally, if you need a complex predicate in place of, say, "hurt", then you'll also have some differences.
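To make "modulo order and variable numbering" concrete, here is a minimal Python sketch that checks whether two flat lists of relations are the same up to reordering and a consistent renaming of variables. Everything in it is an assumption for illustration: the function name, the toy data, and the predicate names are invented, quantifiers and qeq constraints are left out, and the toy structures are not the target representations for this lab.

```python
from typing import Dict, List, Optional, Tuple

# One relation: a predicate name plus a role -> variable mapping.
EP = Tuple[str, Dict[str, str]]

def same_up_to_renaming(a: List[EP], b: List[EP]) -> bool:
    """True if a one-to-one renaming of variables maps a onto b, ignoring order."""
    if len(a) != len(b):
        return False

    def extend(mapping: Dict[str, str], ep_a: EP, ep_b: EP) -> Optional[Dict[str, str]]:
        """Try to extend the variable mapping so that ep_a matches ep_b."""
        pred_a, args_a = ep_a
        pred_b, args_b = ep_b
        if pred_a != pred_b or args_a.keys() != args_b.keys():
            return None
        new = dict(mapping)
        used = set(new.values())
        for role, var_a in args_a.items():
            var_b = args_b[role]
            if var_a in new:
                if new[var_a] != var_b:
                    return None
            else:
                # Variables of different sorts (e, x, h) never correspond,
                # and the renaming must stay one-to-one.
                if var_a[0] != var_b[0] or var_b in used:
                    return None
                new[var_a] = var_b
                used.add(var_b)
        return new

    def search(i: int, mapping: Dict[str, str], remaining: frozenset) -> bool:
        """Backtracking search: pair relation i of a with some unused relation of b."""
        if i == len(a):
            return True
        for j in remaining:
            new = extend(mapping, a[i], b[j])
            if new is not None and search(i + 1, new, remaining - {j}):
                return True
        return False

    return search(0, {}, frozenset(range(len(b))))

# Invented toy data (hypothetical predicate names, simplified scoping).
target = [
    ("_can_v_rel", {"LBL": "h1", "ARG0": "e2", "ARG1": "h3"}),
    ("_eat_v_rel", {"LBL": "h3", "ARG0": "e4", "ARG1": "x5", "ARG2": "x6"}),
    ("pron_rel", {"LBL": "h7", "ARG0": "x5"}),
    ("_glass_n_rel", {"LBL": "h8", "ARG0": "x6"}),
]
# Same content, different order and variable numbers.
mine = [
    ("_glass_n_rel", {"LBL": "h12", "ARG0": "x9"}),
    ("pron_rel", {"LBL": "h4", "ARG0": "x3"}),
    ("_eat_v_rel", {"LBL": "h10", "ARG0": "e11", "ARG1": "x3", "ARG2": "x9"}),
    ("_can_v_rel", {"LBL": "h1", "ARG0": "e2", "ARG1": "h10"}),
]
# Eater and eaten swapped: a real difference, not just renumbering.
wrong = [
    ("_glass_n_rel", {"LBL": "h12", "ARG0": "x9"}),
    ("pron_rel", {"LBL": "h4", "ARG0": "x3"}),
    ("_eat_v_rel", {"LBL": "h10", "ARG0": "e11", "ARG1": "x9", "ARG2": "x3"}),
    ("_can_v_rel", {"LBL": "h1", "ARG0": "e2", "ARG1": "h10"}),
]

print(same_up_to_renaming(target, mine))
print(same_up_to_renaming(target, wrong))
```

The point of the sketch is only that differences in relation order and variable numbers are harmless, while a difference in which variable fills which role is not.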


Modals

"can" as an auxiliary verb

Use this version if in your language the morpheme expressing the same notion as "can" is a separate word which takes a VP complement and a subject.

"can" as a bound morpheme

Use this version if the morpheme expressing the same meaning as "can" in your language attaches morphologically to the main verb of the sentence.


Negation

Two-part negation

Use this version if your language expresses negation with both an affix on the verb and an adverb (e.g., French ne ... pas). If both elements are arguably affixes, you probably just want to write a pair of lexical rules, i.e., take the "Negation as a verbal affix" route, but write two rules and make sure you can require that either both apply or neither does.
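The "both or neither" requirement for the two-affix case can be pictured with a small Python model. This is only a toy illustration of the logic, not TDL and not necessarily how you would encode it in your grammar: the class, the flag name `neg_pending`, and the affix shapes "ne-" and "-pa" are all invented for the example.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Verb:
    form: str
    neg_pending: bool = False  # hypothetical flag: first marker attached, second still owed
    negated: bool = False

def neg_rule_1(v: Verb) -> Verb:
    """First negation rule: adds its affix and records that a second marker is owed."""
    if v.neg_pending or v.negated:
        raise ValueError("first negation rule cannot apply twice")
    return replace(v, form="ne-" + v.form, neg_pending=True)

def neg_rule_2(v: Verb) -> Verb:
    """Second negation rule: applies only to the output of the first rule."""
    if not v.neg_pending:
        raise ValueError("second negation rule requires the first")
    return replace(v, form=v.form + "-pa", neg_pending=False, negated=True)

def usable_in_syntax(v: Verb) -> bool:
    """A verb may head a phrase only when no negation marker is still owed."""
    return not v.neg_pending

plain = Verb("eat")
half = neg_rule_1(plain)
full = neg_rule_2(half)
print(usable_in_syntax(plain), usable_in_syntax(half), usable_in_syntax(full))
print(full.form)
```

The first rule leaves a mark that blocks the word from entering the syntax, and only the second rule clears it, so a verb with exactly one of the two affixes is ruled out from both directions.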

Negation: markers on either end

This option is for languages that mark negation with particles on either end of the clause or VP (or alternatively, with intonation or [in signed languages] non-manual signs which extend the length of the clause/constituent and are represented in transcription with markers on either end).

If the two markers show up immediately adjacent to the verb (rather than VP or S), consider whether it might be more appropriate to treat them as inflection.

The first thing is to consider whether there is any evidence for attaching the markers one at a time. In these instructions I focus on the case where there is not, so please contact me if you think the markers should attach one at a time in your language. Rather than attach one of these markers before the other, the most straightforward thing appears to be to create a ternary rule. I've added some types supporting ternary rules to the matrix (included in the patch provided last week).

We're going to take a construction-y approach to analysis, creating a phrase structure rule which calls for specific elements in two of the three daughters and does the right thing in the semantics itself. Specifically, do the following:

Negation as an adverb modifier

If your language uses an adverbial strategy, the customization script probably did the right thing. This is included just in case.

Use this version if your language expresses sentential negation via an adverb which modifies the V, VP or S.

(Note: English has two forms of sentential negation: the "contracted" form, which is actually an affix on the verb (cf. Zwicky and Pullum 1983), and the full-form adverb. This adverb is not actually treated syntactically as a modifier in sentential negation, but rather selected by auxiliary verbs, including the "do" of do-support. For the details of this analysis, see Sag, Wasow and Bender 2003 chapter 13 and Kim and Sag 1995. I would be surprised if another language being treated in this class had a system very similar to the English one, as it seems like a pretty quirky part of English grammar. Further, it's a subtle matter to establish what is actually going on in English, and I don't think anyone would have time in one week to show the same about another language.)

Negation as a verbal affix

If your language uses an affixal strategy, the customization script probably did the right thing. This is included just in case.

Use this version if your language expresses sentential negation by adding a morpheme to the main verb.


Grammar clean up

This section asks you to find something about your grammar which needs fixing, and fix it (with help from me). This could be something that isn't quite working right from previous labs, or something that is important in your language but doesn't look like it will be otherwise covered in the class.


For your final documentation: Write up your analyses

For each of the following phenomena, please include the following in your write-up:

  1. A descriptive statement of the facts of your language.
  2. Illustrative IGT examples from your test suite.
  3. A statement of how you implemented the phenomenon (in terms of types you added/modified and particular TDL constraints).
  4. If the analysis is not (fully) working, a description of the problems you are encountering.
  5. A statement of whether or not you can generate from examples illustrating the phenomenon.

In addition, your write-up should include a statement of the current coverage of your grammar over your test suite (using numbers you can get from Analyze | Coverage and Analyze | Overgeneration in [incr tsdb()]) and a comparison between your baseline test suite run and your final one for this lab (see Compare | Competence).
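For reference, the arithmetic behind the two numbers can be sketched as follows. [incr tsdb()] reports them for you; this Python sketch only spells out the usual definitions (coverage as the share of grammatical items that receive at least one analysis, overgeneration as the share of ungrammatical items that do). The `Item` class, the function names, and the sample items are invented for the example.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Item:
    text: str
    grammatical: bool  # judgment recorded in the test suite
    readings: int      # number of analyses the grammar found

def percent_parsed(items: List[Item]) -> float:
    """Percentage of the given items that received at least one analysis."""
    if not items:
        return 0.0
    return 100.0 * sum(1 for i in items if i.readings > 0) / len(items)

def coverage(items: List[Item]) -> float:
    """Share of grammatical items that parse."""
    return percent_parsed([i for i in items if i.grammatical])

def overgeneration(items: List[Item]) -> float:
    """Share of ungrammatical items that parse anyway."""
    return percent_parsed([i for i in items if not i.grammatical])

# Invented sample profile: four grammatical items, two ungrammatical ones.
profile = [
    Item("good 1", True, 1),
    Item("good 2", True, 3),
    Item("good 3", True, 0),
    Item("good 4", True, 2),
    Item("bad 1", False, 0),
    Item("bad 2", False, 1),
]

print(f"coverage: {coverage(profile):.1f}%")              # 3 of 4 -> 75.0%
print(f"overgeneration: {overgeneration(profile):.1f}%")  # 1 of 2 -> 50.0%
```

Running the same calculation over the baseline and final profiles gives the two pairs of numbers that the comparison in your write-up should discuss.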



Francis Bond

Course materials borrow heavily from Linguistics 567: Knowledge Engineering for NLP at the University of Washington. Thanks to Emily Bender for letting us use them.