Testsuite Specifications




Test suites: General guidelines


Grammatical phenomena

This section describes some of the grammatical phenomena we could cover and the type of examples you should use to illustrate them. We will only cover a subset of these, and may add other phenomena.

Basic Word Order

In declarative matrix clauses, what is the order of the major constituents (subject, verb, complements)? Be sure to consider both intransitive and transitive verbs, and optionally ditransitive verbs if you can find any.

Do sentences in your language typically contain auxiliaries? Where does the auxiliary occur with respect to the verb?

Your ungrammatical examples in this section should explore all of the possible orders that are not allowed in your language. Likewise, your grammatical examples should illustrate all the possible orders. For a transitive verb, then, we expect six sentences (all possible orders of S, V, and O) with the number of grammatical v. ungrammatical sentences varying depending on the language type.
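
One way to make sure no order is missed is to enumerate the six orders mechanically and then fill in a judgment for each. The following is only a minimal Python sketch, not part of the course tools; the function names and the strict-SOV language are made up for illustration.

```python
from itertools import permutations


def word_orders(constituents=("S", "V", "O")):
    """Return every ordering of the constituents as a string such as 'SVO'."""
    return ["".join(order) for order in permutations(constituents)]


def judge_orders(grammatical):
    """Map each possible order to 'g' or 'u', given the set of allowed orders."""
    return {order: ("g" if order in grammatical else "u") for order in word_orders()}


# Hypothetical strict SOV language: one grammatical order, five ungrammatical.
judgments = judge_orders({"SOV"})
for order, judgment in judgments.items():
    print(order, judgment)
```

For a language with freer word order, pass a larger set (for example `{"SOV", "OSV"}`) and the split between grammatical and ungrammatical sentences shifts accordingly.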

In some languages, full NPs are arguably always adjuncts (topics, etc.) with the valence requirements of the verb being filled by affixes. If your language seems to fall into this type, you should still try to find examples with full NPs illustrating where they can occur, but please discuss this in your write up.

Another more complicated word-order pattern is "V2", where the verb (or a finite auxiliary) must be the second thing in the clause, but anything can come first (S, O, adverb, etc). If your language is described as V2, your word order examples should include ungrammatical examples where the verb is not in second position, and a variety of grammatical examples where it is.

If your language is strict about the order of S and O, be careful in your assignment of (un)grammaticality in examples where S and O are reversed. Whether the string is strictly ungrammatical or actually just means something different depends on how else your language marks subjects and objects (e.g., with case or agreement). For example, in English, Cats chase dogs and Dogs chase cats are both grammatical; they just mean different things.

Please use full NPs for the basic word order examples (i.e., something like the cat or cats instead of Fluffy or them).


Pronouns

We'll be addressing agreement, and to get interesting subject-verb (or object-verb) agreement, it's useful to have non-third person forms.

Find the paradigm(s) for pronouns in your language: Do they vary by person, number, gender, case?

Do pronouns have the same distribution as full NPs in your language? (I.e., can they appear in the same places in the string?) If not, add some examples illustrating the contrast to your test suite. In either case, add some examples to your test suite illustrating the full paradigm. (Ungrammatical examples involving case and agreement will come up under those topics below.)

The rest of the NP

This section is meant to address the other obligatory elements of an NP (particularly, an NP headed by a common noun). Does the language require determiners? Require determiners only with certain nouns? Allow determiners optionally? Allow determiners even with proper nouns and pronouns? What is the order of the determiner with respect to the noun? The space of examples here (abstracting away from order) should look something like this:

Note that n1, n2, and n3 are supposed to be of different types (e.g., nouns that require determiners always, nouns that disallow determiners always, and nouns that optionally allow determiners). Not all languages will necessarily have three types.

Is there anything else that is required with NPs?

(Note that one common thing is case-marking adpositions. If your language has these, you'll want to illustrate them under "case".)

Argument optionality

This section concerns whether and under what circumstances verbal arguments can be left unexpressed. Languages vary in the restrictions they place on unexpressed arguments. For subjects and objects each, find out if your language:

Based on the answers to the questions above, create examples for your test suite. Here a typical set of examples would be (again, abstracting away from word order, and assuming that agreement markers are optional with overt arguments but obligatory with missing ones):

If your language has lexically-licensed (or lexically-restricted) pro-drop, you should include two sets of examples like the above (one with a verb that does allow a dropped object and one with a verb that does not).


Agreement

Agreement is covariation in form between multiple items (typically a head and a dependent) in a sentence. Sometimes, the head doesn't change form, but the dependent does, depending on properties of the head.

Languages vary greatly in how much agreement they display from none at all to quite a bit. You can categorize agreement systems along two dimensions:

  1. Which elements are in the agreement relation (subject & verb, object & verb, determiner & noun, adjective & noun are typical)
  2. Which features are involved (person, number, case and gender are typical)

Determine whether your language has agreement, and if so, which type. Then construct examples showing both grammatical and ungrammatical possibilities. Remember, the ungrammatical examples should only have one thing wrong with them (i.e., if you have both subject-verb and object-verb agreement, there's no need to make an example where both the subject and the object disagree with the verb).

For a language with determiner-noun agreement in number and case and subject-verb agreement in person and number, a possible example set is:

Notice the inclusion of non-third person examples (using pronouns). You could pair, e.g., every verb form with every kind of noun it doesn't agree with, but that's not strictly necessary. As long as each noun type and each verb form show up in at least one ungrammatical example, most errors should get caught.
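
The "each noun type and each verb form at least once" idea can be sketched as a cyclic pairing: each subject type is combined with the verb form that belongs to the next subject type. This is only an illustration in Python; the function name is made up, and it assumes that every feature bundle has its own distinct verb form (with syncretic forms, some of these pairs would actually be grammatical).

```python
def mismatch_pairs(features):
    """Pair each subject feature bundle with the verb form of the next bundle.

    The pairing is cyclic, so every subject type and every verb form appears
    in exactly one (subject, verb form) mismatch.
    """
    n = len(features)
    if n < 2:
        return []  # with a single form there is nothing to disagree with
    return [(features[i], features[(i + 1) % n]) for i in range(n)]


pairs = mismatch_pairs(["1sg", "2sg", "3sg", "1pl", "2pl", "3pl"])
for subject, verb_form in pairs:
    print(f"subject {subject} + verb form for {verb_form}")
```

This gives six ungrammatical candidates instead of the thirty you would get from pairing every subject with every non-matching verb form.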


Case

A language has a case system if the nouns vary in form depending on the grammatical role they play in a sentence and/or the specific head they are a dependent of. (In some languages, it's not the nouns themselves that vary in form, but the dependents of the noun, such as determiners or adjectives. We'll analyze this as the nouns still having case and their dependents agreeing with them in case.)

Some languages have no case system. Among languages that do have case systems, they can be characterized along the following parameters:

Determine if your language has a case system, and if so, where it falls on the above parameters. If your language has a direct-inverse system, even though that isn't strictly case, please consider it here.

Create examples showing both grammatical and ungrammatical case patterns. For the purposes of this class, you only need to consider the case of subjects, direct objects, and (if you found some ditransitives) indirect objects. If your language has quirky case, you should consider including both verbs that illustrate the major case patterns and verbs that have idiosyncratic patterns. If your language has split-ergativity, you should illustrate both sides of the split.

For a nominative-accusative language (with a distinct dative case) without quirky case, a typical example set would look like this (assuming SVO word order):

If your language also has agreement, or obligatory determiners, etc., make sure that the examples have the appropriate form along those dimensions.


Negation

Here we're interested in how your language handles sentential (NB: not constituent) negation. Again, there are a few basic strategies that should cover most cases:

  1. Inflection: An affix (prefix, suffix, infix) expressing negation, which attaches to auxiliaries only, main verbs only, or any (finite) verb.
  2. Independent adverb: An adverb that modifies a V, a VP, or an S; and attaches to the left, right, or either side.
  3. Selected adverb: An adverb that appears as a selected complement of auxiliaries only, main verbs only, or any (finite) verb.

It can be subtle to distinguish between options 2 and 3, and chances are the data you'll be able to find won't be sufficient to do so. For option 1, positive test suite examples should show the negated verb/auxiliary. Negative examples should show the negation inflection on verbs that do not allow it (non-finite verbs? main verbs?). For options 2/3, positive examples should show the negative adverb in the positions it can appear in; contrasting negative examples should illustrate where it cannot appear.

Some languages have both inflection and an adverb, in which case there are the following logical possibilities regarding their co-occurrence:

  1. Both must be used together.
  2. Either one can be used separately, but they cannot be used together (complementary distribution).
  3. They can be used independently or together.
  4. The adverb is obligatory, but the inflection is optional.
  5. The inflection is obligatory, but the adverb is optional.

If your language allows both strategy types, determine (if you can) the rules of their co-occurrence, and illustrate with appropriate positive and negative examples in your test suite.
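
Each of the five possibilities listed above predicts a different pattern of judgments for negated sentences with the inflection only, the adverb only, or both. The following Python sketch spells those predictions out; the names are made up for illustration, and it only covers sentences that are meant to be negated.

```python
# A combination is (inflection present, adverb present).
BOTH, INFL_ONLY, ADV_ONLY = (True, True), (True, False), (False, True)

# For each numbered possibility, the combinations that should be grammatical.
PATTERNS = {
    1: {BOTH},                       # both must be used together
    2: {INFL_ONLY, ADV_ONLY},        # complementary distribution
    3: {BOTH, INFL_ONLY, ADV_ONLY},  # independently or together
    4: {BOTH, ADV_ONLY},             # adverb obligatory, inflection optional
    5: {BOTH, INFL_ONLY},            # inflection obligatory, adverb optional
}


def expected_judgments(pattern):
    """Return 'g' or 'u' for each way of marking negation under one pattern."""
    allowed = PATTERNS[pattern]
    return {combo: ("g" if combo in allowed else "u")
            for combo in (BOTH, INFL_ONLY, ADV_ONLY)}


for number in sorted(PATTERNS):
    print(number, expected_judgments(number))
```

Under possibility 3 there is nothing to mark as ungrammatical among these three combinations, so the negative examples have to come from elsewhere (for example, the adverb in the wrong position).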

Matrix yes-no questions

How does your language indicate matrix clause (i.e., not embedded) yes-no questions? Possible strategies include word order variations, a sentence-initial or sentence-final question particle, a special auxiliary, and intonation only.

Your testsuite should include positive examples illustrating all of the strategies in your language. If you have any strategies that involve additional lexical material (e.g., a question particle), create negative examples with the question particle in the wrong place.

If your language indicates questions with word order variations, go back to your negative examples under word order and check whether anything marked as ungrammatical there is really grammatical as a question. This is where we begin to see that a finished testsuite should pair strings with analyses --- this is done implicitly here in the free translations.

Consider creating examples of negative questions (i.e., sentences simultaneously illustrating both negation and yes-no questions).

Embedded clauses (declarative, interrogative)

Try to find at least one verb that can embed finite declarative clauses and at least one verb that can embed finite interrogative clauses. How are the embedded clauses marked? Does the language use complementizers? Is the word order different between matrix and embedded clauses? Are there different complementizers for embedded declaratives v. interrogatives? Do the selecting verbs allow both kinds of clauses (e.g., English know) or just one (e.g., English ask)? Create grammatical and ungrammatical examples to illustrate any contrasts that you find. Restrict your attention to yes-no questions. (That is, no wh- questions.)

Note that I am not looking for embedded clauses functioning as modifiers (e.g., relative clauses, clauses marked by when or because, non-finite clauses expressing simultaneous action). Instead try to find examples similar to these:


How does your language express the meaning associated with English can in I can eat glass? The two major possibilities are an independent auxiliary like in English or an affix on the main verb. Alternatively, you might find only periphrastic means ("It is possible for me to eat glass.") If there's an auxiliary, you might find that it's a subject-raising verb (like in English), or that it does argument composition: in this case, the auxiliary takes the lexical verb as its complement and then adopts all of the verb's arguments (subject and complements) as its own. You can tell this is going on when the arguments of the verb are ordered with respect to the auxiliary rather than the verb.


Coordination

Explore how coordination is marked in your language. Coordination is, very informally, the sort of phrasal combination marked by "and" in English. In some languages (like English), this is simple: a single lexical item can coordinate any kind of phrase. In other languages, coordination might be marked by adding an affix, lengthening a vowel, or changing to another tense -- the variety of marking strategies is surprising. Languages also vary as to how many coordinands must be marked: all of them (and A and B and C...), just one (A B and C), or none of them (A B C...). Also, some languages have different ways of marking coordination for different phrase types. If this is the case, it will be interesting to illustrate at least two different strategies.

Extracting this information from your written grammar can be challenging. Coordination is described in different sections in different grammars: in a separate section of its own, in a section that also describes subordination (often titled "Conjunctions"), or possibly spread out over the sections that describe each phrase type (i.e. nouns, verbs, adjectives, etc). Some grammars provide very little information beyond "the word for 'and' is FOO"; some are more detailed. Collect what information you can find, especially example sentences that have the word "and" in their gloss or translation. Consider doing a Google search (or other web search) based on the spelling of the morpheme for "and" to get some naturally occurring examples to supplement what you can get from your reference material. One last thing to be aware of: some languages mark coordinated meanings using a word or inflection meaning "with", but don't seem to actually form tightly bound coordinated constituents.


Tense and aspect

What tense and aspect categories are marked in your language, and how are they marked?

Tense has to do with the time of the event in relation to the speech time, and usually involves categories like "past, present, future", though not all languages make a three-way distinction, and some languages allow more fine-grained distinction. Aspect has to do with the internal temporal structure of the event, and how it is viewed or portrayed in the utterance. Under the heading aspect, you might find categories like "progressive, habitual, perfective, durative, inceptive" and others.

Tense and aspect can be expressed by auxiliaries, affixes, particles or combinations thereof.

You should collect at least one way of expressing each tense category marked in your language, except for "perfect" tenses (e.g., English future perfect Kim will have gone by then.). If your language marks any aspect categories with auxiliaries, affixes, or particles, try to collect roughly three different aspects as well. Note that when a language doesn't have a grammaticalized means of expressing a particular tense or aspect category, it can usually still get it across by paraphrasing. Thus English Kim began to swim. doesn't count as inceptive aspect for our purposes.

Ungrammatical examples can be hard to come by in this category, but you should be able to construct some by showing illicit combinations of auxiliaries + main verb forms, or multiple inconsistent tense affixes or particles.



[Phenomena code: cognitive status/cogst]

Demonstratives and definiteness are two means of indicating the cognitive status (or discourse status) of the referent of a noun phrase. Demonstratives are elements like English this and that which canonically participate in a system that distinguishes degrees of distance from the speaker and can be used to draw a hearer's attention to something physically present (cf. Dryer, 2008). Depending on the language, demonstratives can be determiners, adjectives, or affixes.

Determine how demonstratives are marked in your language, and the distinctions (especially in terms of distance or related notions) that are expressed in the system. Illustrate the range of possibilities with examples. Ungrammatical examples can be constructed by putting the demonstrative in the wrong place in the string (e.g., a prefix used as a suffix, etc.). For present purposes, don't worry about demonstrative pronouns, i.e., those that can stand alone without a (separate) head noun.

Definiteness may be marked as inflection on the noun, through determiners, through choice of case particles, or some combination (and perhaps indeed other strategies as well). Nominal dependents may agree with their head nouns in terms of definiteness. In English, we mark definiteness with determiners (the vs. a).

Determine if definiteness is marked in your language, and if so how. Construct relevant positive and negative examples illustrating these possibilities. If any elements agree in definiteness, include examples of non-agreement.

Reference: Dryer, Matthew S. 2008. Order of Demonstrative and Noun. In: Haspelmath, Martin & Dryer, Matthew S. & Gil, David & Comrie, Bernard (eds.) The World Atlas of Language Structures Online. Munich: Max Planck Digital Library, chapter 88. Available online at http://wals.info/feature/88. Accessed on 2009-01-26.


Attributive adjectives

Can adjectives appear within the NP? If so, where? (Left or right of the N, separated or not by other elements, separated to anywhere in the sentence.) Do the adjectives agree in any features with the nouns they modify? Create grammatical and ungrammatical examples illustrating all of these properties.



Adverbs

Adverbs usually constitute a large and diverse class. For the purposes of this course, please focus on manner adverbs like quickly. Where can they appear in the sentence? Can they attach to V, VP, S? Do they attach to the left or to the right? Create grammatical and ungrammatical examples illustrating the placement possibilities.

Non-verbal Predicates

Copular or copulaless clauses. Can NPs, adjectives, or adpositional phrases function as predicates in your language? If so, do they require a copula in some or all cases? Does the form of the copula vary? Examples from English include:

Note that in some languages, the copula is required only in non-present tense, or only with NP predicates, etc. In other languages, there is not really a class of adjectives distinct from stative verbs.

Construct relevant positive and negative examples illustrating how your language handles non-verbal predicates.

Information Structure

The morphosyntactic marking of topic and focus. In some languages, this takes the form of a particular construction, like clefts in English for marking focus (It was KIM who left.). In others, a particular sentence position (pre-verbal, sentence-initial, other) is associated with a particular information-structural status. A third type of strategy is morphological marking, where a particular ending or particle is used to mark topic or focus. Many, if not all languages, also use prosody to mark information structure. Since we are working with textual representations, we won't be analyzing prosodic marking. Rather, please try to determine if there is any morphosyntactic marking of information structure. If so, determine:

Semantic Decomposition

Sometimes it makes sense to treat a single lexeme as decomposable into two or more predicates. For example, in English someone can be thought of as some(x), person(x). You should give examples of decomposable predicates in your language, as well as the closest multi-word equivalent:

Constructional Semantics

Sometimes a construction can add meaning, for example the bare noun phrase rule, which adds a determiner. Give another example (NP->PP is a very common one) paired with a multi-word equivalent.



The test suites should be initially produced as plain text files (in ASCII or Unicode). We have a Perl script that turns the plain text into the required format for [incr tsdb()]. For this to work, the formatting of the plain text file has to be relatively strict.

Your test suite file should consist of a header, containing information pertinent to all of the examples, followed by a list of examples. The header should contain the following information:

Language: <language name>
Language code: <Ethnologue language code>
Author: <your name>
Date: April 7, 2006
Source a: <Reference to grammar/web page>
Source b: <Reference to grammar/web page>
... (as many sources as needed)
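
To show how the header can be read mechanically, here is a minimal Python sketch. It is not the course's Perl script; the function name, the dictionary layout, and the sample values (author name, grammar title, URL) are all made up for illustration.

```python
def parse_header(text):
    """Read 'Key: value' header lines into a dict.

    'Source a: ...' lines are collected under header['sources'], keyed by
    their letter, so that examples can later refer to them as 'a', 'b', ...
    """
    header = {"sources": {}}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split on the first colon only, so URLs in the value stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key.lower().startswith("source "):
            header["sources"][key.split()[1]] = value
        else:
            header[key.lower()] = value
    return header


sample = """Language: Japanese
Language code: jpn
Author: A. Student
Date: April 7, 2006
Source a: A Hypothetical Reference Grammar
Source b: http://example.org/grammar
"""
header = parse_header(sample)
print(header["language"], header["language code"], sorted(header["sources"]))
```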

Each example should consist of the following. (The { } are optional).

#Ex number and optional comment
Source: {a:page, b:page, author, elicited, attested}
Vetted: {t, f, s}
Judgment: {g, u}
Phenomena: {word order, case, agreement, ...}
<Example in standard orthography> (one of this and transliteration is required; including both is okay)
<Example in transliteration> (one of this and standard orthography is required; including both is okay)
<Example with morpheme boundaries noted and morpheme forms regularized>
<Morpheme-by-morpheme glosses>
<Free translation>

The comment character is '#', and it is good practice to number your examples in a comment line above the Source: line. For ungrammatical examples, this comment should also indicate what is wrong with the example, for your reference and for mine.

The source field indicates where the example came from. If it came from one of your written sources, you can refer to that source with a single letter code. If there is a page number associated with the example, it should follow the letter (with a colon in between). If you made the example up, the source should be author. If you elicited the example from a native speaker, then the source should be elicited. If you found the example in a non-linguistic text, the source should be attested.

The vetted field indicates where the judgment on the example came from. t means the example has been vetted by a native speaker, who gave the judgment indicated. f means it has not. (In this case, the judgment is your best guess based on the grammatical materials you have.) If you are a native speaker, you can vet your own examples. If the example comes from a grammar (which indicates a grammaticality judgment for it explicitly), and you haven't had it vetted in addition, you should put s in this field. This is meant to indicate that we think the example was vetted before being included in the grammar, but since we didn't do it, we're not sure. For attested examples, you should use t if you checked it with a native speaker and f if you have not.

The judgment field indicates the grammaticality judgment assigned to the example (either by a native speaker, in a grammar, or your best guess). g is for grammatical and u is for ungrammatical.
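
The allowed values for the source, vetted, and judgment fields are easy to check automatically before you hand in a file. The sketch below is one hypothetical way to do that in Python; it assumes source letters are single lowercase letters and that a page reference contains no spaces, neither of which is stated above.

```python
import re

# Accepts 'a', 'a:112', 'author', 'elicited', or 'attested'.
SOURCE_RE = re.compile(r"^(?:[a-z](?::\S+)?|author|elicited|attested)$")
VETTED = {"t", "f", "s"}
JUDGMENT = {"g", "u"}


def field_errors(fields):
    """Return the names of the fields whose values are not allowed."""
    errors = []
    if not SOURCE_RE.match(fields.get("Source", "")):
        errors.append("Source")
    if fields.get("Vetted") not in VETTED:
        errors.append("Vetted")
    if fields.get("Judgment") not in JUDGMENT:
        errors.append("Judgment")
    return errors


print(field_errors({"Source": "a:112", "Vetted": "s", "Judgment": "g"}))   # []
print(field_errors({"Source": "book", "Vetted": "yes", "Judgment": "u"}))  # ['Source', 'Vetted']
```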

The phenomena field is a list of phenomena illustrated in the example. We'll have the perl script recognize both long and short names for each phenomenon, according to the table below. A single example might illustrate multiple phenomena. However, ungrammatical examples should have only one thing wrong with them, and be tagged only with the phenomenon tag corresponding to that problem.

Long name                   Short name
Word order                  wo
Case-marking adpositions    adp
Argument Optionality        pro-d
Matrix yes-no questions     q
Embedded declaratives       emb-d
Embedded questions          emb-q
Tense Aspect Mood           tam
Non-Verbal Predicates       cop
Cognitive status            cogst
Serial Verb Constructions   svc
Numeral Classifiers         numcl
Information structure       info
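
As an illustration of what "recognize both long and short names" could look like, here is a small Python sketch (not the actual Perl script). The mapping contains only the rows of the table above; names it does not know are returned separately rather than guessed at, and the function name is made up.

```python
SHORT_NAMES = {
    "word order": "wo",
    "case-marking adpositions": "adp",
    "argument optionality": "pro-d",
    "matrix yes-no questions": "q",
    "embedded declaratives": "emb-d",
    "embedded questions": "emb-q",
    "tense aspect mood": "tam",
    "non-verbal predicates": "cop",
    "cognitive status": "cogst",
    "serial verb constructions": "svc",
    "numeral classifiers": "numcl",
    "information structure": "info",
}


def normalize_phenomena(field):
    """Split a value like '{word order, q}' into (short tags, unrecognized names)."""
    names = [n.strip().lower() for n in field.strip().strip("{}").split(",") if n.strip()]
    shorts = set(SHORT_NAMES.values())
    known, unknown = [], []
    for name in names:
        if name in shorts:
            known.append(name)               # already a short name
        elif name in SHORT_NAMES:
            known.append(SHORT_NAMES[name])  # long name, convert it
        else:
            unknown.append(name)             # not in the table above
    return known, unknown


print(normalize_phenomena("{Word order, q, Tense Aspect Mood}"))
```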

The standard orthography and transliteration lines give a canonical representation of the string. You should have at least one of these, possibly both. Whatever you do, it should be consistent for the whole file.

The example should also be presented with the morpheme boundaries explicit. This will allow us to write a perl script that aligns glosses with each morpheme. For languages with particularly complex morphophonology, you might end up using this line as the example you actually parse/generate (to abstract away from the phonological rules). It is for this reason that this line should have phonologically regularized forms. In this line, morpheme boundaries should be indicated with hyphens and word boundaries with spaces. If your language has clitics, the boundary between a clitic and its host should be marked with an equals sign.

The next line is the morpheme-by-morpheme glosses. These should be in a one-to-one correspondence with the morphemes, so if the nth word in the line above has two hyphens in it, the nth word in this line has two hyphens as well. Stems should be given English glosses indicating their meaning. For formatting and a good set of grammatical abbreviations to use, please follow the Leipzig glossing rules. (You might also want to refer to the standardized set of grams from ODIN.)
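
The one-to-one correspondence between morphemes and glosses can be checked word by word. The following Python sketch is one way to do it, under the assumption that clitic boundaries are written with an equals sign in the gloss line as well (as in the Leipzig glossing rules); the function name is made up.

```python
import re


def alignment_errors(morpheme_line, gloss_line):
    """Return a list of mismatches between a morpheme line and its gloss line."""
    words, glosses = morpheme_line.split(), gloss_line.split()
    if len(words) != len(glosses):
        return [f"{len(words)} words vs. {len(glosses)} glosses"]
    errors = []
    for i, (word, gloss) in enumerate(zip(words, glosses), start=1):
        # Compare the sequence of boundary symbols: '-' for affixes, '=' for clitics.
        if re.findall(r"[-=]", word) != re.findall(r"[-=]", gloss):
            errors.append(f"word {i}: {word!r} vs. {gloss!r}")
    return errors


print(alignment_errors("Keeki-wo tabe-nai-ta", "cake-ACC eat-NEG-PRF"))  # []
print(alignment_errors("Keeki-wo tabe-nai-ta", "cake-ACC eat-NEG"))
```

Periods inside a gloss (as in the.M.SG) are not boundaries, so they are ignored by the check.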

The final line for each entry should have the free translation of the example.

Here is a Japanese example sentence that I just made up (with only a transliteration line and no standard orthography line for the moment):

Source: author
Vetted: f
Judgment: g
Phenomena: {case, negation}
Keeki-wo tabenakatta.
Keeki-wo tabe-nai-ta
cake-ACC eat-NEG-PRF
`(Someone) didn't eat cake.'

And a couple from French:

Source: author
Vetted: f
Judgment: g
Phenomena: {word order}
J'ai mangé le gâteau
Je-ai mange-é le gâteau
I-have.1SG eat-PRF the.M.SG cake
`I have eaten the cake.'

Source: author
Vetted: f
Judgment: u
Phenomena: {word order}
J'ai le gâteau mangé
Je-ai le gâteau mange-é
I-have.1SG the.M.SG cake eat-PRF
`I have eaten the cake.'

(NB: I'm going with the analysis of so-called French "clitics" as affixes.)

Here's a long example from Japanese to illustrate that each line of an example stays on a single line, no matter how long it is. This is good:

Source: a
Vetted: s
Judgment: g
Phenomena: {coordination, serial verb}
Ima made takusan hon-wo yonde kimashita ga, kore kara mo yonde iku tsumori desu.
Ima made takusan hon-wo yom-te ki-mas-ta ga kore kara mo yom-te ik-u tsumori desu.
now until many book-ACC read-PTCP come-HON-PRF CONJ here from also read-PTCP go-IPFV intention COP.HON.IPFV
`Up to now I have read quite a few books and I intend to read from now on, too.'

Do NOT do it this way. (I'm only including the IGT lines here to keep folks from looking at this quickly and copying.)

Ima made takusan hon-wo yonde kimashita ga,
kore kara mo yonde iku tsumori desu.
Ima made takusan hon-wo yom-te ki-mas-ta ga
kore kara mo yom-te ik-u tsumori desu.
now until many book-ACC read-PTCP come-HON-PRF CONJ
here from also read-PTCP go-IPFV intention COP.HON.IPFV
`Up to now I have read quite a few books and I intend to read from now on, too.'

Do NOT do it this way either:

Ima made takusan hon-wo yonde kimashita ga,
Ima made takusan hon-wo yom-te ki-mas-ta ga
now until many book-ACC read-PTCP come-HON-PRF CONJ
kore kara mo yonde iku tsumori desu.
kore kara mo yom-te ik-u tsumori desu.
here from also read-PTCP go-IPFV intention COP.HON.IPFV
`Up to now I have read quite a few books and I intend to read from now on, too.'
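
Both of the bad layouts above can be spotted by counting lines: after the Source, Vetted, Judgment, and Phenomena lines, a well-formed entry has four or five lines (orthography and/or transliteration, morphemes, glosses, translation). Here is a rough Python sketch of that check; the function names are made up, and a line count alone will not catch every wrapped example.

```python
FIELD_PREFIXES = ("Source:", "Vetted:", "Judgment:", "Phenomena:")


def igt_lines(entry):
    """Return the lines of one entry that are not blank, comments, or 'Field:' lines."""
    return [line for line in entry.splitlines()
            if line.strip()
            and not line.startswith("#")
            and not line.startswith(FIELD_PREFIXES)]


def looks_wrapped(entry):
    """True if the entry has more (or fewer) example lines than the format allows."""
    return len(igt_lines(entry)) not in (4, 5)


GOOD = """Source: author
Vetted: f
Judgment: g
Phenomena: {word order}
J'ai mangé le gâteau
Je-ai mange-é le gâteau
I-have.1SG eat-PRF the.M.SG cake
`I have eaten the cake.'
"""

BAD = """Source: author
Vetted: f
Judgment: g
Phenomena: {word order}
J'ai mangé
le gâteau
Je-ai mange-é
le gâteau
I-have.1SG eat-PRF
the.M.SG cake
`I have eaten the cake.'
"""

print(looks_wrapped(GOOD), looks_wrapped(BAD))
```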


Francis Bond


Course materials borrow heavily from Linguistics 567: Knowledge Engineering for NLP at the University of Washington. Thanks to Emily Bender for letting us use them.