Analysis and Rules of Grammar I

Traditionally, morphology dealt with the segmentalization of words into morphemes and with the distribution of the allomorphs of a given morpheme. More recently it has come to be treated as part of writing the grammar of a language. We will be covering both these approaches in this class.

2. Let us start with a simple nominal paradigm: the English noun. There are two forms for each noun (with some exceptions): the singular and the plural forms. Consider the following forms: saint and saints. In phonemic representation they are /se:nt/ and /se:nts/, respectively (see note 1). Semantically, these forms are the same, except that one is singular and the other plural. Our first goal is to segmentalize these two words such that one common sequence of phonemes refers to the referent 'saint', and whatever is left over refers to singular and plural. In a paradigm as simple as this, virtually all English speakers and perhaps many non-English speakers would break the two words such that /se:nt/ refers to "saint".

If /se:nt/ cannot be broken down into smaller units that have a function, the form is called a morph. Hence, /se:nt/ is a morph. The sequences /se:/, /se:n/, /e:nt/, /nt/ and so forth also occur as morphemes or as a sequence of two morphemes (/e:/+/nt/ "ain't"), but they are not related to /se:nt/. These morphemes have no function in the word 'saint' in English.

In the plural form, there is the unit /s/ left over: /se:nt/+/s/. If /se:nt/ refers to "saint", what does /s/ refer to; i.e., what is its function? Virtually everyone agrees that it refers to or marks plurality.

Before we analyze the singular form, let us introduce the term morpheme. We will define morpheme as a set of one or more morphs that have a common function. A morpheme must contain at least one morph; it may contain more than one. Considering the above data set, the morpheme saint contains one morph saint, and the plural morpheme contains one morph. Below, we will introduce the term allomorph, but not at this moment. A morpheme is an abstraction (a set); we will write the morpheme in caps enclosed in braces: {SAINT}. Thus, {SAINT} has one morph /se:nt/. The same is true for plural {S}: /s/.

Any group of objects may form a set. A set may also be null. This will be important later. For example, a dog and Mt. Everest could be defined as a set, but this set is very unlikely to be useful. In linguistics, a set ideally should be defined by common features. All the phonemes of a language form a set. The subsets it includes are the individual phonemes. For example, the set of Hawaiian phonemes is {p, t, k, h, l, r, n, m, w, a, i, e, o, u}. The set of Hawaiian consonants is {p, t, k, h, l, r, n, m, w} and of Hawaiian vowels {a, i, e, o, u}. Each phoneme is also a set. Given the subsets (sets within a set), we must consider the set of Hawaiian phonemes to contain the subsets consonants and vowels, the set of Hawaiian consonants to contain each Hawaiian consonantal subset, and the set of vowels to contain each Hawaiian vocalic subset. Each subset may contain another subset; the last subset contains a member. Members do not contain sets. They are final or ultimate. There is no theoretical limit to the number of subsets.
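The set relations just described can be checked mechanically. The following is a minimal Python sketch that uses the inventory exactly as it is listed above; the variable names are our own and carry no theoretical weight.

```python
# Hawaiian phoneme inventory exactly as listed in the text above.
consonants = frozenset({"p", "t", "k", "h", "l", "r", "n", "m", "w"})
vowels = frozenset({"a", "i", "e", "o", "u"})

# The full inventory is the union of the two subsets.
phonemes = consonants | vowels

# Each subset is contained in the larger set, and the two do not overlap.
assert consonants <= phonemes
assert vowels <= phonemes
assert consonants.isdisjoint(vowels)

# A set may also be null (empty); the empty set is a subset of every set.
null_set = frozenset()
assert null_set <= phonemes

print(len(phonemes))  # 14
```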

{SAINT} is the basic lexical form--it is a lexical morpheme. A lexical morpheme contains the lexical meaning of a word. In the syntax (Course Outline 322) we show how inflectional morphemes are added to the lexical morpheme to derive a compound morpheme. A compound morpheme consists of a lexical morpheme and one or more grammatical features. In the syntax we derive just two compound nominal morphemes:

Each lexical morpheme contains a set of inherent grammatical features and a set of empty (NIL) grammatical features which must be assigned a value. Each lexical item is also composed of a set of semantic features. We will not cover them here. We will be concerned with grammatical features. The best-known grammatical feature is [±Plural], which is an empty feature in the lexical morpheme underlying the noun. A syntactic rule copies the polar value of the feature from the quantifier to the empty feature position. We thus get the following related forms:

The rules of English morphology then apply to the singular and the plural forms and spell out the appropriate forms.

Let's start with [-Pl]. A common hypothesis is that [-Pl] is realized as a zero suffix: SAINT+[Ø]. We take exception to this hypothesis. Zero forms, including affixes, should be avoided unless there is overwhelming evidence to support them. We can find none. The zero argument seems to be based on the conviction that the grammatical feature of plurality must be realized as a suffix at any cost. Our argument is based on the Least Effort Principle (LEF):

There is no compelling reason to expand SAINT+[-Pl] to SAINT+[Nsuff Ø]. We propose instead that SAINT+[-Pl] is simply spelled out as "saint". In other words, there is no formal marker marking [-Pl] in the vast majority of English nouns. If [-Pl] is first realized as a morpheme and then it is specified as zero, then two rules must be applied. If [-Pl] is not realized as an affix, then no rules apply, outside of the orthographic rules which must apply in any case. Given no compelling reason to create a suffix here, we adopt the simpler solution based on the LEF.

Before we cover the plural, let us introduce the lexicon and the lexical entry. All words have to be stored somewhere--in the lexicon. The lexicon is a mental dictionary. It includes all information about a word that is not predictable. Predictable information is not included in the lexicon, but elsewhere in the grammar. The basic information includes the phonological form, its part of speech, its morphological form, syntactic information, and semantic information. Here, we will concentrate on the first three components. Let us start with the following lexical entry for saint:

SAINT
/se:nt/	Phonological form
N	Category
	Plural
+	Count
-	Personal
-	Proper

This entry is enhanced by the addition of the features [±Pl] copied from the quantifier in the syntax:

SAINT
/se:nt/	Phonological form
N	Category
+	Plural
+	Count
-	Personal
-	Proper

SAINT
/se:nt/	Phonological form
N	Category
-	Plural
+	Count
-	Personal
-	Proper

We can relate the above diagrams in the following diagram, modified from the above:

The upper level represents the lexical entry of saint; the lower levels represent the basic forms of the two lexical items: saint and saints. The upper level is underspecified. That is, it contains the smallest number of marked features that are unpredictable.
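The relation between the underspecified entry and its two specified forms can be sketched as a small data structure. This is only an illustration under our own assumptions: the class name, the field names, and the function copy_plural are hypothetical, and None stands in for the empty (NIL) value.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LexicalEntry:
    lexeme: str
    phon: str                # phonological form
    category: str
    plural: Optional[bool]   # None = empty (NIL), not yet assigned
    count: bool
    personal: bool
    proper: bool


# Underspecified entry as stored in the lexicon: Plural is empty.
SAINT = LexicalEntry("SAINT", "/se:nt/", "N", None, True, False, False)


def copy_plural(entry: LexicalEntry, quantifier_plural: bool) -> LexicalEntry:
    """Copy the polar value of [±Pl] from the quantifier into the empty slot."""
    if entry.plural is not None:
        raise ValueError("Plural is already specified")
    return replace(entry, plural=quantifier_plural)


saint_sg = copy_plural(SAINT, False)  # SAINT with [-Pl]
saint_pl = copy_plural(SAINT, True)   # SAINT with [+Pl]
```

The stored entry is never modified; each copy yields a new, fully specified entry, which mirrors the upper and lower levels of the diagram.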

The plural of the vast majority of English nouns is marked by a suffix. A suffix is a morph that follows (occurs to the right of) the head morph. A head morph is one which forms the basic part of the word. To a head are adjoined affixes. The plural markers /z/ ('s') and /in/ ('en') are suffixes; they are adjoined to the nominal. In the above example, /se:nt/ is a nominal head. In the above diagram the grammatical feature [+Pl] must be identified first as a morpheme. We can do this with the following rule that converts the grammatical feature [+Pl] into a morpheme specifically identified as a suffix marked by the symbol '+':

Thus {SAINT,+Plural} --> {SAINT}+{[+Pl,+Suf]}. The latter may be abbreviated as {S}. The selection of the symbol 'S' is influenced by English orthography:

The abbreviation is a typographical convenience; it is in no way a grammatical form.

Next, those forms marked as [+Suffix] are formally split off from the lexical form; sometimes this rule is called splitting:

Note that the preceding is formally a word. We may mark this with the symbol '##' :

We won't normally use this notation unless it is needed for clarity. Of course, we may also represent it in a tree diagram:

The advantage of the above type of tree diagram is that it shows the derivation history of the form.
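The same derivation history can be recorded as a list of stages. The sketch below is ours, not a formal part of the grammar: the function name is hypothetical, and the intermediate stage in which [+Pl] has been marked [+Suf] but not yet split off is written in a notation we assume for illustration.

```python
from typing import List


def derive_plural(lexeme: str) -> List[str]:
    """Return the successive stages in the derivation of a [+Pl] noun."""
    history = [f"{{{lexeme},+Pl}}"]                 # compound morpheme from the syntax
    history.append(f"{{{lexeme},[+Pl,+Suf]}}")      # rule: plural is a suffix (assumed notation)
    history.append(f"{{{lexeme}}}+{{[+Pl,+Suf]}}")  # rule: splitting
    history.append(f"{{{lexeme}}}+{{S}}")           # abbreviation of the suffix
    history.append(f"##{{{lexeme}}}+{{S}}##")       # word boundaries marked with '##'
    return history


for stage in derive_plural("SAINT"):
    print(stage)
```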

Next, {S} is assigned a phonological form. This is called "spell-out". The default rule (the regular or 'elsewhere' rule) spells {S} out as /Iz/, /s/, or /z/ depending on phonological context:

The first condition stipulates that it is spelled out as /Iz/ if it follows a sibilant. The second condition stipulates it is spelled out as /s/ if it follows a voiceless segment (always a consonant in English), and the third condition, the elsewhere rule, spells it out as /z/ elsewhere. Since saint ends with a voiceless consonant, {S} is spelled out as /s/: /se:nts/.
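The three conditions can be written as an ordered test. This is a minimal sketch: a stem is represented as a list of phonemes in the ASCII transcription used here, and the membership of the two phoneme classes is our assumption, listed only for illustration.

```python
# Phoneme classes in the ASCII transcription used here. The exact
# membership of each class is an assumption made for illustration.
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch", "h"}


def spell_out_plural(stem):
    """Return the phonemic form of {S} after a stem (a list of phonemes)."""
    final = stem[-1]
    if final in SIBILANTS:   # condition 1: after a sibilant
        return "Iz"
    if final in VOICELESS:   # condition 2: after a voiceless segment
        return "s"
    return "z"               # condition 3: elsewhere


saint = ["s", "e:", "n", "t"]
print("/" + "".join(saint) + spell_out_plural(saint) + "/")  # /se:nts/
```

The order of the tests matters: /s/ is both a sibilant and voiceless, so the sibilant condition has to be checked first.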

Note that in the singular, the feature [-Pl] was not assigned the subfeature [+Suffix]. This is because the singular in English is not marked in an overt way. [-Pl] remains associated with the noun stem. There are some exceptions of Latinate origin. These are covered in 'Analysis and Rules of Grammar II.'

There exists a set of orthographic rules which spell out phonemic forms in orthographic form. These rules are very extensive and, as is well known, subject to many exceptions. We will introduce them one by one as the need arises. The noun stem /se:nt/ is spelled out by the following rules:

The plural suffix is spelled out by a special morphological rule that spells out {S} as 's' by default. (We will cover 'es' later):

In 'days' (/de:z/), /e:/ is spelled out as 'ay', and {S} is spelled out as 's', even though it is phonemically voiced:

The fact that we must spell out the morpheme {S} as 's' means that this rule must be applied before convergence. Convergence is a common spelling rule that spells all the graphemes of a word without the interruption of a space. The first step spells out each morpheme, and the second step applies convergence:

There is one very good reason why the spelling-out of the plural ending must precede convergence. Note that /de:z/ as a single morpheme is spelled out as 'daze'. If spell-out followed convergence, then there would be no way to predict whether /de:z/ would be spelled out as 'days' or 'daze':

POOCH	POOCH	DOG	DOG	Lexeme
/pu:ch/	/pu:ch/	/dag/	/dag/	phon. form
N	N	N	N	Category
[-Pl]	[+Pl]	[-Pl]	[+Pl]	Relevant feature
	[+Pl, +Suf]		[+Pl, +Suf]	Rule: Plural is a suffix.
	POOCH+[+Pl, +Suf]		DOG+[+Pl, +Suf]	Rule: Suffix Formation.
/pu:ch/	/pu:ch/+/z/	/dag/	/dag/+/z/	Phonemic representation.
	/pu:ch/+es			Rule: Spell-out /+z/ as 'es'.
			/dag/+s	Rule: Spell-out /+z/ as 's'.
pooch	pooch+es	dog	dog+s	Rules: spell out remaining phonemes.
	pooches		dogs	Converge.
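The ordering argument can be made concrete with a two-step sketch: each morpheme is spelled out on its own, and only then are the pieces converged. The table of stem spellings is a toy stand-in for the phoneme-by-phoneme orthographic rules, and the condition that triggers 'es' is our assumption, since that rule has not yet been covered.

```python
# Toy orthographic spellings for whole morphemes. These entries stand in
# for the full set of phoneme-to-grapheme rules, which is not given here.
SPELLING = {
    "DAY": "day",
    "DAZE": "daze",
    "DOG": "dog",
    "POOCH": "pooch",
}


def spell_morpheme(morpheme, stem_spelling=""):
    """Step 1: spell out one morpheme on its own."""
    if morpheme == "S":
        # Assumed trigger for 'es': the stem spelling ends in s, z, x, ch or sh.
        if stem_spelling.endswith(("s", "z", "x", "ch", "sh")):
            return "es"
        return "s"
    return SPELLING[morpheme]


def spell_word(morphemes):
    """Step 2: converge, i.e. join the spelled morphemes without spaces."""
    pieces = []
    for m in morphemes:
        pieces.append(spell_morpheme(m, pieces[-1] if pieces else ""))
    return "".join(pieces)


print(spell_word(["DAY", "S"]))    # days
print(spell_word(["DAZE"]))        # daze
print(spell_word(["POOCH", "S"]))  # pooches
print(spell_word(["DOG", "S"]))    # dogs
```

Both 'days' and 'daze' correspond to /de:z/, but the spelling is decided while the morpheme boundary is still visible, so the two never compete.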

Sets are usually drawn as Venn diagrams. The word saints consists of two subsets, each of which may be a member if it cannot be broken down into further sets:

Venn diagrams are cumbersome to draw. Representing the above as {SAINT, S} is much less cumbersome. We will continue to represent saints as {SAINT, S}. Venn diagrams are helpful in picturing certain examples.

In set theory, if a set contains two or more subsets and they are not linearly arranged as in the above example, then one subset is selected from the set. In traditional linguistic derivation, an underlying form is posited and then a rule is applied to derive the next level (subset): C -> C^h / $ ___'V (a voiceless obstruent becomes aspirated in the context immediately following a syllable boundary and preceding a stressed vowel). In set theory there is no rule changing, deleting or inserting a feature. Rather, a rule determines which subset is selected.

For example, the plural morpheme should be represented as {+Pl} (a set that contains the feature [+Pl]; the square brackets are not written to reduce clutter). {+Pl} is a set that contains /s/, /z/, /Iz/ and /In/. The first three forms (sets) are determined by phonological selection rules. The last one is selected by morpholexical rules, which are irregular. Consider OX; its lexical entry must contain the information that the phonemic form /In/ must be selected for the feature [+Pl]. The same holds for CHILD and for BROTHER, when the latter refers to members of a religious or similar group: OX+[+Pl] => OX+/In/. That is, of the four subsets of [+Pl] in English, /In/ is selected.

The remaining forms are phonological subsets of [+Pl]. How does [+Pl] select /s/? If the lexeme in question ends in a voiceless obstruent that is not a sibilant, then /s/ is selected. This is a phonological context that is similar to the context of a derivational rule deriving /s/ from an underlying "S".

The main difference here is how we view processing: as a derivational rule or as a selectional rule. Selectional rules select a subset from an immediately dominating set. Selectional rules cannot skip intermediate sets. Currently we have the set {+Pl}, whose subsets are /s/, /z/, /Iz/ and /In/.

The selectional rules for the plural are the following:

The elsewhere condition is the same as in derivational phonology. It is the default rule, the least marked context. Default rules are common in computational theory. The most marked or irregular rules occur first; the less marked rules follow in descending order of markedness.
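This ordering from most marked to least marked can be sketched as a list of rules, each of which selects one subset of {+Pl}. The sketch rests on several assumptions of ours: the forms are written in plain ASCII as Iz and In, the phoneme classes are listed only for illustration, and OX is the only morpholexical exception entered.

```python
# The subsets of {+Pl}, written in plain ASCII.
PLURAL_SET = ("In", "Iz", "s", "z")

# Lexemes whose lexical entry selects /In/ (only OX is entered here).
SELECTS_IN = {"OX"}

# Phoneme classes; their exact membership is an assumption.
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch", "h"}

# Rules ordered from most marked to least marked; the last is 'elsewhere'.
RULES = [
    (lambda lexeme, final: lexeme in SELECTS_IN, "In"),  # morpholexical
    (lambda lexeme, final: final in SIBILANTS, "Iz"),
    (lambda lexeme, final: final in VOICELESS, "s"),
    (lambda lexeme, final: True, "z"),                   # elsewhere
]


def select_plural(lexeme, final_phoneme):
    """Select one subset of {+Pl}: the first rule whose context matches wins."""
    for condition, form in RULES:
        if condition(lexeme, final_phoneme):
            assert form in PLURAL_SET  # nothing is created, only selected
            return form


print(select_plural("OX", "s"))     # In
print(select_plural("SAINT", "t"))  # s
```

Nothing is changed, deleted or inserted; each rule only picks a form that is already in the set. OX also shows why the irregular rule must come first: its stem ends in a sibilant, so the phonological rules alone would select /Iz/.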

1. Because non-standard fonts cannot be used universally on the web, we use the following system to represent English vocalic phonemes:

a: = æ = low front vowel (technically a tense vowel). If /æ/ doesn't work, we will use /a:/ to represent the low tense vowel.

/se:nt/	"saint"
/s/	"plural"