Computing Declarative Prosodic Morphology

Markus Walther
Seminar für Allgemeine Sprachwissenschaft
Heinrich-Heine-Universität Düsseldorf
Universitätsstr. 1, D-40225 Düsseldorf, Germany
wal[email protected].uni-duesseldorf.de

Abstract

This paper describes a computational, declarative approach to prosodic morphology that uses inviolable constraints to denote small finite candidate sets which are filtered by a restrictive incremental optimization mechanism. The new approach is illustrated with an implemented fragment of Modern Hebrew verbs couched in MicroCUF, an expressive constraint logic formalism. For generation and parsing of word forms, I propose a novel off-line technique to eliminate run-time optimization. It produces a finite-state oracle that efficiently restricts the constraint interpreter's search space. As a byproduct, unknown words can be analyzed without special mechanisms. Unlike pure finite-state transducer approaches, this hybrid setup allows for more expressivity in constraints to specify e.g. token identity for reduplication or arithmetic constraints for phonetics.

1 Introduction

Prosodic morphology (PM) circumscribes a number of phenomena ranging from 'nonconcatenative' root-and-pattern morphology over infixation to various cases of reduplication, where the phonology strongly influences the shape of words by way of obedience to structural constraints defining wellformed morae, syllables, feet etc. These phenomena have been difficult to handle in earlier rule-based treatments (Sproat 1992, 159 ff.). Moreover, as early as Kisseberth (1970) authors have noted that derivational accounts of PM are bound to miss important linguistic generalizations that are best expressed via constraints. Kisseberth showed that verb stems in Tonkawa, a Coahuiltecan language, display a complex V/∅ alternation pattern when various affixes are added (fig. 1). This leads to more and more complicated vowel deletion rules as the fragment is enlarged.
In contrast, a straightforward constraint that bans three consecutive consonants offers a unified account of the conditions under which vowels must surface. Later developments have refined constraints such as *CCC to refer to syllable structure instead: complex codas and onsets are disallowed. At least since Kahn (1976), Selkirk (1982), such segment-independent reference to syllable structure has been standardly assumed in the generative literature.

'to cut'        'to lick'
picn-o?         netl-o?         (3sg.obj.-stem-3sg.subj.)
we-pcen-o?      we-ntal-o?      (3pl.obj.-stem-3sg.subj.)
picna-n-o?      netle-n-o?      (3sg.obj.-stem-prog.-3sg.subj.)
p(i)c(e)n(a)    n(e)t(a)l(e)    stems

Figure 1: Tonkawa verb forms with V/∅ effects

Astonishing as it may be, even the latest computational models of PM phenomena apparently eschew the incorporation of real prosodic representations, syllabification and constraints. Kiraz (1996) uses multi-tape two-level morphology to analyze some Arabic data, but - despite the suggestive title - must simulate prosodic operations such as 'add a mora' by their extensionalized rule counterparts, which refer to C or V segments instead of moras. There is no on-line syllabification and the exclusive use of lexically prespecified syllable-like symbols on a separate templatic pattern tape renders his approach vulnerable to postlexical resyllabification effects. Similarly, Beesley (1996) seems content in employing a great number of CV templates in his large-scale finite-state model of Arabic morphology, which are intersected with lexical roots and then transformed to surface realizations by various epenthesis, deletion and assimilation rules. Beesley states that further application of his approach to e.g. Hebrew is foreseen. On the downside, however, again there is no real prosody in his model; the relationship between template form and prosody is not captured.

Optimality Theory (OT, Prince & Smolensky 1993), as applied to PM (McCarthy & Prince 1993), does claim to capture this relationship, using a ranked set of violable prosodic constraints together with global violation minimization. However, to date there exist no sufficiently formalized analyses of nontrivial PM fragments that could be turned into testable computational models. The OT framework itself has been shown to be expressible with weighted finite-state automata, weighted intersection and best-path algorithms (Ellison 1994) if constraints and OT's GEN component - the function from underlying forms to prosodified surface forms - are regular sets. A recent proposal by Karttunen (1998) dispenses with the weights while still relying on the same regularity assumption. Published PM analyses, however, frequently make use of constraint parametrizations from the ALIGN family, which requires greater than regular power (Ellison 1995). Further developments of OT such as correspondence theory - extensively used in much newer work on PM - have not received a formal analysis so far. Finally, although OT postulates that constraints are universal, this metaconstraint has been violated from the outset, e.g. in presenting Tagalog -um- as a language-specific parameter to ALIGN in Prince & Smolensky (1993). Due to the convincing presentation of a number of other forceful arguments against constraint universality in Ellison (to appear), the case for language-specific constraints must clearly be seen as reopened, and - as a corollary - the case for constraint inviolability as well.

Declarative Phonology (DP, Bird 1995, Scobbie 1991) is just such a constraint-based framework that dispenses with violability and requires a monostratal conception of phonological grammar, as compared to the multi-level approaches discussed above. Both abstract generalizations and concrete morphemes are expressed by constraints. DP requires analyses to be formally adequate, i.e. use a grammar description language with formal syntax and semantics.
As a consequence, Chomsky's criteria for a generative grammar which must be "perfectly explicit" and "not rely on the intelligence of the understanding reader" (Chomsky 1965, 4) are automatically fulfilled. DP thus appears to be a good starting point for a restrictive, surface-true theory of PM that is explicitly computational.

The rest of this paper reviews in informal terms the theory of Walther (1997) (section 2), showing in formal detail in section 3 how to implement a concrete analysis of Modern Hebrew verbs. Section 4 explains a novel approach to both generation and parsing of word forms under the new theory. The paper concludes in section 5.

2 Declarative Prosodic Morphology

Focussing on cases of 'nonconcatenative' root-and-pattern morphology, Declarative Prosodic Morphology (DPM) starts with an intuition that is opposite to what the traditional idea of templates or fixed phonological shapes (McCarthy 1979) suggests, namely that shape variance is actually quite common and should form the analytical basis for theoretical accounts of PM. Besides the Tonkawa case (fig. 1), shape variance is also at work in Modern Hebrew (MH) inflected verb forms (Glinert 1989), see fig. 2.¹ Here we see a systematic V/∅ alternation of both stem vowels, depending on the affixation pattern. This results in three stem shapes CVCVC, CVCC and CCVC. Any analysis that simply stipulates shape selection on the basis of specific inflectional categories or phonological context (e.g. 3sg.f ∨ 3pl or -V → CVCC stem / B1 past) misses the fact that the shapes, their alternating behaviour and their proper selection are derivable. Derivational repairs by means of 'doubly open syllable' syncope rules (/ga.ma.r-a./ → /.gam.ra./) are similarly ad hoc.

[Paradigm table; the only form surviving in this copy is ji-gmer-u]
Figure 2: Modern Hebrew √g.m.r 'finish' (B1)
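
The three stem shapes named above can be read off surface stems mechanically. The following is a minimal Python sketch, not part of the MicroCUF fragment described in this paper; the five-vowel inventory, the one-letter transcription and the helper name `cv_skeleton` are assumptions made purely for illustration.

```python
# Illustrative only: vowel inventory and function name are assumptions,
# not taken from the paper's implementation.
VOWELS = set("aeiou")

def cv_skeleton(form: str) -> str:
    """Map every segment of a transcribed form to C or V."""
    return "".join("V" if segment in VOWELS else "C" for segment in form)

# Stems of the root g.m.r as they surface in gamar, gamr-a and ji-gmer-u.
for stem in ("gamar", "gamr", "gmer"):
    print(stem, cv_skeleton(stem))
```

The sketch only classifies shapes that are already given; it does not select among them, which is precisely what the analysis developed next is meant to derive.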
A first step in developing an alternative DPM analysis of MH verbs is to explicitly recognize alternation of an element X with zero - informally written (X) - as a serious formal device besides its function as a piece of merely descriptive notation (cf. Hudson 1986 for an earlier application to Arabic). In contrast to nonmonotonic deletion or epenthesis, (X) is a surface-true declarative expression (Bird 1995, 93f.). The reader is reminded that DP sees grammar expressions as partial formal descriptions of sets of phonological objects. The former reside on a different ontological level from the latter, in contrast to traditional object-to-object transformations on the same level. Hence a preliminary grammar expression g(V1)m(V2)r for a Hebrew stem (with abstract stem vowels) denotes the set {gmr, gV1mr, gmV2r, gV1mV2r}. Note that the (X) property as attributed to segmental positions is distinctive - in contrast to stem vowels, root segments do not normally alternate with zero, and neither do affix segments, in an important asymmetry with stems. This point is reinforced by the exceptions that do exist: phonologically unpredictable C/∅ alternation occurs in some MH stems, e.g. natan/lakax 'he gave/took' vs. ji-ten/ji-kax 'he will give/take'; by surface-true (n/l) encoding we can avoid diacritical solutions here.

¹ Regular MH verbs are traditionally divided into seven verbal classes or binyanim, B1-B7. Except for B4 and B6, which regularly act as passive counterparts of B3 and B5, the semantic contribution of each class is no longer transparent in the modern language. Also, in many cases the root (written √C1.C2.C3) is restricted to an idiosyncratic subset of the binyanim. An a-templatic treatment of MH prosodic morphology was first proposed by Bat-El (1989, 40ff.) within an unformalized, non-surface-true, non-constraint-based setting.

Step two uses concatenation to combine individual descriptions of stems and affixes, besides connecting segmental positions within these linguistic entities. Since, as we have just seen, a single description can denote several objects of varying surface string length, concatenation ( ) at the description level is actually powerful enough to describe 'nonconcatenative' morphological phenomena. In DPM these do not receive independent ontological status (cf. Bird & Klein 1990 and Gafos 1995 for other formal and articulatory-phonological arguments leading to the same conclusion). A more detailed description of the 3pl.fut. inflected form of √g.m.r might therefore be j i g (V1) m (V2) r u. In order to allow for paradigmatic² generalizations over independent entities such as root and stem vowel pattern within concatenated descriptions, a hierarchical lexicon conception based on multiple inheritance of named abstractions can be used (cf. Riehemann 1993).

Step three conjoins a word form description with declarative syllabification and syllable structure constraints in order to impose prosodic wellformedness conditions.
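
The set-denoting reading of (X) descriptions from steps one and two can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the encoding of a description as a list of (segment, optional) pairs and the names `seg`, `opt` and `denotation` are hypothetical and do not reproduce the MicroCUF formalization discussed later in the paper.

```python
from itertools import product

# Hypothetical encoding (an assumption for this sketch): a description is a
# list of (segment, optional) pairs; optional=True plays the role of (X).
def seg(x):
    return (x, False)

def opt(x):
    return (x, True)

def denotation(description):
    """Return the set of surface strings that a description denotes."""
    # An optional position contributes either its segment or nothing.
    choices = [(s, "") if optional else (s,) for s, optional in description]
    return {"".join(parts) for parts in product(*choices)}

# The stem description g(V1)m(V2)r with abstract stem vowels.
stem = [seg("g"), opt("V1"), seg("m"), opt("V2"), seg("r")]
print(sorted(denotation(stem)))

# Concatenation at the description level is plain list concatenation:
# prefix j i, the stem, and suffix u for the 3pl.fut. form.
word = [seg("j"), seg("i")] + stem + [seg("u")]
print(sorted(denotation(word), key=len))
```

Because each optional position doubles the number of denoted strings, a description with two (X) positions denotes four candidates of different lengths, even though the descriptions themselves are combined by ordinary concatenation.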
For Modern Hebrew (and Tonkawa), the syllable canon is basically CV(C). Expressed in prosodic terms, complex codas and onsets are banned, while an onset must precede each syllable nucleus. These syllable roles are established in the first place by syllabification constraints that exploit local sonority differences between successive segments (Walther 1993). Altogether, the ensemble of prosodic constraints indeed succeeds in narrowing down the set for the 3sg.m past tense form to {*.gmr., *.gamr., *.gmar., .ga.mar.}, i.e. /gamar/. For 3pl. future tense B1, however, an unresolved ambiguity remains: in {.jig.me.ru., .ji.gam.ru.}, only the first element is grammatical.³ An important observation is that in general there can be no purely phonological constraint to disambiguate this type of situation. The reason lies in the existence of minimal pairs with different category. In our case, homophonous /.ji.gam.ru./ is grammatical as 3pl. fut. B2 'they will be finished'. We will return to the analysis of such cases after proposing a specific disambiguation mechanism in the next step.

Step four eliminates the remaining ambiguity by invoking an Incremental Optimization Principle (IOP): "For all (X) elements, prefer the zero alternant as early as possible". "Early" corresponds to traditional left-to-right directionality, but is meant to be understood w.r.t. the speech production time arrow. "As possible" means that IOP application to a (X) position nevertheless realizes X if its omission would lead to a constraint conflict. Hence, the IOP correctly rules out the second element of {.jig.me.ru., *.ji.gam.ru.}. This is because .ji.gam.ru. represents a missed chance to leave out /a/, the earlier one of the two stem vowels. The reader may verify that the IOP as it stands also accounts for the Tonkawa data of fig. 1. Tonkawa lends even clearer support to IOP's left-to-right nature due to the larger number of V/∅ vowels involved. As a limiting case, the IOP predicts the possibility of vowelless surface stems, e.g. formed by two root consonants combined with vowel-final prefix and suffix. This prediction is strikingly confirmed by MH forms like te-lx-i 'you (sg.f.) will go' √(h).l.x, ti-kn-u 'you/they (pl.) will buy' √k.n.∅, ti-tn-i 'you (sg.f.) will give' √(n).t.n; similar cases exist in Tigrinya. There can be no meaningful prosodic characterization of isolated CC stem shapes; only a word-form-based theory like the present one may explain why these forms exist.

² See Walther (1997) for a discussion of various ways to derive rather than stipulate the syntagmatic pattern of alternating and non-alternating segmental positions within stems.

³ Note that the prosodic view explains the pronounced influence of (C)V affixes on the shape of the whole word: they provide a nonalternating syllable nucleus which can host adjacent stem consonants.

Note that, conceptually, IOP is piggybacked on autonomous DP-style constraint interaction. It merely filters the small finite set of objects described by the conjunction of all constraints. From another angle, IOP can be seen as a single context-free substitute for the various syncope rules employed in former transformational analyses. The claim is that fixed-directionality-IOP is the only such mechanism needed to account for PM phenomena. A dist