CATEGORIAL GRAMMAR MARK STEEDMAN


University of Edinburgh

ABSTRACT

Categorial Grammar comprises a family of lexicalized theories of grammar characterized by very tight coupling of syntactic derivation and semantic composition, having their origin in the work of Frege. Some versions of CG have extremely restricted expressive power, corresponding to the smallest known natural family of formal languages that properly includes the context-free. Nevertheless, they are also strongly adequate to the capture of a wide range of cross-linguistically attested non-context-free constructions. For these reasons, categorial grammars have been quite widely applied, not only to linguistic analysis of challenging phenomena such as coordination and unbounded dependency, but to computational linguistics and psycholinguistic modeling.

1. INTRODUCTION. Categorial Grammar (CG) is a “strictly” lexicalized theory of natural language grammar, in which the linear order of constituents and their interpretation in the sentences of a language are entirely defined by the lexical entries for the words that compose them, while a language-independent universal set of rules projects the lexicon onto the strings and corresponding meanings of the language. Many of the key features of Categorial Grammar have over the years been assimilated by other theoretical syntactic frameworks. In particular, there are recent signs of convergence from the Minimalist Program within the transformational generative tradition (Chomsky 1995; Berwick and Epstein 1995; Cormack and Smith 2005; Boeckx 2008:250).

Categorial grammars are widely used in various slightly different forms discussed below by linguists interested in the relation between semantics and syntactic derivation. Among them are computational linguists who for reasons of efficiency in practical applications wish to keep that coupling as simple and direct as possible.
Categorial grammars have been applied to the syntactic and semantic analysis of a wide variety of constructions, including those involving unbounded dependencies, in a wide variety of languages (e.g. Moortgat 1988b; Steele 1990; Whitelock 1991; Morrill and Solias 1993; Hoffman 1995; Nishida 1996; Kang 1995, 2002; Bozşahin 1998, 2002; Komagata 1999; Baldridge 1998, 2002; Trechsel 2000; Cha and Lee 2000; Park and Cho 2000; Çakıcı 2005, 2009; Ruangrajitpakorn et al. 2009; Bittner 2011, 2014; Kubota 2010; Lee and Tonhauser 2010; Bekki 2010; Tse and Curran 2010).

Categorial grammar is generally regarded as having its origin in Frege’s remarkable 1879 Begriffsschrift, which proposed and formalized the language that we now know as first-order predicate logic (FOPL) as a Leibnizian calculus in terms of the combination of functions and arguments, thereby laying the foundations of all modern logics and programming languages, and opening up the possibility that natural language grammar could be thought of in the same way. This possibility was investigated in its syntactic and computational aspect for small fragments of natural language by Ajdukiewicz (1935) (who provided the basis for the modern notations), Bar-Hillel (1953) and Bar-Hillel et al. (1964) (who gave categorial grammar its name), and Lambek (1958) (who initiated the type-logical interpretation of CG).

DRAFT 1.0, FEBRUARY 9, 2014

It was soon recognized that these original categorial grammars were context-free (Lyons 1968), and therefore unlikely to be adequately expressive for natural languages (Chomsky 1957), because of the existence of unbounded or otherwise “long range” syntactic and semantic dependencies between elements such as those italicized in the following examples:¹

(1) a. These are the songs they say that the Syrens sang.
    b. The Syrens sang and say that they wrote these songs.
    c. Some Syren said that she had written each song. (∃∀/∀∃)
    d. Every Syren thinks that the sailors heard her.

Frege’s challenge was taken up in categorial terms by Geach (1970) (initiating the combinatory generalization of CG), and Montague (1970b) (initiating direct compositionality, both discussed below). In particular, Montague (1973) influentially developed the first substantial categorial fragment of English combining syntactic analysis (using a version of Ajdukiewicz’ notation) with semantic composition in the tradition of Frege (using Church’s λ-calculus as a “glue language” to formalize the compositional process).

In the latter paper, Montague used a non-monotonic operation expressed in terms of structural change to accommodate long range dependencies involved in quantifier scope alternation and pronoun-binding, illustrated in (1c,d). However, in 1970b he had laid out a more ambitious program, according to which the relation between syntax and semantics in all natural languages would be strictly homomorphic, like the syntax and semantics in the model theory for a mathematical, logical, or programming language, in the spirit of Frege’s original program.

For example, the standard model theory for the language of first-order predicate logic (FOPL) has a small context-free set of syntactic rules, recursively defining the structure of negated, conjunctive, quantified, etc. clauses in terms of operators ¬, ∧, ∀x etc. and their arguments.
The semantic component then consists of a set of rules paired one-to-one with the syntactic rules, compositionally defining truth of an expression of that syntactic type solely in terms of truth of the arguments of the operator in question (see Robinson 1974).

Two observations are in order when seeking to generalize such Fregean systems as FOPL to human language. One is that the mechanism whereby an operator such as ∀x “binds” a variable x in a term of the form ∀x[P] is not usually considered part of the syntax of the logic. If it is treated syntactically, as has on occasion been proposed for programming languages (Aho 1968), then the syntax is in general no longer context-free.²

The second observation is that the syntactic structures of FOPL can be thought of in two distinct ways. One is as the syntax of the logic itself, and the other is as a derivational structure describing a process by which an interpretation has been constructed. The most obvious context-free derivational structures are isomorphic to the logical syntax, such as those which apply its rules directly to the analysis of the string, either bottom-up or top-down. However, even for a context-free grammar, derivation structure may be determined by a different “covering” syntax, such as a “normal form” grammar. (Such covering grammars are sometimes used for compiling programming languages, for reasons such as memory efficiency.) Such covering derivations are irrelevant to interpretation, and do not count as a representational level of the language itself. In considering different notions of structure involved in theories of natural language grammar, it is important to be clear whether one is talking about logical syntax or derivational structure.

¹ These constructions in English were shown by Gazdar (1981) to be coverable with only context-free resources in Generalized Phrase Structure Grammar (GPSG), whose “slash” notation for capturing such dependencies is derived from but not equivalent to the categorial notation developed below. However, Huybregts (1984) and Shieber (1985) proved Chomsky’s widely accepted conjecture that in general such dependencies require greater than CF expressive power.
² This observation might be relevant to the analysis of “bound variable” pronouns like that in (1d).

Recent work in categorial grammar has built on the Fregeo-Montagovian foundation in two distinct directions, neither of which is entirely true to its origins. One group of researchers has made its main priority capturing the semantics of diverse constructions in natural languages using standard logics, often replacing Montague’s structurally non-monotone “quantifying in” operation by more obviously compositional rules or memory storage devices. Its members have tended to either remain agnostic as to the syntactic operations involved or assume some linguistically-endorsed syntactic theory such as transformational grammar or GPSG (e.g. Partee 1975; Cooper 1983; Szabolcsi 1997; Jacobson 1999; Heim and Kratzer 1998), sometimes using extended notions of scope within otherwise standard logics (e.g. Kamp and Reyle 1993; Groenendijk and Stokhof 1991; Ciardelli and Roelofsen 2011), or tolerating a certain increase in complexity in the form of otherwise syntactically or semantically unmotivated surface-compositional syntactic operators or type-changing rules on the syntactic side (e.g. Bach 1979; Dowty 1982; Hoeksema and Janda 1988; Jacobson 1992; Hendriks 1993; Barker 2002) and the Lambek tradition (e.g. Lambek 1958, 2001; van Benthem 1983, 1986; Moortgat 1988a; Oehrle 1988; Morrill 1994; Carpenter 1995; Bernardi 2002; Casadio 2001; Moot 2002; Grefenstette et al.
2011).

Other post-Montagovian approaches have sought to reduce syntactic complexity, at the expense of expelling some apparently semantic phenomena from the logical language entirely, particularly quantifier scope alternation and pronominal binding, relegating them to offline specification of scopally underspecified logical forms (e.g. Kempson and Cormack 1981; Reyle 1993; Poesio 1995; Koller and Thater 2006; Pollard 1984), or extragrammatical discourse reference (e.g. Webber 1978; Bosch 1983).

One reason for this diversity and divergence within the broad church of categorial grammar is that the long-range and/or unbounded dependencies exemplified in (1) above, which provide the central challenge for any theory of grammar and for the Fregean approach in particular, fall into three distinct groups. Relativization, topicalization, and right node-raising are clearly unbounded and clearly syntactic, being subject to strong island constraints, such as the “fixed subject constraint”, as in (2a).

(2) a. #This is the Syren they wonder whether sang a song.
    b. Some Syren claimed that each song was the best. (∃∀/#∀∃)
    c. Every Syren claimed that some song was the best. (∀∃/∃∀)
    d. Every Syren thinks that her song is the best.

On the other hand, the binding of pronouns and other nominals as dependents of quantifiers is equally clearly completely insensitive to islands, as in (2d), while quantifier scope inversion is a mixed phenomenon, with the universals every and each apparently unable to invert scope out of embedded subject positions, as in (2b), while the existentials can do so (2c).

The linguistic literature in general is conflicted on the precise details of what species of dependency and scope is allowed where. However, there is general agreement that while syntactic long-range dependencies are mostly nested, and the occasions when crossing dependencies are allowed are very narrowly specified syntactically, intrasentential binding of pronouns and dependent existentials is essentially free within the

scope of the operator. For example, crossing and nesting binding dependencies in the following seem equally good:

(3) Every sailor_i knows that every Syren_j thinks she_j/he_i saw him_i/her_j.

It follows that those researchers whose primary concern is with pronoun binding in semantics tend to define their Fregean theory of grammar in terms of different sets of combinatory operators from those researchers whose primary concern is syntactic dependency. Thus, not all categorial theories discussed below are commensurable.

2. PURE CATEGORIAL GRAMMARS. In all varieties of Categorial Grammar, elements like verbs are associated with a syntactic “category” which identifies them as Fregean functions, and specifies the type and directionality of their arguments and the type of their result. We here use the “result leftmost” notation in which a rightward-combining functor over a domain β into a range α is written α/β, while the corresponding leftward-combining functor is written α\β.³

α and β may themselves be function categories. For example, a transitive verb is a function from (object) NPs into predicates, that is, into functions from (subject) NPs into S:

(4) likes : (S\NP)/NP

All varieties of categorial grammar also include the following rules for combining forward- and backward-looking functions with their arguments:

(5) Forward Application: (>)
    X/Y  Y  ⇒  X

(6) Backward Application: (<)
    Y  X\Y  ⇒  X

These rules have the form of very general binary phrase-structure rule schemata. In fact, pure categorial grammar is just context-free grammar written in the accepting, rather than the producing, direction, with a consequent transfer of the major burden of specifying particular grammars from the PS rules to the lexicon. While it is now convenient to write derivations as in a, below, they are equivalent to conventional phrase structure derivations b:

(7) a.  Mary      likes      bureaucracy
         NP     (S\NP)/NP        NP
                ------------------------ >
                         S\NP
        -------------------------------- <
                        S

    b.  Mary      likes      bureaucracy
         NP         V            NP
                    \____________/
                          VP
          \_______________/
                  S

It is important to note that such tree-structures are simply a representation of the process of derivation. They do not necessarily constitute a level of representation in the formal grammar.

CG categories can be regarded as encoding the semantic type of their translation, and this translation can be made explicit in the following expanded notation, which associates a logical form with the entire syntactic category, via the colon operator, which is assumed to have lower precedence than the categorial slash operators. (Agreement

³ There is an alternative “result on top” notation due to Lambek (1958), according to which the latter category is written β\α. Lambek’s notation has advantages of readability in the context-free case, because all application is adjacent cancellation. However, this advantage does not hold for trans-context-free theories which include non-Lambek operators such as crossed composition. For such grammars, and for any analysis in which the semantics has to be kept track of, the Lambek notation is confusing, because it does not assign a consistent left-right position to the result α vs. the argument β.
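As a rough illustration of how Forward Application (5) and Backward Application (6) project a lexicon onto strings, the following Python sketch recognizes example (7) with a pure categorial grammar. It is not taken from the paper: the names `Atom`, `Functor`, `forward_apply`, `backward_apply`, `LEXICON`, and `parse` are all hypothetical, and the CKY-style chart is just one convenient way of trying every binary bracketing.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """An atomic category such as S or NP."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Functor:
    """A function category in "result leftmost" notation.

    slash "/" seeks its argument to the right, "\\" to the left.
    """
    result: "Category"
    slash: str
    arg: "Category"

    def __str__(self) -> str:
        return f"({self.result}{self.slash}{self.arg})"


Category = Union[Atom, Functor]


def forward_apply(left: Category, right: Category) -> Optional[Category]:
    """Rule (5):  X/Y  Y  =>  X"""
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    return None


def backward_apply(left: Category, right: Category) -> Optional[Category]:
    """Rule (6):  Y  X\\Y  =>  X"""
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


S, NP = Atom("S"), Atom("NP")

# Toy lexicon for example (7); one category per word is an assumption.
LEXICON = {
    "Mary": NP,
    "bureaucracy": NP,
    "likes": Functor(Functor(S, "\\", NP), "/", NP),  # (S\NP)/NP as in (4)
}


def parse(words):
    """Return the set of categories derivable for the whole word string."""
    n = len(words)
    chart = {(i, i + 1): {LEXICON[w]} for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            cell = set()
            for j in range(i + 1, k):  # every split point of the span
                for left in chart[(i, j)]:
                    for right in chart[(j, k)]:
                        for rule in (forward_apply, backward_apply):
                            out = rule(left, right)
                            if out is not None:
                                cell.add(out)
            chart[(i, k)] = cell
    return chart[(0, n)]
```

Calling `parse("Mary likes bureaucracy".split())` combines the verb with its object by (5) and the resulting predicate with the subject by (6). The only language-specific information lives in `LEXICON`, while the two rule functions stay fixed.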

features are also included