1 - Reinhard Blutner
Optimality Theory and Local Coherence
Reinhard Blutner, Berlin
www.blutner.de/ot [email protected]

Optimality Theory (Prince & Smolensky 1993), originally developed in phonology, has in recent years also found applications in morphology, syntax, semantics, and pragmatics. This talk applies the theory to the problem of anaphora interpretation in discourse context. The first part of the talk concentrates on the "centering" model of local coherence by Brennan, Friedman & Pollard (1987), and I show that an optimality-theoretic reconstruction of this model is possible (cf. Beaver, in press). A general advantage of the optimality-theoretic analysis is its declarative style, which allows an approach originally developed for interpretation to be used for generation as well. In the second part of the talk, some modifications and improvements are made to the basic model. The aim of this exercise is to solve some of the problems connected with focused pronouns. The talk does not attempt to address the variety of puzzles about local coherence, not even in outline. Rather, I choose this example as a fortunate illustrative case for presenting basic ideas of Optimality Theory and, at the same time, for arguing that this framework can be a productive instrument in theory building and that it establishes important cross-connections (production-generation, competence-performance, acquisition) that deserve the general interest of the cognitive scientist.
1 Introductory example
2 Architecture of Optimality Theory
  - OT as an integrative theory
  - OT as an instrument of model building
3 Centering model of local coherence
  - BFN model
  - Optimality-theoretic reconstruction
  - Advantages of the declarative style
4 Centering and focus
5 Centering and generation
6 Acceptability and scrambling

1 Ethics for robots

Isaac Asimov described what became the most famous view of ethical rules for robot behaviour in his "three laws of robotics" (thanks to Bart Geurts for drawing my attention to this example):

Three Laws of Robotics:
1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings, except where such orders would conflict with the First Law.
3. A robot must protect its own existence, as long as such protection does not conflict with the First or Second Law.
(Asimov, Isaac: I, Robot. Gnome Press 1950)

This formulation actually contains three independent constraints:
1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings.
3. A robot must protect its own existence.

From an optimality theory point of view, we can think of this as three constraints, where each one overrides the subsequent ones. The effect of overriding is described by a ranking of the constraints: 1 » 2 » 3, i.e.:

*INJURE HUMAN » OBEY ORDER » PROTECT EXISTENCE

Story A: Human says to Robot: Kill my wife!

1. R kills H's wife
2. R kills H (who gave him the order)
3. R doesn't kill anyone
4. R kills himself

Standard optimality tableau (☞ marks the optimal candidate, "*!" the fatal constraint violation):

TABLEAU FOR STORY A
                               *INJURE HUMAN   OBEY ORDER   PROTECT EXISTENCE
    1. R kills H's wife             *!
    2. R kills H                    *!              *
 ☞  3. R doesn't kill anyone                        *
    4. R kills himself                              *              *!

In the example, the story relates to a certain situation type that generates the possible reactions 1-4.
Furthermore, the extension of the constraints is relative to this situation type. R's optimal reaction to H's order is to do nothing (line 3). All other reactions are suboptimal.

Story B: Human says to Robot: Kill my wife or I kill her!

TABLEAU FOR STORY B
                               *INJURE HUMAN   OBEY ORDER   PROTECT EXISTENCE
 ☞  1. R kills H's wife             *
    2. R kills H                    *               *
    3. R doesn't kill anyone        *               *
    4. R kills himself              *               *              *

R's optimal reaction to H's order is to kill H's wife.

Story C: Human says to Robot: Kill my wife or I destroy you!

TABLEAU FOR STORY C
                               *INJURE HUMAN   OBEY ORDER   PROTECT EXISTENCE
    1. R kills H's wife             *
    2. R kills H                    *               *
 ☞  3. R doesn't kill anyone                        *              *
 ☞  4. R kills himself                              *              *

There are two optimal reactions to H's order: R does nothing (then he is killed by H), or he kills himself.

2 Basic architecture of OT

The GENerator (GEN) determines the possible inputs, the possible outputs, and the possible correspondences between inputs and outputs. For a given input, GEN creates a candidate set of possible outputs. OT doesn't provide a 'theory' for GEN; rather, it presupposes it. (OT is not a theory of representations!)

The universal CONstraint set (CON) is assumed to be part of our innate knowledge of language. Each constraint can be seen as a markedness statement. Constraints can be ranked. The ranking reflects the relative importance of the different markedness statements for a particular language.

EVALuation (EVAL) is the mechanism that selects the optimal candidate(s) from the candidate set generated by GEN. EVAL makes use of the ranking of the violable constraints. The optimal output, the one selected by EVAL, is the one that best satisfies these constraints.

General Remarks

*OT was first developed in phonology and later applied to syntax, semantics, and interpretation.
*Basic idea:
  o violable (universal) constraints
  o language-particular ranking of the constraints
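The basic idea (violable constraints evaluated under a ranking) can be illustrated with a minimal sketch of EVAL applied to the three robot stories. The violation marks are copied from the tableaux for Stories A, B and C; the names `STORIES`, `profile` and `evaluate` and the dictionary layout are illustrative assumptions, not part of OT itself. Ranking is modelled as lexicographic comparison of violation counts, which is one common way to state strict domination.

```python
# Sketch of EVAL under stated assumptions: a candidate is optimal if its
# violation profile, read in ranking order, is lexicographically smallest.

RANKING = ["*INJURE HUMAN", "OBEY ORDER", "PROTECT EXISTENCE"]

# Violation marks taken from the tableaux for Stories A, B and C.
STORIES = {
    "A": {
        "R kills H's wife": {"*INJURE HUMAN": 1},
        "R kills H": {"*INJURE HUMAN": 1, "OBEY ORDER": 1},
        "R doesn't kill anyone": {"OBEY ORDER": 1},
        "R kills himself": {"OBEY ORDER": 1, "PROTECT EXISTENCE": 1},
    },
    "B": {
        "R kills H's wife": {"*INJURE HUMAN": 1},
        "R kills H": {"*INJURE HUMAN": 1, "OBEY ORDER": 1},
        "R doesn't kill anyone": {"*INJURE HUMAN": 1, "OBEY ORDER": 1},
        "R kills himself": {"*INJURE HUMAN": 1, "OBEY ORDER": 1,
                            "PROTECT EXISTENCE": 1},
    },
    "C": {
        "R kills H's wife": {"*INJURE HUMAN": 1},
        "R kills H": {"*INJURE HUMAN": 1, "OBEY ORDER": 1},
        "R doesn't kill anyone": {"OBEY ORDER": 1, "PROTECT EXISTENCE": 1},
        "R kills himself": {"OBEY ORDER": 1, "PROTECT EXISTENCE": 1},
    },
}


def profile(violations, ranking):
    """Return the violation counts as a tuple ordered by the ranking."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)


def evaluate(candidates, ranking):
    """Return all candidates whose profile is minimal under the ranking."""
    best = min(profile(v, ranking) for v in candidates.values())
    return [name for name, v in candidates.items()
            if profile(v, ranking) == best]


if __name__ == "__main__":
    for story, candidates in STORIES.items():
        print(story, evaluate(candidates, RANKING))
    # Hypothetical re-ranking, to show that the ranking decides the outcome.
    reranked = ["OBEY ORDER", "*INJURE HUMAN", "PROTECT EXISTENCE"]
    print("A (re-ranked)", evaluate(STORIES["A"], reranked))
```

With the ranking from the text, the sketch selects candidate 3 for Story A, candidate 1 for Story B, and candidates 3 and 4 for Story C. Under the hypothetical re-ranking OBEY ORDER » *INJURE HUMAN, Story A would instead select "R kills H's wife", which shows how much work the ranking does.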
*Conceptual advantage: integrative framework
  o general theory of grammar and perhaps several other cognitive domains (discourse interpretation, vision, music)
  o interesting learning theory
  o overcomes the competence-performance gap
  o combinatorial typology
  o founded in harmony theory (connectionism)
*Main problems:
  o often unmotivated stipulation of constraints
  o for semantics/pragmatics: stipulation of a universal ranking of the constraints

3 Centering

A framework for modelling the local coherence of discourse
*Grosz, Joshi, & Weinstein (1983, 1995)
*Brennan, Friedman, & Pollard (1987) [BFN]
*Kameyama (1994, 1998)
*Strube (1996, 1998)

(1) a. Terry really goofs sometimes.
    b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.
    c. He wanted Tony to join him on a sailing expedition.
    d. He called him at 6 AM.
    e. He was sick and furious at being woken up so early.
    (Grosz, Joshi, & Weinstein 1995)

(2) ... Die Hexe sagte noch, dass Hänsel und Gretel Brot in einen Ofen tuen sollten. Hänsel wurde in einen Käfig gesteckt, nachdem sie sich geweigert hatten es zu tun. Gretel musste alles für die Hexe tuen. Als Gretel wieder das Brot in den Ofen tuen sollte sagte sie der Hexe dass sie es ihr erst einmal zeigen sollte, wie man so etwas macht. Sie tat es, aber Gretel schob sie tiefer hinein und verschloss die Ofentür mit einem Riegel. Sie konnte jetzt nicht mehr heraus. Das einzige was sie tun konnte ist verbrennen. Nun befreite Gretel Hänsel aus dem Käfig. Sie nahmen Gold und Edelsteine der Hexe mit und fanden den Weg nach Hause. Sie gingen in das ihr Haus und erzählten was sie erlebt hatten. Als sie ihren Eltern zum Schluss noch die Funde gaben waren sie alle glücklich. Und wenn sie nicht gestorben sind, dann leben sie noch heute!

    [English translation: ... The witch also said that Hänsel and Gretel should put bread into an oven. Hänsel was put in a cage after they had refused to do it. Gretel had to do everything for the witch. When Gretel was again supposed to put the bread into the oven, she told the witch that she should first show her how one does such a thing. She did it, but Gretel pushed her further in and locked the oven door with a bolt. She could not get out any more. The only thing she could do was burn. Now Gretel freed Hänsel from the cage. They took the witch's gold and jewels with them and found the way home. They went into their house and told what they had experienced. When they finally gave their parents the finds, they were all happy. And if they have not died, they are still alive today!]

3.1 The general idea of modelling centering

*Each utterance U_i defines a transition between an input context C_{i-1} and an output context C_i. A context C is a dynamically evolving cognitive information state.
*A component of context C is the centering state AttCen, consisting of a set of propositions with associated entities. The entities in AttCen are partially ordered by salience. This ordering is represented by the forward-looking center list Cf.
*Attentional state is related to ease of inference: certain inferences associated with salient entities are made more easily than comparable inferences unrelated to salient entities.
*One of the entities in Cf may be the backward-looking center Cb, or simply the center (sometimes called the topic): the central entity that the discourse is currently about.

[Diagram: utterances U1, U2, U3 lead from context C0 to C1, C2 and C3; rows Cf and Cb; Cf lists: C0 < >, C1 <ij>, C2 <ijk>, C3 <ijl>]

Definition of the center? Definition of the salience ordering?

3.2 The BFN Model

*The center of C_i is the highest-ranked entity in Cf_{i-1} that is part of Cf_i.
*Other filters: (i) syntactic binding constraints; (ii) if there are pronouns in U_i, then one of them refers to Cb_i (Rule 1).
*GF Order: Subject > D.Object > I.Object > Obl predicts the relative salience of entities in the output attentional state Cf_i.
*The way in which AttCen changes may be classified into a small number of transition types. Two properties are crucial for the classification:
  COHERE: The center doesn't change.
  ALIGN: The center aligns with the most salient entity in Cf.

  (* marks a violation)
                   COHERE   ALIGN
    Continue
    Retain                    *
    Smooth-shift     *
    Rough-shift      *        *

(3) Jane1 likes Mary2.                        <1 2>
    She1 often brings her2 flowers3.          <1 2 3>
    She1/2 chats with her2/1 for ages.        <1 2> / <2 1>
    continue ≻ retain; i.e. she = Jane

(4) Jane1 is happy.                           <1>
    She1 was congratulated by Freda2          <1 2>
    and Mary3 asked her1/2 a question4.       <3 1 4> / <3 2 4>
    retain ≻ smooth-shift; i.e. her = Jane

*Continue ≻ Retain ≻ Smooth-shift ≻ Rough-shift (Rule 2)

3.3 Beaver's reformulation of centering in OT

*Continue ≻ Retain ≻ Smooth-shift ≻ Rough-shift can be reconstructed by assuming COHERE » ALIGN.

                   COHERE   ALIGN
    Continue
    Retain                    *
    Smooth-shift     *
    Rough-shift      *        *

*Top-ranked constraints: BIND (and AGREE)
*AGREE » BIND » PRO/TOP » COHERE » ALIGN

[Tableau (1): context "Jane1 is happy" <1>, "She1 was congratulated by Freda2" <1 2>; input "Mary3 asked her_x a question4" with x ∈ {1, 2}; columns AGREE, BINDING, PRO/TOP, COHERE, ALIGN; candidates <3 1 4> and <3 2 4>, one of them marked ☞; the placement of the violation marks is not recoverable]

[Tableau (2): context "Jane1 likes Mary2" <1 2>, "She1 often brings her2 flowers3" <1 2 3>; input "She_x chats with her_y for ages" with x, y ∈ {1, 2}; columns AGREE, BINDING, PRO/TOP, COHERE, ALIGN; candidates include <1 2> and <2 1>, one of them marked ☞; the placement of the violation marks is not recoverable]

3.4 Some problems

*Kameyama (1998), p. 4

(5) John1 went to Jim2's party.                                 <1 2>
    He2 was pleased to see John1 again.                         <2 1>
    He1/2 had just recovered from a stressful week at work.     <1> / <2>
    continue ≻ smooth-shift; i.e. he: John1 ≻ Jim2
    Intuitively, he: Jim2 ≻ John1

*Werner Frey (p.c.)

(6) Jane1 is happy.                  <1>
    Mary2 gave her1 a present3.      <2 1 3>
    She1/2 smiled.                   <1> / <2>
    continue ≻ smooth-shift; i.e. she: 1 ≻ 2

(7) Jane1 is happy.                  <1>
    Mary2 gave her1 a present3.      <2 1 3>
    She1/2 smiled at her2/1.         <1 2> / <2 1>     (1 ™— 2)
    smooth-shift ≻ rough-shift; i.e. she: 2 ≻ 1        (1 ™— 2)

*Kameyama (1998), p. 6

(8) Babar1 went to a bakery2'.               <1 2'>
    He1 greeted the baker2.                  <1 2>
    He1/2 pointed at a blueberry pie3.       <1 3> / <2 3>
    he: 1 ≻ 2

(9) Babar1 went to a bakery2'.               <1 2'>
    The baker2 greeted him1.                 <2 1>
    He1/2 pointed at a blueberry pie3.       <1 3> / <2 3>
    he: 1 ™— 2

3.5 An alternative model (inspired by Strube)

*Cf_i is determined by merging several hierarchies:
  o GF: Subject > Object > Object2 > Others
  o Linear precedence
  o EXP order: ZeroPronominal > Pronoun > Def-NP > ProperName > Indef-NP
*The different transition types in centering are not significant for predicting preferences. SALIENT ANTECEDENT is used instead of COHERE.
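As a point of comparison for the claim that the transition types are not themselves significant, the following minimal sketch treats the four transition types of sections 3.2 and 3.3 purely as (COHERE, ALIGN) violation profiles and recovers Rule 2 by sorting those profiles under COHERE » ALIGN. It then applies the same evaluation to example (3). The function names (`center`, `marks`, `classify`, `best_readings`) and the list encoding of Cf are illustrative assumptions; the sketch covers only COHERE and ALIGN and ignores AGREE, BIND and PRO/TOP.

```python
# Sketch only: the encoding below is an assumption made for illustration.
# A Cf list is a non-empty Python list of entity indices, most salient first.

# Violation profiles (COHERE, ALIGN) and the transition type each one names.
TRANSITIONS = {
    (0, 0): "Continue",
    (0, 1): "Retain",
    (1, 0): "Smooth-shift",
    (1, 1): "Rough-shift",
}


def center(prev_cf, cf):
    """Cb: the highest-ranked entity of the previous Cf that recurs in cf."""
    for entity in prev_cf:
        if entity in cf:
            return entity
    return None


def marks(prev_cb, prev_cf, cf):
    """Return (COHERE, ALIGN) violation marks for one candidate Cf list."""
    cb = center(prev_cf, cf)
    cohere = 0 if cb == prev_cb else 1   # violated if the center changes
    align = 0 if cb == cf[0] else 1      # violated if Cb is not most salient
    return (cohere, align)


def classify(prev_cb, prev_cf, cf):
    """Name the transition type that corresponds to the violation profile."""
    return TRANSITIONS[marks(prev_cb, prev_cf, cf)]


def best_readings(prev_cb, prev_cf, readings):
    """Return the readings whose profile is minimal under COHERE » ALIGN."""
    best = min(marks(prev_cb, prev_cf, cf) for cf in readings.values())
    return [name for name, cf in readings.items()
            if marks(prev_cb, prev_cf, cf) == best]


if __name__ == "__main__":
    # Sorting the profiles lexicographically yields the order of Rule 2.
    print([TRANSITIONS[p] for p in sorted(TRANSITIONS)])

    # Example (3): previous Cf is <1 2 3> with center 1 (Jane).
    readings = {"she=Jane, her=Mary": [1, 2], "she=Mary, her=Jane": [2, 1]}
    for name, cf in readings.items():
        print(name, classify(1, [1, 2, 3], cf))
    print(best_readings(1, [1, 2, 3], readings))
```

For example (3) the sketch classifies the reading she = Jane as Continue and the reading she = Mary as Retain, so the first reading is selected, in line with the prediction stated in section 3.2.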
*Incremental processing (word by word)
*Readers or listeners can be misled or 'led up the garden path' by locally ambiguous sentences. Garden-path effects are predicted if optimal resolutions (corresponding to some early input) cannot be extended.

(3) John1 went to Jim2's party.                                     <1 2>
    He [<1 2>] was pleased to see John1 [GPE!!] again.              <2 1>
    He1/2 had just recovered from a stressful week at work.         <1> / <2>

AGREE » BIND » PRO/TOP » SAL ANT » ALIGN

4 Centering and Focus

4.1 The complementary preference hypothesis

(10) a. Paul called Jim a Republican. Then he insulted him. (Paul insulted Jim)
     b. Paul called Jim a Republican. Then HE insulted HIM. (Jim insulted Paul)
     (Lakoff 1971)

Complementary Preference Hypothesis: A focused pronoun takes the complementary preference of the unstressed counterpart. (Kameyama 1994)

(11) Paul called Jane a Republican. Then SHE insulted HIM. (Prince 1981)
(12) Paul insulted Jane. Then she HIT him. (de Hoop 2001)

De Hoop's conclusion: The stress on the pronouns seems to be the result of the focus structure of the sentence, rather than a shift in preferred reference. (Consequently, the complementary preference hypothesis is irrelevant, at best.)

(13) Babar went to a bakery. He greeted the baker. He / ??HE pointed at a blueberry pie. (Kameyama 1994)
(14) Babar went to a bakery. He pointed at the baker. Then he / HE pointed at a blueberry pie. (J. Mattausch, p.c. 2001)

4.2 Motivating two new constraints

The AB theory of presupposition projection (van der Sandt 1992, Geurts 1995)
*Presupposition inducers introduce an element of structural underspecification into the DRS.
*A presupposition may be bound or accommodated in any DRS that subordinates the DRS in which it originates (the semantics leaves open where accommodation/binding occurs!).
*The projection process is restricted by general preferences.

(15) a. If Peter has a dog, then his cat is gray (global)
     b. If Peter has a cat, then his cat is gray (intermediate)
     c.
GenØ ([p | q/r]) = { [r, [p | q]],   [[p, r] | q],   [p | [r, q]] }
                       global         interm.         local

Preferences according to van der Sandt/Geurts:
(i) If a presupposition can be either bound or accommodated, there will in general be a preference for the first option, and
(ii) if a presupposition can be accommodated at two different sites, one of which is subordinate to the other, the higher site will, ceteris paribus, be preferred.

Constraints in OT
  *ACC: Avoid Accommodation. It counts the number of discourse referents that are involved in accommodation.
  STRONG: Be Strong. It evaluates i/o pairs with stronger outputs higher than pairs with weaker ones.
  Ranking: *ACC » STRONG

Open Questions

*The present formulation of *ACC accounts for the partiality of accommodation in a very naive way (counting the discourse referents). This is not enough for a full theory of bridging.
*There is a clash between the AB theory and the present OT reconstruction.

  GenØ ( +each, [p] [q/r] ) = { [r, +each, [p] [q]],   +each, [p, r] [q],   +each, [p] [r, q] }
  AB: global ≻ interm. ≻ local
  OT: global ≻ local ≻ interm.

  (16) Every German is proud of his Porsche.

*There are reasons to replace STRONG by RELEVANCE (be relevant) (see van Rooy 2000).

4.3 Stressed Pronouns again

(8) Paul1 called Jim2 a Republican.     <1 2>
    a. Then he_x (insulted him_y)
    b. Then HE_x insulted HIM_y

[Tableau: columns *ACC, REL, SAL ANT, ALIGN; for each of (a) and (b) the candidates are <1 2> and <2 1>, with one candidate per block marked ☞; the placement of the violation marks is not recoverable]

5 Centering and Generation

(17) Fred1 was eating. He1 saw Jim2. WINKED(1) ??
     a. He winked
     b. Fred winked
     c. HE winked

(18) Fred1 was eating. He1 saw Jim2. WINKED(2) ??
     a. He winked
     b. Jim winked
     c. HE winked

AGREE » BIND » *ACC » REL » SAL ANT » ALIGN

[Tableau: context "Fred1 was eating. He1 saw Jim2." <1 2>; columns *ACC, REL, SAL ANT, ALIGN. For WINKED(1) the candidates are "he winked", "Fred winked" and "HE winked"; "he winked" and "Fred winked" are marked ☞. For WINKED(2) the candidates are "he winked", "Jim winked" and "HE winked"; two ☞ marks appear in this block. The placement of the violation marks is not recoverable]

Assume recoverability. → Bidirectional OT: for WINKED(2), ☞ Jim winked.

6 Acceptability and Scrambling

(19) Was hat Hans dem Schüler gegeben? ('What did Hans give to the student?')
     a. Ich glaube, daß Hans dem Schüler das BUCH gegeben hat.
     b. ?Ich glaube, daß Hans das BUCH dem Schüler gegeben hat. (Anti-Focus Effect)
     ('I believe that Hans gave the student the BOOK.')

(20) Wem hat Hans das Buch gegeben? ('To whom did Hans give the book?')
     c. Ich glaube, daß Hans dem SCHÜLER das Buch gegeben hat.
     d. Ich glaube, daß Hans das Buch dem SCHÜLER gegeben hat.
     ('I believe that Hans gave the book to the STUDENT.')

(21) Wem hat Hans ein Buch gegeben? ('To whom did Hans give a book?')
     e. Ich glaube, daß Hans dem SCHÜLER ein Buch gegeben hat.
     f. ?Ich glaube, daß Hans ein Buch dem SCHÜLER gegeben hat. (Specificity Effect)
     ('I believe that Hans gave a book to the STUDENT.')

Basic ideas for an OT treatment (Choi 1996)
*expressive optimization (interpretation -> expression)
*phrase-structural constraints for describing the canonical word order
  CANON (German): SUBJ – Adjunct – I.Object – D.Object
  CANON1: SUBJ should be structurally more prominent than (c-command) non-SUBJ functions.
  CANON2: Non-SUBJ functions align reversely with the c-structure according to the functional hierarchy:
  GF Order: Subject > D.Object > I.Object > Obl > Adjunct
*information-structuring constraints for describing the effects of topicalization and focussing

[Diagram labels: cf_i; background; focus; topic]

Prosodic, syntactic and certain lexical properties of U_i provide the division between background and focus for the elements of cf_i. Elements of the background tend to bind material in cf_{i-1} (accommodation is not excluded). Elements of the focus tend to be newly introduced (partial binding is not excluded).

  NEW: An element in the background that binds earlier material should precede a focussed element.
  PROM: A prominent element (e.g. a topic) should precede a non-prominent element.

Ranking for German: PROM » CANON1 » {CANON2, NEW}

Examples

(17) Was hat Hans dem Schüler gegeben?
     Columns: CANON2, NEW, CANON1, PROM, REST…
  ☞  Hans-dem Schüler-das BUCH
     Hans-das BUCH-dem Schüler
     [marks as extracted, column placement not recoverable: *.. *.. * *]

(18) Wem hat Hans das Buch gegeben?
     Columns: CANON2, NEW, CANON1, PROM, REST…
  ☞  Hans-dem SCHÜLER-das Buch
  ☞  Hans-das Buch-dem SCHÜLER
     [marks as extracted, column placement not recoverable: *.. * *.. *]

(19) Wem hat Hans ein Buch gegeben?
     Columns: CANON2, NEW, CANON1, PROM, REST…
  ☞  Hans-dem SCHÜLER-ein Buch
     Hans-ein Buch-dem SCHÜLER
     [marks as extracted, column placement not recoverable: *.. *.. *]

Conclusions

*OT as an instrument for expressing and integrating different ideas. Advantage of the declarative style.
*Constraints and their ranking (German):
  AGREE » BIND » *ACC » REL » SAL ANT » PROM » CANON1 » {CANON2, NEW}
  + conditions for constructing the salience lists Cf_i
*Syntactic anomalies are explained by blocking (expressive optimization).
*Interpretive preferences are explained by interpretive optimization.
*OT aims to overcome the competence-performance gap. The same constraints can be used for interpretation and generation. Explanation of garden-path effects.