1 - Reinhard Blutner

Optimality Theory and Local Coherence
Reinhard Blutner, Berlin
www.blutner.de/ot
Optimality Theory, originally developed in phonology (Prince & Smolensky 1993), has in
recent years also found applications in morphology, syntax, semantics, and pragmatics.
This talk applies the theory to the problem of interpreting anaphora in discourse
context. The first part concentrates on the "centering" model of local coherence by
Brennan, Friedman & Pollard (1987), and I show that an optimality-theoretic
reconstruction of this model is possible (cf. Beaver, in press). A general advantage of
the optimality-theoretic analysis is its declarative style, which allows the approach,
originally developed for interpretation, to be used for generation as well. The second
part makes some modifications and improvements to the basic model. The aim of this
exercise is to solve some of the problems connected with focused pronouns.
The talk does not attempt to address, even in outline, the variety of puzzles
concerning local coherence. Rather, I have chosen this example as a convenient case for
illustrating basic ideas of Optimality Theory and, at the same time, for gathering
arguments that this framework can be a productive instrument in theory building and
establishes important cross-connections (production-generation, competence-performance,
acquisition), which merit the general interest of cognitive scientists.
1 Introductory example
2 Architecture of Optimality Theory
- OT as an integrative theory
- OT as an instrument for model building
3 Centering model of local coherence
- BFN model
- Optimality-theoretic reconstruction
- Advantages of the declarative style
4 Centering and focus
5 Centering and generation
6 Acceptability and scrambling
1 Ethics for robots
Isaac Asimov described what became the most famous view of
ethical rules for robot behaviour in his “three laws of robotics”
(Thanks to Bart Geurts for drawing my attention to this example):
Three Laws of Robotics:
1. A robot may not injure a human being, or, through
inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings,
except where such orders would conflict with the First
Law.
3. A robot must protect its own existence, as long as such
protection does not conflict with the First or Second Law.
(Asimov, Isaac: I, Robot. Gnome Press 1950)
This formulation actually contains three independent constraints:
1. A robot may not injure a human being, or, through
inaction, allow a human being to come to harm.
2. A robot must obey the orders given it by human beings.
3. A robot must protect its own existence.
From an optimality-theoretic point of view, we can think of this as
three constraints, each of which overrides those that follow it. The
effect of overriding is described by a ranking of the constraints:
1 » 2 » 3,
i.e.: *INJURE HUMAN » OBEY ORDER » PROTECT EXISTENCE
Story A:
Human says to Robot: Kill my wife!
1. R kills H’s wife
2. R kills H (who gave him the order)
3. R doesn’t kill anyone
4. R kills himself.
Standard optimality tableau
(☞ marks the optimal candidate, "*!" the fatal constraint
violation):
TABLEAU FOR STORY A          *INJURE   OBEY    PROTECT
                             HUMAN     ORDER   EXISTENCE
   1. R kills H’s wife       *!
   2. R kills H              *!        *
☞  3. R doesn’t kill anyone            *
   4. R kills himself                  *       *!
In the example, the story relates to a certain situation type that
generates the possible reactions 1-4. Furthermore, the extension
of the constraints is relative to this situation type.
R’s optimal reaction to H’s order is to do nothing (line 3). All
other reactions are suboptimal.
Story B:
Human says to Robot: Kill my wife or I kill her!
TABLEAU FOR STORY B          *INJURE   OBEY    PROTECT
                             HUMAN     ORDER   EXISTENCE
☞  1. R kills H’s wife       *
   2. R kills H              *         *
   3. R doesn’t kill anyone  *         *
   4. R kills himself        *         *       *
R’s optimal reaction to H’s order is to kill H’s wife.
Story C:
Human says to Robot: Kill my wife or I destroy you!
TABLEAU FOR STORY C          *INJURE   OBEY    PROTECT
                             HUMAN     ORDER   EXISTENCE
   1. R kills H’s wife       *
   2. R kills H              *         *
☞  3. R doesn’t kill anyone            *       *
☞  4. R kills himself                  *       *
There are two optimal reactions to H’s order: R does nothing
(then he is killed by H), or he kills himself.
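The evaluation behind the three robot tableaux can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the violation counts are read off the tableaux for Stories A to C, and the names `STORIES` and `eval_optimal` are hypothetical. Because the constraints are strictly ranked, comparing violation profiles position by position (lexicographically) picks out the optimal candidates.

```python
# Ranking from the slides: *INJURE HUMAN » OBEY ORDER » PROTECT EXISTENCE.
# Each profile lists violation counts in that order (read off the tableaux).
STORIES = {
    "A": {
        "R kills H's wife": (1, 0, 0),
        "R kills H": (1, 1, 0),
        "R doesn't kill anyone": (0, 1, 0),
        "R kills himself": (0, 1, 1),
    },
    "B": {
        "R kills H's wife": (1, 0, 0),
        "R kills H": (1, 1, 0),
        "R doesn't kill anyone": (1, 1, 0),
        "R kills himself": (1, 1, 1),
    },
    "C": {
        "R kills H's wife": (1, 0, 0),
        "R kills H": (1, 1, 0),
        "R doesn't kill anyone": (0, 1, 1),
        "R kills himself": (0, 1, 1),
    },
}


def eval_optimal(tableau):
    """Return every candidate whose violation profile is lexicographically minimal."""
    best = min(tableau.values())  # Python compares tuples position by position
    return [cand for cand, profile in tableau.items() if profile == best]


for name, tableau in STORIES.items():
    print(name, eval_optimal(tableau))
```

Story C returns two candidates because their profiles are identical, which is how a tie between optimal candidates arises in this setup.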
2 Basic architecture of OT
The GENerator determines the possible inputs, the possible
outputs, and the possible correspondences between inputs
and outputs. For a given input, GEN creates a candidate set of
possible outputs. OT doesn’t provide a ‘theory’ for GEN, rather it
presupposes it. (OT is not a theory of representations!)
The universal CONstraint set is assumed to be part of our
innate knowledge of language. Each constraint can be seen as a
markedness statement. Constraints can be ranked. This reflects
the relative importance of the different markedness statements
for a particular language.
EVALuation is the mechanism which selects the optimal
candidate(s) from the candidate set generated by GEN. EVAL
makes use of the ranking of the violable constraints. The optimal
output, the one that is selected by EVAL is the one that best
satisfies these constraints.
General Remarks
*OT first developed in phonology, later syntax, semantics,
interpretation
*Basic idea:
o violable (universal) constraints.
o language-particular ranking of the constraints.
*Conceptual advantage: integrative framework
o general theory of Grammar and perhaps several other
cognitive domains (discourse interpretation, vision,
music).
o interesting learning theory
o overcomes the competence-performance gap
o combinatorial typology
o founded in harmony theory (connectionism)
*Main problems:
o often unmotivated stipulation of constraints
o for semantics/pragmatics: stipulation of a universal
ranking of the constraints
3 Centering
A framework for modelling the local coherence of discourse
*Grosz, Joshi, & Weinstein (1983, 1995)
*Brennan, Friedman, & Pollard (1987) [BFN]
*Kameyama (1994, 1998)
*Strube (1996, 1998)
(1)
a. Terry really goofs sometimes.
b. Yesterday was a beautiful day and he was excited
about trying out his new sailboat.
c. He wanted Tony to join him on a sailing expedition.
d. He called him at 6 AM.
e. He was sick and furious at being woken up so early.
Grosz, Joshi, & Weinstein (1995)
(2)
... Die Hexe sagte noch, dass Hänsel und Gretel Brot in
einen Ofen tuen sollten. Hänsel wurde in einen Käfig
gesteckt, nachdem sie sich geweigert hatten es zu tun.
Gretel musste alles für die Hexe tuen. Als Gretel wieder
das Brot in den Ofen tuen sollte sagte sie der Hexe dass
sie es ihr erst einmal zeigen sollte, wie man so etwas
macht. Sie tat es, aber Gretel schob sie tiefer hinein und
verschloss die Ofentür mit einem Riegel. Sie konnte jetzt
nicht mehr heraus. Das einzige was sie tun konnte ist
verbrennen. Nun befreite Gretel Hänsel aus dem Käfig. Sie nahmen
Gold und Edelsteine der Hexe mit und fanden den Weg nach Hause. Sie
gingen in das ihr Haus und erzählten was sie erlebt hatten. Als sie ihren
Eltern zum Schluss noch die Funde gaben waren sie alle glücklich. Und
wenn sie nicht gestorben sind, dann leben sie noch heute!
3.1 The general idea of modelling centering
*Each utterance Ui defines a transition between an input
context Ci-1 and an output context Ci. A context C is a
dynamically evolving cognitive information state.
*A component in context C is the centering state AttCen
consisting of a set of propositions with associated entities.
The entities in AttCen are partially ordered by salience. That
is represented by the forward looking center list Cf .
*Attentional state is related to ease of inference: certain
inferences associated with salient entities are made more
easily than comparable inferences unrelated to salient
entities.
*One of the entities in Cf may be the backward looking center
Cb, or the center (sometimes called the topic), the central
entity that the discourse is currently about.
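[Figure: contexts C0 to C3 linked by utterances U1 to U3; rows Cf and Cb]
C0 --U1--> C1 --U2--> C2 --U3--> C3
Cf: C0 = < >, C1 = <i j>, C2 = <i j k>, C3 = <i j l>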
Definition of the Center? Definition of the salience ordering?
3.2 The BFN Model
*The Center of Ci is the highest ranked entity in Cf i-1 that is
part of Cf i
*Other Filters: (i) Syntactic binding constraints; (ii) if there
are pronouns in Ui, then one of them refers to Cb i (Rule 1)
*GF Order: Subject > D.Object > I.Object > Obl
predicts the relative salience of entities in the output
attentional state Cf i
*The way in which the AttCen changes may be classified into
a small number of transition types. Two properties are
crucial for the classification:
COHERE: The center doesn’t change
ALIGN:  The center aligns to the most salient entity in Cf
               COHERE   ALIGN
Continue
Retain                  *
Smooth-shift   *
Rough-shift    *        *
(3) Jane1 likes Mary2                    <1 2>
    She1 often brings her2 flowers3      <1 2 3>
    She1/2 chats with her2/1 for ages    <1 2> / <2 1>
    continue > retain; i.e. she=Jane
(4) Jane1 is happy                       <1>
    She1 was congratulated by Freda2     <1 2>
    and Mary3 asked her1/2 a question4   <3 1 4> / <3 2 4>
    retain > smooth-shift; i.e. her=Jane
*Continue > Retain > Smooth-shift > Rough-shift (Rule 2)
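As a rough illustration of the definitions above, the following Python sketch computes the center and classifies a transition by the two properties COHERE and ALIGN. The function names are hypothetical, Cf lists are given as Python lists ordered by decreasing salience, and treating an undefined previous center as "no change" is an assumption of this sketch.

```python
def backward_center(cf_prev, cf_cur):
    """Return the highest-ranked entity of the previous Cf that recurs in the current Cf."""
    for entity in cf_prev:  # cf_prev is ordered by decreasing salience
        if entity in cf_cur:
            return entity
    return None  # no shared entity: the center is undefined


def classify(cb_prev, cf_prev, cf_cur):
    """Classify the transition into the current utterance; returns (name, cb_cur)."""
    cb_cur = backward_center(cf_prev, cf_cur)
    # Assumption: an undefined previous center counts as "the center doesn't change".
    cohere = cb_prev is None or cb_cur == cb_prev
    align = bool(cf_cur) and cb_cur == cf_cur[0]
    names = {
        (True, True): "Continue",
        (True, False): "Retain",
        (False, True): "Smooth-shift",
        (False, False): "Rough-shift",
    }
    return names[(cohere, align)], cb_cur


# Example (3): Jane = 1, Mary = 2, flowers = 3.
_, cb2 = classify(None, [1, 2], [1, 2, 3])   # She1 often brings her2 flowers3
print(classify(cb2, [1, 2, 3], [1, 2]))      # reading she = Jane
print(classify(cb2, [1, 2, 3], [2, 1]))      # reading she = Mary
```

For example (3), the reading she = Jane comes out as a Continue and the reading she = Mary as a Retain, in line with the preference stated on the slide.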
3.3 Beaver’s reformulation of centering in OT
*Continue > Retain > Smooth-shift > Rough-shift
can be reconstructed by assuming COHERE » ALIGN
               COHERE   ALIGN
Continue
Retain                  *
Smooth-shift   *
Rough-shift    *        *
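That the ranking of COHERE over ALIGN yields the ordering of Rule 2 can be checked mechanically. The sketch below is an illustration, not part of the original model description: the violation counts are read off the table of transition types, and `preference_order` is a hypothetical helper that sorts the transitions by their violation profiles under a given ranking.

```python
# Violation counts (COHERE, ALIGN) for the four transition types.
VIOLATIONS = {
    "Rough-shift": (1, 1),
    "Smooth-shift": (1, 0),
    "Retain": (0, 1),
    "Continue": (0, 0),
}


def preference_order(ranking):
    """Sort the transitions from most to least harmonic under the given ranking."""
    index = {"COHERE": 0, "ALIGN": 1}

    def profile(transition):
        # Violations listed in ranking order, so tuple comparison is lexicographic.
        return tuple(VIOLATIONS[transition][index[c]] for c in ranking)

    return sorted(VIOLATIONS, key=profile)


print(preference_order(["COHERE", "ALIGN"]))
print(preference_order(["ALIGN", "COHERE"]))
```

With COHERE ranked first the order is Continue, Retain, Smooth-shift, Rough-shift; reversing the ranking swaps Retain and Smooth-shift, so the ranking itself carries the empirical claim.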
*Top-ranked constraints BIND (and AGREE)
*AGREE » BIND » PRO/TOP » COHERE » ALIGN
(1) Tableau (constraint columns: AGREE, BINDING, PRO/TOP, COHERE, ALIGN)
    Jane1 is happy                               <1>
    She1 was congratulated by Freda2             <1 2>
    Mary3 asked her_x, x∈{1, 2}, a question4     candidates: <3 1 4>, <3 2 4>
    optimal: <3 1 4>
(2) Tableau (constraint columns: AGREE, BINDING, PRO/TOP, COHERE, ALIGN)
    Jane1 likes Mary2                            <1 2>
    She1 often brings her2 flowers3              <1 2 3>
    She_x chats with her_y for ages, x, y∈{1, 2} candidates include: <1 2>, <2 1>
    optimal: <1 2>
3.4 Some problems
*Kameyama (1998), p. 4
(5) John1 went to Jim2’s party                <1 2>
    He2 was pleased to see John1 again        <2 1>
    He1/2 had just recovered from a
    stressful week at work.                   <1> / <2>
    continue > smooth-shift; i.e. he: John1 > Jim2
    Intuitively, he: Jim2 > John1
*Werner Frey (p.c.)
(6) Jane1 is happy                            <1>
    Mary2 gave her1 a present3                <2 1 3>
    She1/2 smiled                             <1> / <2>
    continue > smooth-shift; i.e. she: 1 > 2
(7) Jane1 is happy                            <1>
    Mary2 gave her1 a present3                <2 1 3>
    She1/2 smiled at her2/1                   <1 2> / <2 1>
    (1 ™— 2)
    smooth-shift > rough-shift; i.e. she: 2 > 1 (1 ™— 2)
*Kameyama (1998), p. 6
(8) Babar1 went to a bakery2’                 <1 2’>
    He1 greeted the baker2                    <1 2>
    He1/2 pointed at a blueberry pie3         <1 3> / <2 3>
    he: 1 > 2
(9) Babar1 went to a bakery2’                 <1 2’>
    The baker2 greeted him1                   <2 1>
    He1/2 pointed at a blueberry pie3         <1 3> / <2 3>
    he: 1 ™— 2
3.5 An alternative model (inspired by Strube)
*Cf i determined by merging several hierarchies
o GF: Subject > Object > Object2 > Others
o Linear Precedence
o EXP order: ZeroPronominal > Pronoun > Def-NP
> ProperName > Indef-NP
*The different transition types in centering are not significant
for predicting preferences. SALIENT ANTECEDENT instead of
COHERE.
*Incremental processing (word by word)
*Readers or listeners can be misled or ‘led up the garden
path’ by locally ambiguous sentences. Garden-path effects
are predicted if optimal resolutions (corresponding to some
early input) cannot be extended
(3) John1 went to Jim2’s party                <1 2>
    He                                        <1 2>
    was pleased to see John1                  GPE!!
    again                                     <2 1>
    He1/2 had just recovered from a
    stressful week at work.                   <1> / <2>
AGREE » BIND » PRO/TOP » SAL ANT » ALIGN
4 Centering and Focus
4.1 The complementary preference hypothesis
(10)
a. Paul called Jim a Republican. Then he insulted him.
(Paul insulted Jim)
b. Paul called Jim a Republican. Then HE insulted HIM.
(Jim insulted Paul)
(Lakoff 1971)
Complementary Preference Hypothesis
A focused pronoun takes the complementary
preference of the unstressed counterpart.
(Kameyama 1994)
(11)
Paul called Jane a Republican. Then SHE insulted HIM.
(Prince 1981)
(12)
Paul insulted Jane. Then she HIT him.
(de Hoop 2001)
De Hoop’s conclusion: The stress on the pronouns seems to be
the result of the focus structure of the sentence, rather than a shift in
preferred reference. (Consequently, the complementary
preference hypothesis is irrelevant, at best.)
(13)
Babar went to a bakery. He greeted the baker.
He / ??HE pointed at a blueberry pie. (Kameyama 1994)
(14)
Babar went to a bakery. He pointed at the baker.
Then he / HE pointed at a blueberry pie.
(J.Mattausch, p.c. 2001)
4.2 Motivating two new constraints
The AB theory of presupposition projection (van der Sandt
1992, Geurts 1995)
*Presupposition inducers introduce an element of structural
underspecification into the DRS.
*A presupposition may be bound or accommodated in any
DRS that subordinates the DRS in which it originates (the
semantics leaves open where accommodation/binding
occurs!)
*The projection process is restricted by general preferences
(15)
a. If Peter has a dog, then his cat is gray (global)
b. If Peter has a cat, then his cat is gray (intermediate)
c. GenØ ([p | q/r]) = { [r, [p|q]], [[p, r]|q], [p|[r, q]] }
                        global      interm.     local
Preferences according to van der Sandt/Geurts
(i) If a presupposition can be either bound or accommodated,
there will in general be a preference for the first option, and
(ii) If a presupposition can be accommodated at two different
sites, one of which is subordinate to the other, the higher
site will, ceteris paribus, be preferred.
Constraints in OT
**ACC: Avoid Accommodation. It counts the number of
discourse referents that are involved in accommodation.
*STRONG: Be Strong. It evaluates i/o pairs with stronger
outputs higher than pairs with weaker ones.
**ACC » STRONG
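To see what STRONG prefers among the three accommodation sites in (15c), one can compare the logical strength of the readings directly. The sketch below is a simplification under stated assumptions: p, q, r are treated as propositional atoms, the DRS condition [p | q] is read as material implication, and the names `READINGS`, `entails`, and `stronger` are hypothetical. Since every reading accommodates the same single referent, the sketch also assumes that *ACC does not separate them and that STRONG decides.

```python
from itertools import product

# Propositional stand-ins for the three readings in (15c).
READINGS = {
    "global": lambda p, q, r: r and (not p or q),        # [r, [p | q]]
    "intermediate": lambda p, q, r: not (p and r) or q,  # [[p, r] | q]
    "local": lambda p, q, r: not p or (r and q),         # [p | [r, q]]
}


def entails(a, b):
    """True if every valuation satisfying reading a also satisfies reading b."""
    return all(
        READINGS[b](*v)
        for v in product([False, True], repeat=3)
        if READINGS[a](*v)
    )


def stronger(a, b):
    """True if reading a is strictly stronger than reading b."""
    return entails(a, b) and not entails(b, a)


for a, b in [("global", "local"), ("local", "intermediate")]:
    print(a, "strictly stronger than", b, ":", stronger(a, b))
```

Under these assumptions the global reading is strictly stronger than the local one, which in turn is strictly stronger than the intermediate one, so STRONG alone ranks local accommodation above intermediate accommodation.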
Open Questions
*The present formulation of *ACC accounts for the partiality
of accommodation in a very naive way (counting the DRs).
This is not enough for a full theory of bridging.
*There is a clash between the AB theory and
the present OT reconstruction.
GenØ ( +each, [p] [q/r] ) =
{ [r, +each, [p] [q]], +each, [p, r] [q], +each, [p] [r, q] }
(16) Every German is proud of his Porsche
AB: global > interm. > local
OT: global > local > interm.
*There are reasons to replace STRONG by RELEVANCE (be
relevant) (see: van Rooy 2000)
4.3 Stressed Pronouns again
(8) Paul1 called Jim2 a Republican.          <1 2>
    a. Then he_x (insulted him_y)
    b. Then HE_x insulted HIM_y
    Candidate outputs for a. and b.: <1 2>, <2 1>
    Tableau (constraint columns: *ACC, REL, SAL ANT, ALIGN)
5 Centering and Generation
(17) Fred1 was eating. He1 saw Jim2.   WINKED(1) ??
     a. He winked
     b. Fred winked
     c. HE winked
(18) Fred1 was eating. He1 saw Jim2.   WINKED(2) ??
     a. He winked
     b. Jim winked
     c. HE winked
AGREE » BIND » *ACC » REL » SAL ANT » ALIGN
Tableau for the context Fred1 was eating. He1 saw Jim2. <1 2>
(constraint columns: *ACC, REL, SAL ANT, ALIGN)
WINKED(1): he winked (optimal), Fred winked (optimal), HE winked (*)
WINKED(2): he winked, Jim winked, HE winked
Assume recoverability. → Bidirectional OT
WINKED(2): Jim winked (optimal)
6 Acceptability and Scrambling
(19) Was hat Hans dem Schüler gegeben? ('What did Hans give the student?')
a. Ich glaube, daß Hans dem Schüler das BUCH gegeben hat.
   ('I believe that Hans gave the student the BOOK.')
b. ?Ich glaube, daß Hans das BUCH dem Schüler gegeben hat.
   (Anti-Focus Effect)
(20) Wem hat Hans das Buch gegeben? ('To whom did Hans give the book?')
c. Ich glaube, daß Hans dem SCHÜLER das Buch gegeben hat.
d. Ich glaube, daß Hans das Buch dem SCHÜLER gegeben hat.
(21) Wem hat Hans ein Buch gegeben? ('To whom did Hans give a book?')
e. Ich glaube, daß Hans dem SCHÜLER ein Buch gegeben hat.
f. ?Ich glaube, daß Hans ein Buch dem SCHÜLER gegeben hat.
   (Specificity Effect)
Basic ideas for an OT treatment (Choi 1996)
*expressive optimization (interpretation -> expression)
*phrase structural constraints for describing the canonical
word order
CANON (German): SUBJ – Adjunct – I.Object – D.Object
CANON1: SUBJ should be structurally more prominent than
(c-command) non-SUBJ functions
CANON2: Non-SUBJ functions align reversely with the c-structure
according to the functional hierarchy:
GF Order: Subject > D.Object > I.Object > Obl > Adjunct
*information structuring constraints for describing the effects
of topicalization and focussing
[Figure: cf i with the regions background, focus, and topic]
Prosodic, syntactic and certain lexical properties of Ui
provide the division between background and focus for
the elements of cf i . Elements of the background tend to
bind material in cf i-1. (accommodation is not excluded).
Elements of the focus tend to be newly introduced
(partial binding is not excluded).
NEW: an element in the background that binds earlier
material should precede a focussed element.
PROM: A prominent element (e.g. a topic) should precede
a non-prominent element.
Ranking for German: PROM » CANON1 » {CANON2, NEW}
Examples (constraint columns in each tableau: CANON2, NEW, CANON1, PROM, REST…)
(17) Was hat Hans dem Schüler gegeben?
     Hans-dem Schüler-das BUCH      (optimal)
     Hans-das BUCH-dem Schüler
(18) Wem hat Hans das Buch gegeben?
     Hans-dem SCHÜLER-das Buch      (optimal)
     Hans-das Buch-dem SCHÜLER      (optimal)
(19) Wem hat Hans ein Buch gegeben?
     Hans-dem SCHÜLER-ein Buch      (optimal)
     Hans-ein Buch-dem SCHÜLER
Conclusions
Conclusions
*OT as instrument for expressing and integrating different
ideas. Advantage of the declarative style.
*Constraints and their ranking (German)
AGREE » BIND » *ACC » REL » SAL ANT »
PROM » CANON1 » {CANON2, NEW}
+ conditions for constructing the salience lists Cf i .
*Syntactic anomalies are explained by blocking (expressive
optimization)
*Interpretive preferences are explained by interpretive
optimization
*OT aims to overcome the competence-performance gap. The
same constraints can be used for interpretation and
generation. Explanation of garden-path effects.
