CID 2011

September 14th-16th 2011

Var, France

endorsed by SIGSEM


Wednesday 14th
9h15 Opening
9h30 Invited: Jonathan Ginzburg
10h30 Break
11h R. Fernández
11h45 K. Jasinskaja & E. Karagjosova
12h15 Lunch
14h A. Tantos
14h45 L. Vieu
15h30 Break
16h J. Hunter
16h45 R. Moot et al.
17h30 End
20h Dinner

Thursday 15th
9h30 Invited: Andrew Kehler
10h30 Break
11h L. Danlos & O. Rambow
11h45 M. Vergez-Couret et al.
12h15 Lunch
14h M. Egg
14h45 Poster Session: K. Alahverdzhieva & A. Lascarides; A. Gazdik & G. Winterstein; L. Mayol & E. Castroviejo; C. Roze; M. Gylling & I. Korzen
16h45 End
20h Dinner

Friday 16th
9h30 Invited: Barbara Di Eugenio
10h30 Break
11h N. Asher et al.
11h45 N. Van der Vliet & G. Redeker
12h15 Closure & Lunch


Invited speakers

Barbara Di Eugenio. Semantic Constraints and Discourse Parsing

Abstract: “Discourse Parsing, the computational segmentation and inference of structure and relations in text, remains a highly challenging task. Efforts have relied mostly on syntactic and lexical information. The use of semantics has been restricted to shallow semantic features such as lexical chains and similarity measures based on word co-occurrences.

I will present an innovative discourse parser that uses rich verb semantics and relational information on the structure of the segment being built. Our discourse parser, based on a modified shift-reduce algorithm, crucially uses a rhetorical relation classifier to determine the site of attachment of a new incoming chunk together with the appropriate relation label. Another novel aspect of our work is that the relation classifier uses Inductive Logic Programming, a method that learns from first-order logic representations. We show that on classifying rhetorical relations, our results are significantly better than attribute-value learning paradigms such as Decision Trees, RIPPER and Naive Bayes. Our work demonstrates that, when available, semantic information for discourse parsing can be used effectively.”
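A minimal sketch of a plain shift-reduce loop driven by a relation classifier may help fix ideas. It is not the parser described in the abstract: the modification to the algorithm, the verb-semantic features and the Inductive Logic Programming classifier are not reproduced, and the names `shift_reduce_parse`, `classify` and `toy_classifier` are hypothetical, introduced only for illustration.

```python
from typing import Callable, List, Optional, Tuple, Union

# A discourse tree is either a leaf segment (str) or (relation, left, right).
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def shift_reduce_parse(
    segments: List[str],
    classify: Callable[[Tree, Tree], Optional[str]],
) -> Tree:
    """Textbook shift-reduce over discourse segments (illustrative only).

    `classify` is a hypothetical stand-in for a relation classifier: it
    returns a relation label when the two topmost subtrees should be
    joined, or None when the parser should shift instead.
    """
    if not segments:
        raise ValueError("need at least one segment")
    stack: List[Tree] = []
    queue = list(segments)
    while queue or len(stack) > 1:
        relation = classify(stack[-2], stack[-1]) if len(stack) >= 2 else None
        if relation is None and queue:
            # Shift: move the next incoming segment onto the stack.
            stack.append(queue.pop(0))
        else:
            # Reduce: join the two topmost subtrees. If the classifier
            # declined but no input is left, fall back to a generic label.
            right = stack.pop()
            left = stack.pop()
            stack.append((relation or "unspecified", left, right))
    return stack[0]


def toy_classifier(left: Tree, right: Tree) -> Optional[str]:
    # Toy rule (an assumption): attach a leaf that starts with "because".
    if isinstance(right, str) and right.startswith("because"):
        return "explanation"
    return None


tree = shift_reduce_parse(
    ["Max fell", "because John pushed him", "Mary laughed"], toy_classifier
)
```

In this toy run the second segment is attached to the first as an explanation, and the third is joined with a generic label once the input is exhausted.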

BIO: Barbara Di Eugenio is Associate Professor in the Department of Computer Science of the University of Illinois, Chicago campus. There she leads the NLP laboratory. She obtained her laurea in Informatica in 1985, from Università di Torino, and her PhD in Computer Science in 1993, from the University of Pennsylvania. She is an NSF CAREER awardee, a past treasurer of the North American Chapter of the Association for Computational Linguistics, and a past treasurer of SIGDial, the ACL special interest group for discourse and dialogue; she is also one of the founding and managing editors of the Journal of Discourse and Dialogue Research.

Andrew Kehler. A Probabilistic Reconciliation of Coherence-Driven and Centering-Driven Theories of Pronoun Interpretation

Abstract: Two classic theories of pronoun interpretation have each sought to specify the relationship between pronoun use and discourse coherence, but make seemingly irreconcilable claims. According to Hobbs (1979, 1990), pronoun interpretation is not governed by an independent mechanism, but instead comes about as a by-product of utilizing world knowledge during the inferential establishment of discourse coherence relations. Factors pertaining to the grammatical form and information structure of utterances do not come into play. According to Centering Theory (Grosz et al. 1986/1995, inter alia), on the other hand, pronoun interpretation is predominantly determined by information structural relationships within and between utterances (e.g., topic transitions) and the grammatical roles occupied by potential referents. Factors pertaining to world knowledge and the establishment of informational coherence relations do not come into play.

In this talk I describe a series of psycholinguistic experiments that ultimately suggest a reconciliation of these diverse approaches. These experiments reveal a definitive role for coherence relationships of the Hobbsian sort, demonstrating that pronoun interpretation is affected by (i) probabilistic expectations that hearers have about what coherence relationships will ensue, and (ii) their expectations about what entities will be mentioned next which, crucially, are conditioned on those coherence relationships. However, these experiments also reveal a role played by the grammatical and/or topichood status of potential referents. These data are reconciled by a probabilistic model that combines the hearer's coherence-driven prior expectations about what entities will be referred to next and Centering-driven likelihoods that govern the speaker's choice of referential form. The approach therefore situates pronoun interpretation within a larger body of work in psycholinguistics, according to which language interpretation results when top-down predictions about the ensuing message meet the bottom-up linguistic evidence.
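One standard way to combine a prior with a likelihood of this kind is Bayes' rule. The equation below is a sketch on the assumption that the model takes that form; the abstract does not spell it out, and the notation is illustrative.

```latex
P(\text{referent} \mid \text{pronoun}) =
  \frac{P(\text{pronoun} \mid \text{referent})\, P(\text{referent})}
       {\sum_{r \in \text{referents}} P(\text{pronoun} \mid r)\, P(r)}
```

Under this reading, P(referent) plays the role of the coherence-driven expectation about who will be mentioned next, and P(pronoun | referent) the Centering-driven likelihood that the speaker chooses a pronoun for that referent.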

This talk contains joint work with Hannah Rohde, Jeffrey Elman, and Staci Osborn.

Grosz, Barbara J., Aravind K. Joshi & Scott Weinstein. 1995. Centering: A framework for modelling the local coherence of discourse. Computational Linguistics 21(2). 203–225.

Hobbs, Jerry R. 1979. Coherence and coreference. Cognitive Science 3. 67–90.

Hobbs, Jerry R. 1990. Literature and cognition. Stanford, CA: CSLI Lecture Notes 21.

BIO: Andrew Kehler is Professor and Chair of Linguistics at the University of California, San Diego. He holds the B.S.E. degree in Computer Science and Engineering from the University of Pennsylvania, and the S.M. and Ph.D. degrees in Computer Science from Harvard University. Before arriving at UCSD in 2000, he served as Senior Computer Scientist at SRI International. He has published numerous articles on pragmatics and discourse interpretation studied from the perspectives of theoretical linguistics, computational linguistics, and psycholinguistics, and is author of the book Coherence, Reference, and the Theory of Grammar (2002).

Jonathan Ginzburg. Disfluencies as Intra-Utterance Dialogue Moves

Abstract: There are a number of approaches to analyzing disfluencies, exemplified in (1):

(1) From Levelt (1989): To the right is yellow, and to the right – further to the right is blue.

From Levelt (1989): We go straight on, or– we enter via red, then go straight on to green.

From Besser and Alexandersson (2007): The design of or– the point of putting two sensors on each side

From Fay (1980), cited by Levelt (1989): Why it is – why is it that nobody makes a decent toilet seat?

From Levelt (1989): Tell me, uh what– d'you need a hot sauce?

In the conversational analysis tradition (following Schegloff, Jefferson, and Sacks 1977) disfluencies are viewed as a subtype of repair. There are many insights in this type of approach, but it has typically not been integrated into a formal model of grammar/dialogue. An alternative approach, common in the computational literature, has been to view disfluencies as filtered away by low-level processes, so that they receive no interpretation and the interpreter (the level at which dialogue meaning is computed) does not see them (e.g. Heeman and Allen 1999). Recently, evidence from psycholinguistics has begun to emerge that self-corrected material has a long-term processing effect (e.g. Brennan and Schober 2001, Arnold et al. 2007), and hence is not being 'edited away'. It can also bring about linguistic effects in whose interpretation it plays a significant role, for instance anaphora, as in (2a) from Heeman and Allen (1999). In fact, disfluencies yield information: (2a) entails (2b) and defeasibly (2c), which in certain settings (e.g. legal), given sufficient data, can be useful.

(2a) Andy: Peter was, well he was fired.

(2b) Andy was unsure about what he should say, after uttering `was'.

(2c) Andy was unsure about how to describe what happened to Peter.

In this talk I present a detailed formal account of disfluencies within the framework of KoS (Ginzburg 1994, Larsson 2002, Purver 2006, Ginzburg and Fernandez 2010, Ginzburg 2012) which:

1. unifies self- and other-repair without conflating them,

2. offers a precise explication of the roles of all key components of a disfluency, including editing phrases and filled pauses,

3. accounts for the possibility of self-addressed questions in a disfluency.


J.E. Arnold, C.L.H. Kam and M.K. Tanenhaus 2007, `If you say `thee uh' you are describing something hard: The on-line attribution of disfluency during reference comprehension', Journal of Experimental Psychology: Learning, Memory, and Cognition 33: 914–930.

Susan E. Brennan and Michael F. Schober 2001, `How Listeners Compensate for Disfluencies in Spontaneous Speech', Journal of Memory and Language 44: 274–296.

Jonathan Ginzburg 1994, `An update semantics for dialogue'. In H. Bunt, ed., _Proceedings of the 1st international workshop on computational semantics_ . ITK, Tilburg University, Tilburg, Netherlands.

Jonathan Ginzburg 2012 _The interactive stance: Meaning for conversation_ . Oxford: Oxford University Press.

Jonathan Ginzburg and R. Fernandez. 2010. `Computational Models of Dialogue'. In A. Clark, C. Fox, and S. Lappin, eds., _Handbook of computational linguistics and natural language processing_ . Oxford: Blackwell.

Peter A. Heeman and James F. Allen 1999, `Speech Repairs, Intonational Phrases and Discourse Markers: Modeling Speakers' Utterances in Spoken Dialogue', Computational Linguistics 25: 527–571.

Staffan Larsson 2002. _Issue based dialogue management_ . Ph.D. thesis, Gothenburg University.

W.J.M. Levelt 1989, _Speaking: From intention to articulation_ . The MIT Press, Cambridge, MA.

Matthew Purver 2006, `Clarie: Handling clarification requests in a dialogue system', Research on Language and Computation 4: 259–288.

Emanuel Schegloff, Gail Jefferson and Harvey Sacks 1977, `The preference for self-correction in the organization of repair in conversation', Language 53: 361–382.

(joint work with Raquel Fernandez and David Schlangen)

BIO: Jonathan Ginzburg has held appointments at the Hebrew University of Jerusalem and King's College, London. He is currently Professor of Linguistics at Université Paris-Diderot (Paris 7). He is the author of Interrogative Investigations: the form, meaning, and use of English Interrogatives (jointly with Ivan A. Sag) and has published more than 70 papers. He is one of the founders and currently editor-in-chief of Dialogue and Discourse, one of the Linguistic Society of America's ejournals.

Oral presentations

Nynke Van Der Vliet and Gisela Redeker. Complex sentences as leaky units in discourse parsing

It is usually assumed that complex sentences with multiple clauses function as rhetorical units in discourse. We show that there are rare but systematic exceptions to this general assumption: structures where a sentence-external unit attaches to one of the clauses in a complex sentence before the combined span joins the rest of the complex sentence. In our Dutch RST-annotated text corpus, 13% of the complex sentences have such ‘leaky’ boundaries. The majority of the cases have a structure that can be accommodated by a sentence-first parser. Still, ‘leaky’ complex sentences are an intriguing phenomenon, and we therefore intend to further explore their semantic, syntactic, and functional characteristics.

Markus Egg. Discourse particles between cohesion and coherence

This paper discusses relational discourse particles as a device for the organisation of texts that holds the middle ground between cohesion and coherence. They are cohesive devices like “then” and other discourse anaphors, which link whole discourse segments directly but do not contribute to discourse structure proper. But they resemble conjunctions and other discourse markers in that they introduce relations between discourse segments that refer to inference patterns from the common ground, e.g., denial of expectation. In addition, they can refer to the literal content of segments or to their felicity conditions just like discourse markers.

Katja Jasinskaja and Elena Karagjosova. Elaboration and Explanation

In this paper we study two expressive patterns shared between elaboration and explanation relations: unmarked connection, i.e. juxtaposition of sentences without any explicit marker, and the German marker `nämlich' (namely), which must have emerged as a marker of specification but has spread in the direction of explanation. We try to answer the question of what elaboration and explanation relations have in common that licenses the use of the same expressive patterns, and argue that elaboration and explanation are closely connected in the conceptual space of discourse relations.

Nicholas Asher, Antoine Venant, Philippe Muller and Stergos Afantenos. Complex discourse units and their semantics

A natural and intuitive principle concerning the organization of content in discourse is that discourse structure and rhetorical function operate at several levels of granularity at once. There are low-level discourse connections between elementary discourse units (EDUs), even within a single sentence; but there are also discourse connections between larger constituents, complex discourse units or CDUs, which may include only two or three EDUs or may correspond to several paragraphs. CDUs and the constraints they impose on the discourse structure have not been an object of study in computational or formal work on discourse, as they are generally the by-product of processes focused either on elementary units (e.g. in RST) or on thematic cohesion (e.g. in text tiling). The purpose of this paper is to fill this lacuna. First, we give some more details about the importance of CDUs in an account of discourse structure. We then provide formal definitions of equivalences involving discourse graphs, which enable us to prove some results about how CDUs relate to EDUs and to each other. This in turn leads us to provide separation axioms or existence principles for CDUs. We work within the framework of SDRT (Asher 1993, Asher and Lascarides 2003).

Julie Hunter. `Now': A Discourse-Based Theory

English `now' depends on a perspective point which need not be given by the time of utterance. Contrary to existing theories of `now', I claim that this perspective point is determined by the rhetorical structure of the discourse. The details of my theory are presented in Segmented Discourse Representation Theory and are supported by over 150 examples of `now' from recognized newspapers and published narratives. The general picture is that `now' imposes structure on a temporal ordering; it divides a given period from that which comes before and from that which comes after. `Now' also has a spotlighting effect, so the discourse must call for special attention to the events/states described by the `now' clause; the clause must contribute to the main point of the story, rather than to background information.

Raquel Fernández. Incremental Resolution of Relative Adjectives: A DRT-based Approach

This paper is concerned with the incremental interpretation of exophoric referring descriptions with relative gradable adjectives, such as the description in the instruction `Pick up the tall glass'. Relative adjectives such as `tall' are typically considered subsective with respect to the head noun and thus pose a challenge to incremental approaches that operate strictly from left to right. However, psycholinguistic research within the eye-tracking paradigm has shown that listeners interpret relative adjectives incrementally, as they encounter them during processing, without needing to wait until the head noun has been heard. Our aim in this paper is to provide a formal account of the incremental interpretation of descriptions with relative gradable adjectives that reflects the psycholinguistic evidence. We propose a DRT-based treatment that combines the PTT approach to incremental interpretation developed by Poesio and Rieser (2010, 2011) with the main ingredients of Van der Sandt's presupposition-as-anaphora theory, as implemented by Bos (2003).

Richard Moot, Laurent Prévot and Christian Retoré. Discursive analysis of itineraries in an historical and regional corpus of travels: syntax, semantics, and pragmatics in a unified type theoretical framework

In this paper we will discuss the application of (Segmented) Discourse Representation Theory to the analysis of a historical French corpus of itineraries in the Pyrénées. Our research will focus in particular on how type coercion can help us give a correct analysis of cases of so-called “fictive motion”.

Laurence Danlos and Owen Rambow. Discourse Relations and Propositional Attitudes

We propose a procedure for processing discourse when segments are not asserted by the writer but attributed by her to other sources.

Laure Vieu. On the Semantics of Discourse Relations

In this paper, I examine the division of labour between discourse semantics and information packaging and reconsider the schemata for the semantics of veridical discourse relations given in SDRT. On the basis of studies of the phenomena of discourse relation blocking, I claim that one cannot reduce the semantics of discourse relations to their content-level semantic effects. I propose revised semantic schemata involving public commitment operators to characterize the rhetorical import of discourse relations within their semantics.

Alexandros Tantos. Discourse Constraints of Clitic Left Dislocation in Modern Greek

The purpose of this talk is to illustrate the importance of Clitic Left Dislocation (CLLD) in Greek for inferences of intersentential relations. SDRT (Segmented Discourse Representation Theory, Asher and Lascarides 2003), a formal discourse semantic theory that aims to present the logical tree of a discourse enhanced with rhetorical relations (Explanation, Narration, Elaboration among others), is used as a vehicle to approach CLLD from a different perspective. CLLD is important for discourse coherence in two respects: the clitic cannot be replaced in specific cases where an anaphoric (bridging or identity) relation with an antecedent referent in a previous sentence is intended, and furthermore it triggers subordinating rhetorical relations (Asher and Vieu 2005) to discourse-accessible segments and disallows coordinating ones.

Marianne Vergez-Couret, Myriam Bras, Laurent Prévot, Laure Vieu and Caroline Attalah. Discourse contribution of Enumerative Structures involving pour deux raisons

We propose to study the discourse contribution of enumerative structures involving the prepositional phrase pour deux raisons. We would like to highlight the contribution of the textual information conveyed by enumerative structures and the prepositional phrase both to the discourse structure and the discourse content within the SDRT model. We will show that a prepositional phrase like pour deux raisons must introduce a discourse constituent in the structure, attached by the Commentary relation to the left context and by the Enumeration relation to the right context. Finally we propose to treat pour deux raisons as a new kind of discourse marker: we will show that its discursive role within enumerative structures is to signal the content-level relation Explanation.

Poster presentations


Katya Alahverdzhieva and Alex Lascarides. Semantic Composition of Multimodal Communicative Actions in Constraint-based Grammars

The past few decades have witnessed substantial research in spontaneous, improvised co-speech gestures performed in synchrony with speech, e.g., Kendon (1972), McNeill (1992). The vast majority of the descriptive, cognitive and formal studies of gesture unanimously acknowledge the fact that speech and gesture function within a single communicative system to convey an integrated meaning through spoken and visual material. In this paper, we take the integrated nature of the speech-gesture action as a starting point, and we demonstrate that well-established mechanisms for semantic composition from linguistics can be applied to multimodal communicative actions consisting of speech and co-speech hand gestures. In particular, we use the constraint-based grammar formalism of HPSG (Pollard and Sag, 1994) and the semantic framework of Robust Minimal Recursion Semantics (RMRS) (Copestake, 2007) to map the form of the multimodal signal to an (underspecified) meaning representation.

Laia Mayol and Elena Castroviejo. The connective "doncs" in dialogue and the QUD

This paper addresses the discourse use of the Catalan discourse connective “doncs” (and Spanish “pues”), which has evolved from being a temporal connective meaning “then” (“tunc” in Latin) into being a more elusive discourse connective. As a matter of fact, “doncs” does not have a uniform semantic and pragmatic import. Here we focus on the “doncs” that retains some of its original semantics, and compare it to French “donc” and to English “then” to conclude that it takes as antecedent either an if-clause or a previous utterance, and it presupposes that the speaker evokes a set of alternatives to the consequent which constitute worse scenarios than the actual one.

Anna Gazdik and Grégoire Winterstein. A Discursive Approach to Discourse Functions in Hungarian

In this paper, our aim is to propose an analysis of discourse functions in Hungarian, a discourse-configurational language. We concentrate on the preverbal position and demonstrate that the exact position of constituents bearing a particular discourse function depends on the discourse relation the sentence is part of, and that discourse functions can by no means be exclusively assigned to a designated syntactic position, as has been proposed in the literature.

Charlotte Roze. Towards a Discourse Relation Algebra for Comparing Discourse Structures

We propose a methodology for building inference rules for discourse relations, to be integrated into an algebra of these relations. The main objective of these rules is to allow the calculation of the discourse closure of a structure, i.e. to deduce all the discourse relations it implicitly contains. Calculating the closure of discourse structures improves their comparison, in particular within the evaluation of discourse parsing systems. We present and illustrate the adopted methodology, taking as theoretical background the Segmented Discourse Representation Theory.
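The idea of a closure can be illustrated with a small fixed-point computation. This is only a sketch under stated assumptions: relations are taken to be labelled pairs of units, rules are taken to be simple compositions of two relations sharing a middle unit, and the single rule in `TOY_RULES` (Narration composed with Narration gives Narration) is a made-up example, not one of the rules proposed in the paper. All names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Relation = Tuple[str, str, str]          # (label, source unit, target unit)
RuleTable = Dict[Tuple[str, str], str]   # (label1, label2) -> derived label


def discourse_closure(relations: Iterable[Relation], rules: RuleTable) -> Set[Relation]:
    """Apply composition rules until no new relation can be derived."""
    closure: Set[Relation] = set(relations)
    changed = True
    while changed:
        changed = False
        for label1, a, b in list(closure):
            for label2, b2, c in list(closure):
                derived = rules.get((label1, label2))
                # A rule fires only when the two relations share unit b.
                if b == b2 and derived is not None:
                    candidate = (derived, a, c)
                    if candidate not in closure:
                        closure.add(candidate)
                        changed = True
    return closure


# Hypothetical rule table and annotation, for illustration only.
TOY_RULES: RuleTable = {("Narration", "Narration"): "Narration"}
annotated = {("Narration", "u1", "u2"), ("Narration", "u2", "u3")}
closed = discourse_closure(annotated, TOY_RULES)
```

Two annotations of the same text can then be compared on their closures rather than on the explicitly annotated relations, for instance by intersecting the two resulting sets.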

Morten Gylling and Iørn Korzen. Discourse Constraints in a Cross-linguistic Typological Perspective

This paper examines some typological differences in the discourse structure of Italian and Danish. The results of the study indicate that there are significant differences in information packing in the two languages, especially in their use of deverbalisation. Italian sentences tend to include more propositions (and other EDUs), of which a higher percentage is backgrounded by means of non-finite and nominalised predicates, whereas Danish text structure is more informationally linear and characterised by a higher degree of finite verbs and topic shifts. The study also suggests that a more fine-grained classification of non-finite and nominalised EDUs is needed for a complete in-depth analysis of discourse constraints in different language families.