As part of a project on discourse parsing, we have built SLSeg, a
discourse segmenter based on syntactic and lexical information. A
discourse segmenter takes text as input and produces as output the
minimal discourse units in that text.
Our definition of 'minimal discourse unit' is directly
inspired by Rhetorical
Structure Theory. A discourse unit is:
- An independent clause
- A clause in an adjunct relation
SLSeg is described in the following paper:
Here, you can download the entire program, which
includes the following resources:
- A list of clause-like phrases that are in fact
discourse markers (e.g., 'if you will', 'mind you').
- A list of verbs used in 'to'-infinitival and
'if' complement clauses that should not be treated as
separate discourse segments (e.g., 'decide' in
'I decided to leave the car at home').
- A list of unambiguous lexical cues for segment
boundary insertion.
- A list of attributive/cognitive verbs (e.g., 'think',
'said'), which are used to prevent segmentation of
floating attributive clauses.
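
To make the role of these lists more concrete, here is a minimal sketch of how word lists of this kind could be used to veto candidate segment boundaries. It is not SLSeg's actual code or data: the function name keep_boundary, the set names, and the three-word heuristic for floating attributive clauses are all assumptions made for illustration, and only the example entries quoted in the list above are taken from this page.

```python
# Hypothetical illustration only: this is NOT SLSeg's code or its data files.
# The sets below are tiny stand-ins; only the quoted examples from the
# resource list appear in them, everything else is an assumption.
DISCOURSE_MARKER_PHRASES = {"if you will", "mind you"}
NON_SEGMENTING_COMPLEMENT_VERBS = {"decide", "decided"}
ATTRIBUTIVE_VERBS = {"think", "said"}


def keep_boundary(candidate_clause, governing_verb=None):
    """Return True if a candidate clause should start its own segment.

    candidate_clause: the text of a clause proposed as a segment.
    governing_verb:   the verb that takes the clause as a complement,
                      if any (assumed to come from a syntactic parse).
    """
    text = candidate_clause.lower().strip(" ,.;")

    # Clause-like phrases that are really discourse markers are vetoed.
    if text in DISCOURSE_MARKER_PHRASES:
        return False

    # 'to'-infinitival or 'if' complements of listed verbs stay attached.
    if (
        governing_verb is not None
        and governing_verb.lower() in NON_SEGMENTING_COMPLEMENT_VERBS
        and text.startswith(("to ", "if "))
    ):
        return False

    # Assumed heuristic: a very short clause ending in an attributive
    # verb (e.g., 'he said') is treated as a floating attributive clause.
    words = text.split()
    if words and len(words) <= 3 and words[-1] in ATTRIBUTIVE_VERBS:
        return False

    return True
```

With these stand-in lists, keep_boundary("to leave the car at home", governing_verb="decided") rejects the boundary, while a clause such as "because it was raining" is kept as its own segment. A real segmenter would make these decisions over parse trees rather than raw strings.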
Download SLSeg
©2009-2017 Maite Taboada, Milan
Tofiloski, Julian Brooke