A. Borusan

9 July 2012, 4 p.m. c.t., TU Berlin, EN building, seminar room EN 719 (7th floor), Einsteinufer 17, 10587 Berlin: "Event-centric Information Extraction and Retrieval to Explore Document Collections" (Jannik Strötgen, Heidelberg University)

In this talk, we present our work on event-centric information extraction and retrieval, where an event is simply defined as a combination of spatial and temporal information. We first introduce our multilingual, cross-domain temporal tagger HeidelTime and describe some of the challenges that arise when extracting and normalizing temporal expressions from text documents of different domains. We then present our system for event-centric search and exploration in document collections. It allows users, for example, to specify spatial and temporal query constraints and to retrieve, as search results, sequences of relevant events extracted from different documents instead of a hit list of documents containing such events. Finally, we present our model for calculating event-centric document similarities. In contrast to standard term-based similarity models, our approach is based directly on the semantics of the events and is thus term- and language-independent.
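To make the idea of a term- and language-independent similarity concrete, the following is a minimal sketch and not the model presented in the talk. It assumes that each document has already been reduced to a set of events, each a pair of a normalized date and a normalized place, and it uses plain Jaccard overlap as a stand-in similarity measure. The names `Event` and `event_similarity` and the sample data are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event: a normalized date paired with a normalized place."""
    date: str   # normalized temporal value, e.g. ISO 8601 "2012-07-09"
    place: str  # normalized place identifier, e.g. "Berlin"


def event_similarity(doc_a: set[Event], doc_b: set[Event]) -> float:
    """Jaccard overlap of two event sets (an assumed stand-in measure)."""
    if not doc_a and not doc_b:
        return 0.0  # no events on either side: treat as not similar
    return len(doc_a & doc_b) / len(doc_a | doc_b)


# Two documents in different languages can share events after normalization,
# because the comparison never looks at the surface terms.
german_doc = {Event("2012-07-09", "Berlin"), Event("1989-11-09", "Berlin")}
english_doc = {Event("2012-07-09", "Berlin"), Event("2012-07-10", "Heidelberg")}

print(event_similarity(german_doc, english_doc))
```

Because only normalized values are compared, the wording and language of the source documents do not affect the score; everything depends on the quality of the upstream extraction and normalization.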