Representing complex temporal phenomena for the semantic web and natural language

  • Authors:
  • Jerry R. Hobbs; Feng Pan

  • Affiliations:
  • University of Southern California; University of Southern California

  • Venue:
  • Representing complex temporal phenomena for the semantic web and natural language
  • Year:
  • 2007

Abstract

As an essential dimension of our information space, time plays an important role in every aspect of our lives. A specification of temporal information is required by a large group of applications, including the Semantic Web and natural language. In response to this need, we have developed a rich ontology of temporal concepts, OWL-Time (formerly DAML-Time), for describing the temporal content of Web pages and the temporal properties of Web services. Since most of the information on the Web is in natural language, the ontology can also be used for temporal reasoning and to increase the temporal awareness of natural language applications. The ontology is represented in first-order logic (FOL) and the OWL Web Ontology Language, and it covers a very rich set of temporal concepts. It extends the work of Hobbs (2002) with more complex temporal phenomena, such as temporal aggregates, temporal arithmetic mixing months and days, and vague event durations. We have also created axioms that map subsets of the problems representable by the ontology in FOL to temporal constraint-based formalisms for more efficient temporal reasoning. The temporal aggregate part of the ontology is rich enough to handle both complex multiple-layered and conditional temporal aggregates. A systematic way of mapping recurrence sets in iCalendar (iCal) to temporal aggregates in OWL-Time was developed, giving iCalendar access to the full ontology of time for temporal reasoning. A set of rules for temporal arithmetic mixing months and days was developed with consideration of different desired arithmetic properties, such as commutativity and associativity. Since the absence of explicit and exact durations is one of the most common sources of incomplete information for temporal reasoning in natural language applications, we have constructed an annotated corpus to extract the implicit and vague event durations from text.
We developed annotation guidelines, categorized the event classes to reduce gross discrepancies in inter-annotator judgments, and used normal distributions both to model event duration annotations, which are intervals on a scale, and to measure inter-annotator agreement. Machine learning techniques were then applied to the annotated data and produced coarse-grained event duration information automatically, considerably outperforming a baseline and approaching human performance. The methods used here should be applicable to other kinds of vague but substantive information.
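
To illustrate why arithmetic mixing months and days needs explicit rules, the following minimal Python sketch shows that the order of operations can change the result. It is not the rule set developed in this work: the helpers `add_months` and `add_days` are hypothetical names, and the month-end clamping convention (a day past the end of the target month is moved to that month's last day) is an assumption chosen only for illustration.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Add calendar months; clamp the day to the target month's last day.

    The clamping convention is an assumption made for this sketch.
    """
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def add_days(d: date, days: int) -> date:
    """Add a fixed number of days."""
    return d + timedelta(days=days)


start = date(2007, 1, 30)

# Add one month first, then one day.
month_then_day = add_days(add_months(start, 1), 1)

# Add one day first, then one month.
day_then_month = add_months(add_days(start, 1), 1)

print(month_then_day, day_then_month)
```

Under this clamping assumption the two orders give different dates for the same start date, so adding a month and adding a day do not commute. A different convention (for example, rolling overflow days into the next month) would trade this behavior for other properties.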