Towards smarter documents

  • Authors:
  • Vikas Krishna;Prasad M. Deshpande;Savitha Srinivasan

  • Affiliations:
  • IBM Almaden Research Center, San Jose, CA;IBM Almaden Research Center, San Jose, CA;IBM Almaden Research Center, San Jose, CA

  • Venue:
  • Proceedings of the thirteenth ACM international conference on Information and knowledge management
  • Year:
  • 2004

Abstract

Document analysis research typically focuses on document image understanding or on classic problems in text classification, clustering, summarization, and discovery. While these are important aspects of document management, in practice a document's lifecycle is often determined by the business process it belongs to. Document analysis techniques therefore need to recognize and leverage the contextual information provided by a supporting schema and business process. This paper presents an intelligent document management framework, together with the relevant document analysis, metadata extraction, and business process association algorithms and methodology. The supporting architecture integrates a runtime environment with an authoring environment by combining relational data modeling tools with document classification techniques. The runtime environment accepts incoming documents, classifies them, extracts metadata, and executes customized business logic. The authoring environment supports associating a class of documents with a relational document schema, identifying the attribute values that must be extracted automatically, generating the relevant business logic, and deploying the authoring artifacts into the runtime architecture. We demonstrate the use of this framework with representative real-world document transformative applications.
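
The runtime flow described in the abstract (accept a document, classify it, extract metadata, execute business logic) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `DocumentClass` and `Runtime`, keyword counting as a stand-in for a real document classifier, and regular expressions as a stand-in for the metadata extraction algorithms.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DocumentClass:
    """Hypothetical authoring artifact: a document class with its schema and logic."""
    name: str
    keywords: List[str]                        # stand-in for a trained classifier
    attributes: Dict[str, str]                 # attribute name -> regex with one capture group
    handler: Callable[[Dict[str, str]], str]   # customized business logic


class Runtime:
    """Hypothetical runtime: classify, extract metadata, run business logic."""

    def __init__(self) -> None:
        self.classes: List[DocumentClass] = []

    def deploy(self, doc_class: DocumentClass) -> None:
        # Deployment of an authoring artifact into the runtime.
        self.classes.append(doc_class)

    def classify(self, text: str) -> DocumentClass:
        if not self.classes:
            raise ValueError("no document classes deployed")
        lowered = text.lower()
        # Pick the class whose keywords occur most often (assumed heuristic).
        return max(self.classes,
                   key=lambda c: sum(k in lowered for k in c.keywords))

    def extract(self, doc_class: DocumentClass, text: str) -> Dict[str, str]:
        record: Dict[str, str] = {}
        for attr, pattern in doc_class.attributes.items():
            match = re.search(pattern, text)
            if match:
                record[attr] = match.group(1)
        return record

    def process(self, text: str) -> str:
        doc_class = self.classify(text)
        record = self.extract(doc_class, text)
        return doc_class.handler(record)


runtime = Runtime()
runtime.deploy(DocumentClass(
    name="invoice",
    keywords=["invoice", "amount due"],
    attributes={"invoice_no": r"Invoice\s+#(\d+)",
                "amount": r"Amount due:\s*\$([\d.]+)"},
    handler=lambda r: f"queue payment of {r.get('amount', '?')} "
                      f"for invoice {r.get('invoice_no', '?')}",
))
runtime.deploy(DocumentClass(
    name="resume",
    keywords=["resume", "experience"],
    attributes={"email": r"([\w.+-]+@[\w.-]+)"},
    handler=lambda r: f"route candidate {r.get('email', '?')} to recruiting",
))

result = runtime.process("Invoice #1042\nAmount due: $250.00")
```

In this sketch, each `deploy` call plays the role of the authoring environment handing a document class, its schema attributes, and its business logic to the runtime; a real system would presumably back the schema with relational tables rather than an in-memory dictionary.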