Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Proceedings of the HLT-NAACL 2003 student research workshop - Volume 3

  • Authors:
  • Marti Hearst;Mari Ostendorf


  • Venue:
  • NAACLstudent '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: Proceedings of the HLT-NAACL 2003 student research workshop - Volume 3
  • Year:
  • 2003


Abstract

One morning each of us received a phone call from Ed Hovy. "Are you sitting down?" he asked. He told us that as a way to combat conference overload, and to promote interaction among communities, a joint conference had been proposed to combine HLT and NAACL. A diverse oversight committee had been formed, and according to Ed, this committee had been able to agree on two people -- and only two people -- as program co-chairs, because together we represented all of the vested interests. Marti was meant to represent the standards and tastes of the NAACL and the SIGIR crowds, and Mari the speech community, and both have been working on research contracts with HLT funders. Ed told us that if either of us said no, the entire enterprise would come crashing down. There are few better ways to convince busy people to become program co-chairs.

Throughout the process, Ed provided the vision for and the drive behind this conference. We salute him for making this idea a reality, and for his enthusiastic and energetic phone calls that kept everything going.

This is an exciting time for research in human language technologies. After years of relative calm, the field seems suddenly to be moving by leaps and bounds. Evidence of this can be found in our conference panel on "Preparing for a Surprise Language" (and as embodied in the short paper "Desperately Seeking Cebuano"). This panel will discuss the experiences of several groups of researchers who, at the behest of DARPA, acquired and developed language resources for an entirely new language within a span of only 10 days. This experiment took place in March of 2003, and the language in question was Cebuano, a language spoken in the Philippines. Participants successfully collected a large body of lexical and textual resources and developed a range of tools, including stemmers and POS taggers.
(In June, DARPA will announce a new surprise language.)

The existence of a variety of language resources, combined with advances in statistical analysis and modeling techniques, is resulting in fast-paced improvements in the field. Statistical parsers can now produce syntax trees for long sentences with high accuracy and great speed. Advances are starting to be made in automated semantic analysis. Great strides are being made in the sophistication and coverage of question answering systems. Speech recognition systems have achieved sufficiently high accuracy that it is now possible to do retrieval, information extraction and topic tracking on spoken documents. Large and growing collections of text and speech corpora -- and the promise of much more from the web -- have enabled many of these advances. New developments in weakly supervised and unsupervised learning algorithms are critical for taking advantage of many new data sources, and hence this was chosen as a special theme of the conference. Lexical resources such as FrameNet, WordNet, PropBank, MeSH, and the Penn TreeBank also play prominent roles in HLT advances.

As a field, human language technologies research should use, as motivation and guide, an understanding of the linguistic and cognitive bases of language. The invited talk by Dr. Elissa Newport, entitled "Statistical language learning: Mechanisms for language acquisition in human learners," should help enlighten the community by informing us about the latest in psycholinguistic research.

We received 162 submissions for full papers, of which 37 were accepted, resulting in a highly competitive acceptance rate of 22%. For the short (late-breaking) papers track, we received 80 submissions, of which 41 were accepted (2 later withdrawn).
Some of these will be presented as short talks, and others as posters. Seventeen demonstrations will be shown.

We were fortunate to be able to accept 15 papers that addressed the conference theme of unsupervised and weakly supervised methods. We also encouraged papers that described techniques that cross over or combine NLP, speech and/or IR, and several of the papers demonstrate this kind of crossover.

The full paper reviewing was done using a two-tier system. First, two first-tier reviewers read every paper. Then a third reviewer, known as the meta-reviewer, wrote their own review. Finally, the meta-reviewer summarized these reviews and introduced additional comments. In some cases, the meta-reviewer instigated discussion among the first-tier reviewers to work out controversial issues. The meta-reviewers also attended the program committee meeting in which all the papers were discussed and acceptances were decided.

Each short paper received at least two reviews. Those papers whose reviewers disagreed, or which received middling scores, were subsequently reviewed by a member of the program committee and the program co-chairs.

Paper submission and reviewing were done online using Marti's conference reviewing software (Conga), which she updated for this conference. Marti also maintained the conference website.