This paper presents an automatic genre classification model that implements a flexible classification scheme, i.e. a scheme capable of zero-, one- or multi-genre assignment. I suggest that this scheme is more appropriate for genres on the web, because many web pages belong to more than one genre or to none at all. The model that I propose relies on the distinction between the concepts of 'text type' and 'genre', both of which are 'inferred' rather than 'learned' from pre-labelled examples. The main drawback of this approach is that it cannot be fully evaluated, given the limitations of current genre research. However, I present a partial evaluation showing that the model performs competitively and remains stable when re-scaled.
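
To illustrate what zero-, one- or multi-genre assignment means in practice, the sketch below assigns a page every genre whose score reaches a cut-off, so the result may be empty, a single genre, or several. It is only a minimal illustration and not the model described in the paper: the function name `assign_genres`, the per-genre scores, the genre labels and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, List


def assign_genres(scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return every genre whose score reaches the threshold.

    The scores and the threshold are illustrative assumptions; how the
    scores are inferred is outside the scope of this sketch.
    """
    # Keep all genres at or above the cut-off; the list may be empty.
    return sorted(genre for genre, score in scores.items() if score >= threshold)


# Zero genres: no score reaches the threshold.
print(assign_genres({"blog": 0.2, "faq": 0.1}))   # []
# One genre.
print(assign_genres({"blog": 0.8, "faq": 0.1}))   # ['blog']
# Multiple genres.
print(assign_genres({"blog": 0.8, "faq": 0.7}))   # ['blog', 'faq']
```

A single-label classifier would instead always return the top-scoring genre, which forces a label onto pages that fit no genre and hides secondary genres on pages that fit several.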