Blogs have become valuable tools for generating and sharing knowledge; activity on blogs doubles every two hundred days. Numerous applications could use this large daily volume of information to derive useful interpretations. However, the dynamic nature of the blogosphere makes manual information extraction impractical, which motivates the development of new automated approaches. In this paper, we propose a component-based framework, grounded in software architecture principles, for creating blog crawlers. The framework provides services for blog analysis, including preprocessing, indexing, content extraction, classification, and tag recommendation. In addition, we report a case study: a blog recommendation system that supports student interaction in educational forums. This work also aims to demonstrate how the proposed framework reduces the effort of building a blog analysis application. Finally, other aspects of the developed application, such as the impact of system evolution, reusability, and instantiation cost, are discussed qualitatively.
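To make the idea of a component-based design concrete, the following is a minimal sketch of how interchangeable services could be chained over a blog post. It is not the paper's implementation: the names `BlogPost`, `BlogPipeline`, `extract_content`, `preprocess`, and `recommend_tags` are hypothetical, and the tag recommender is a simple term-frequency stand-in rather than the method used by the framework.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class BlogPost:
    """Hypothetical record passed between components."""
    url: str
    html: str
    text: str = ""
    tokens: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


# A component is any callable that takes a post and returns it, possibly enriched.
Component = Callable[[BlogPost], BlogPost]


def extract_content(post: BlogPost) -> BlogPost:
    # Assumption: crude tag stripping stands in for real content extraction.
    post.text = re.sub(r"<[^>]+>", " ", post.html)
    return post


def preprocess(post: BlogPost) -> BlogPost:
    # Lowercase the text and keep alphabetic tokens only.
    post.tokens = re.findall(r"[a-z]+", post.text.lower())
    return post


def recommend_tags(post: BlogPost, top_n: int = 2) -> BlogPost:
    # Assumption: the most frequent tokens longer than three characters become tags.
    counts = Counter(t for t in post.tokens if len(t) > 3)
    post.tags = [word for word, _ in counts.most_common(top_n)]
    return post


class BlogPipeline:
    """Runs the configured components in order; any of them can be swapped out."""

    def __init__(self, components: List[Component]) -> None:
        self.components = list(components)

    def run(self, post: BlogPost) -> BlogPost:
        for component in self.components:
            post = component(post)
        return post


if __name__ == "__main__":
    pipeline = BlogPipeline([extract_content, preprocess, recommend_tags])
    sample = BlogPost(
        url="http://example.org/post",
        html="<p>Python python python crawlers crawlers blog</p>",
    )
    print(pipeline.run(sample).tags)
```

Because each service shares the same call signature, a new application would only need to supply or replace the components it cares about, which is one plausible way the reuse described above could be realized.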