Message boards are part of the Internet known as the 'Invisible Web' and pose many problems for traditional search engine spiders. Their dynamic content usually lies very deep and is difficult to search. In addition, many of these sites change their locations, servers, or URLs almost daily, which creates problems for the indexing process. Nevertheless, as the World Wide Web has grown, and with the help of search engines, message boards have come to represent an important source of information for solving a variety of problems. Another interesting feature of this type of web page is the large community that develops around it, expressing different opinions and discussing various topics. Using special retrieval and indexing algorithms, mostly based on the HTML DOM tree, we have developed an algorithm that obtains detailed and accurate trend statistics for use in marketing solutions and analysis tools. Combined with the services provided by traffic ranking sites such as Alexa.com, we can also provide geo-targeting functionality that delivers even more accurate results to the end user, such as the percentage of a forum's visitors who come from a given country.
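To make the idea of DOM-based trend statistics concrete, the following is a minimal sketch in Python and not the algorithm developed in the paper, whose details the abstract does not give. It assumes a hypothetical forum markup in which each post is a `<div class="post" data-date="YYYY-MM-DD">` element. The names `PostExtractor`, `keyword_trend`, and `country_share` are illustrative, and the per-country visit counts are assumed to be supplied by the caller rather than fetched from any traffic ranking service.

```python
from collections import Counter
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect (date, text) pairs from <div class="post" data-date="..."> elements.

    The class name and the data-date attribute are assumptions about the
    forum markup; real message boards differ and need their own rules.
    """

    def __init__(self):
        super().__init__()
        self.posts = []      # finished (date, text) pairs
        self._depth = 0      # <div> nesting depth inside the current post
        self._date = ""
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attr = dict(attrs)
        if self._depth:
            # A nested <div> inside a post: track it so we find the right close.
            self._depth += 1
        elif "post" in (attr.get("class") or "").split():
            self._depth = 1
            self._date = attr.get("data-date") or ""
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Join the text fragments and normalize whitespace.
                text = " ".join("".join(self._chunks).split())
                self.posts.append((self._date, text))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def keyword_trend(html, keyword):
    """Count posts mentioning `keyword` per month (YYYY-MM), ignoring case."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    needle = keyword.lower()
    return Counter(
        date[:7] for date, text in parser.posts if needle in text.lower()
    )


def country_share(visits_by_country, country):
    """Percentage of all visits that come from `country`."""
    total = sum(visits_by_country.values())
    if not total:
        return 0.0
    return 100.0 * visits_by_country.get(country, 0) / total
```

The sketch counts posts rather than raw keyword occurrences, so one long post that repeats a term does not dominate a month's figure. A production system would also have to cope with malformed HTML, pagination, and the frequent URL changes mentioned above.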