With over 800 million pages covering most areas of human endeavor, the World Wide Web is fertile ground for data mining research that can make information search more effective. Today, Web surfers access the Web through two dominant interfaces: clicking on hyperlinks and searching via keyword queries. This process is often tentative and unsatisfactory. Better support is needed for expressing one's information need and for dealing with search results in more structured ways than are available now. Data mining and machine learning have significant roles to play toward this end.

In this paper we survey recent advances in learning and mining problems related to hypertext in general and the Web in particular. We review the continuum from supervised to semi-supervised to unsupervised learning problems, highlight the specific challenges that distinguish data mining in the hypertext domain from data mining in the context of data warehouses, and summarize the key areas of recent and ongoing research.