As foreign commerce grows in importance, so do the opportunities for fraudulent behaviour. Governments must therefore detect fraud as soon as it takes place if they are to avoid the profound damage it may cause to society. Although current fraud detection systems can do this with reasonable accuracy, they still suffer from the inconsistencies and ambiguities of unstructured databases, especially in customs. To deal with this problem, we propose a twofold approach: building a new structured database and keeping it as clean as possible; and mining the current database for the desired information. As a first contribution, we present a methodology for mining product attribute-value pairs from unstructured text datasets, bringing more structure to the current customs database. As our second contribution, we introduce a system for building a structured database for the Brazilian customs and keeping it as free of redundancy as possible. Both systems aim to build datasets capable of improving the accuracy of fraud detection systems.