Data Mining

Razvan Andonie

Overview:

The old adage "knowledge is power" also applies to fast-growing application areas such as e-commerce and Internet knowledge processing. The technologies for generating and collecting data have been advancing rapidly. Today, lack of data is no longer the problem; the inability to extract useful information from data is. The explosive growth of data and databases creates a need for new technologies and tools that turn data into useful information and knowledge intelligently and automatically.

Data mining (DM) has therefore become a research area of increasing importance. DM is the search for valuable information in large volumes of data. It is the nontrivial extraction of implicit, previously unknown, and potentially useful information, such as knowledge rules, constraints, and regularities, from data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques. Many companies have recognized DM as an important technique that will affect their performance. DM is an active research area, and work is ongoing to bring statistical analysis and artificial intelligence (AI) techniques together to address these issues. DM technology is helping businesses everywhere work smarter by revealing unknown patterns within existing archives.

Contents:

This course covers recent advances in the application of soft computing to DM and knowledge discovery in databases. It focuses on some of the hardest, still unsolved issues of data mining, such as the understandability of patterns, finding complex relationships between attributes, handling missing and noisy data, mining very large datasets, change detection in time series, and integration of the discovery process with database management systems.

Prerequisites:

*      Computational Intelligence

Bibliography:

*      Abe, S. Pattern Classification - Neuro-fuzzy Methods and Their Comparison, Springer, London, 2001.

*      Communications of the ACM, August 2002, Volume 45, Number 8.

*      Kantardzic, M. Data Mining: Concepts, Models, Methods, and Algorithms, Wiley-Interscience and IEEE Press, 2003.

*      Chen, Z. Data Mining and Uncertain Reasoning, John Wiley, New York, 2001.

*      Cios, K., Pedrycz, W., and Swiniarski, R. Data Mining: Methods for Knowledge Discovery, Kluwer, 1998.

Links:

*      KDnuggets Data Mining Guide

*      Software Suites for Data Mining

*      Machine Learning Datasets

*      Time Series Data Library

*      NEFCLASS and other neuro-fuzzy products (University of Magdeburg)

*      Kovalerchuk's page on Data Mining in Finance