Data Mining - Definition

What is data mining? 

Data mining is the process of searching large data repositories with a variety of techniques to find useful patterns. Data mining can be used for:

  1. Predicting an outcome
  2. Identifying underlying patterns in data

There is no single prescribed method for approaching data mining. At a minimum, however, a data miner or data mining team should be able to understand the business operation or process being analyzed, prepare and manipulate the available data, reduce the number of variables, and identify the optimal variables to use in the data mining techniques or algorithms.
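
As a rough illustration of the variable reduction step, the sketch below ranks candidate variables by the absolute value of their Pearson correlation with the outcome and keeps the strongest ones. This is only one of many possible screening approaches, not a recommended method. The function names (pearson, rank_variables), the column names, and the toy data are all hypothetical and chosen only for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no linear relationship to measure.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_variables(columns, target, keep=2):
    """Return the names of the `keep` variables most correlated with target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Hypothetical toy data: three candidate variables and one outcome.
columns = {
    "monthly_spend": [10, 20, 30, 40, 50],
    "store_id": [7, 7, 7, 7, 7],  # constant, so it carries no signal
    "visits": [1, 3, 2, 5, 4],
}
target = [12, 19, 33, 41, 48]

selected = rank_variables(columns, target, keep=2)
print(selected)
```

A correlation screen like this only detects linear relationships with the outcome, so in practice it would be combined with an understanding of the business process rather than used on its own.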
 
A useful reference on how to approach a project is CRISP-DM (Cross-Industry Standard Process for Data Mining). The methodology is neither industry-specific nor software-specific. The user guide is available as a free download at http://www.crisp-dm.org
 
Initially, this section will show you how to use some of the tools commonly used during the data mining process. Actual examples may be presented at some point in the future.
 
Tools presented: SAS and SPSS Clementine
 
SAS PC 9.1 Splash Screen