Saturday, 7 January 2017

Data Mining


Data mining is the extraction of hidden information from data using algorithms. It helps pull useful information out of large volumes of data, which can then be interpreted to support business decision-making. It is essentially a technical and mathematical process that relies on software and specially designed programs. Data mining is also known as Knowledge Discovery in Databases (KDD), since it involves searching for implicit information in large databases. The main kinds of data mining software are: clustering and segmentation software, statistical analysis software, text analysis, mining and information retrieval software, and visualization software.

Data mining is growing in importance because of its wide applicability. It is increasingly used in business applications to understand and then predict valuable information such as customer buying behavior and buying trends, customer profiles, and industry analysis. At its core it extends statistical methods such as regression, but more advanced technologies make it a decision-making tool as well. Some advanced data mining tools also offer database integration, automated model scoring, export of models to other applications, business templates, incorporation of financial information, computation of target columns, and more.
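To make the link to regression concrete, here is a minimal sketch of an ordinary least-squares line fit in plain Python. The function name `fit_line` and the monthly purchase figures are made up for illustration; real data mining tools wrap far more elaborate versions of the same idea.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: purchases per month for one customer segment.
months = [1, 2, 3, 4, 5]
purchases = [12, 14, 16, 18, 20]

slope, intercept = fit_line(months, purchases)
forecast = slope * 6 + intercept  # predicted purchases for month 6
print(slope, intercept, forecast)
```

With this toy data the trend is perfectly linear, so the fitted line simply continues it into month 6. Real buying data is noisy, and the same fit would then give only a rough estimate of the trend.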

Some of the main applications of data mining are in direct marketing, e-commerce, customer relationship management, healthcare, the oil and gas industry, scientific tests, genetics, telecommunications, financial services and utilities. The different kinds of data mining are: text mining, web mining, social network data mining, relational database mining, pictorial data mining, audio data mining and video data mining.

Some of the most popular data mining techniques and underlying concepts are: decision trees, information gain, probability, probability density functions, Gaussians, maximum likelihood estimation, Gaussian Bayes classification, cross-validation, neural networks, instance-based learning (also called case-based, memory-based or non-parametric learning), regression algorithms, Bayesian networks, Gaussian mixture models, K-Means and hierarchical clustering, Markov models, support vector machines, game tree search and alpha-beta search algorithms, game theory, artificial intelligence, A-star heuristic search, hill climbing, simulated annealing and genetic algorithms.
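As an illustration of one technique from this list, here is a minimal sketch of K-Means clustering in plain Python. The function name `kmeans`, the sample points and the starting centroids are assumptions chosen for the example; production tools typically pick starting centroids automatically and handle much larger data sets.

```python
def kmeans(points, centroids, iterations=10):
    """Cluster points (tuples of floats) around the given starting centroids."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place

        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers described by (visits per month, average basket size).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

Here the two groups are well separated, so the algorithm settles after a couple of passes with three points in each cluster. Passing the starting centroids explicitly keeps the sketch deterministic; random initialisation is more common in practice.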

Some popular data mining software includes: Connexor Machines, Copernic Summarizer, Corpora, DocMINER, DolphinSearch, dtSearch, DS Dataset, Enkata, Entrieva, Files Search Assistant, FreeText Software Technologies, Intellexer, Insightful InFact, Inxight, ISYS:desktop, Klarity (part of Intology tools), Leximancer, Lextek Onix Toolkit, Lextek Profiling Engine, Megaputer Text Analyst, Monarch, Recommind MindServer, SAS Text Miner, SPSS LexiQuest, SPSS Text Mining for Clementine, Temis-Group, TeSSI®, Textalyser, TextPipe Pro, TextQuest, Readware, Quenza, VantagePoint, VisualText(TM) by TextAI, and Wordstat. There is also free software and shareware such as INTEXT, S-EM (Spy-EM), and Vivisimo/Clusty.

