Back to basics: What is Data Mining? - My Datafication

25 September, 2018


Definition

Data Mining is the process of discovering patterns and generating new, valuable information from large data sets to solve problems and support decision making. It uses methodologies and techniques from the intersection of data management, statistics, and machine learning to identify previously unknown patterns and relationships, to classify and group data, and to summarize what is found.

Most Useful Data Mining Techniques


1. Clustering
A descriptive data mining technique that groups data objects so that objects in the same cluster are similar to one another and dissimilar to objects in other clusters.
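
As a rough illustration of the idea, here is a minimal sketch of k-means, one common clustering algorithm (the post does not name a specific one). The function name `kmeans`, the sample points, and the starting centroids are all made up for this example, and the code uses only the Python standard library.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Minimal k-means: alternate an assignment step and an update step."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Made-up 2-D points that form two visually separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No labels are given to the algorithm; the two groups emerge only from the distances between the points, which is what makes the technique descriptive rather than predictive.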

2. Classification
A predictive data mining technique that assigns items of a collection to target classes, i.e. categories. The goal of classification is to accurately predict the target class for each case in the data.
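
To make this concrete, below is a small sketch of a k-nearest-neighbours classifier, chosen here only because it is short (the post does not prescribe an algorithm). The function `knn_predict`, the class names, and the training cases are hypothetical.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Predict the class of `query` by majority vote among its k nearest labeled cases."""
    nearest = sorted(training, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up labeled cases: (features, target class).
training = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((2.0, 1.0), "small"),
    ((8.0, 8.0), "large"),
    ((9.0, 8.5), "large"),
    ((8.5, 9.0), "large"),
]

print(knn_predict(training, (1.2, 1.4)))  # small
print(knn_predict(training, (8.7, 8.2)))  # large
```

Unlike the clustering case, every training case carries a known target class, and the prediction is always one of those classes.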

3. Regression
A predictive data mining technique that predicts a continuous variable, e.g. a stock price, from a given data set. Regression and classification are used to solve similar problems, so they are frequently confused. Both are predictive data mining techniques, but regression predicts a numeric or continuous value, while classification assigns data to discrete categories, i.e. predicts the bucket a data object falls into.
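
The contrast with classification may be easier to see in code. The sketch below fits a straight line by ordinary least squares, the simplest form of regression; the helper `fit_line` and the data (which lie exactly on y = 2x + 1 so the result is easy to check) are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predict y for an unseen x = 6
print(slope, intercept, prediction)  # 2.0 1.0 13.0
```

The output is a number on a continuous scale rather than one of a fixed set of categories.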

4. Association Rules
A rule-based descriptive data mining technique that explores the given data set and finds frequent patterns, correlations, associations, or causal structures. Given a set of transactions, association rule mining looks for rules to predict the occurrence of a specific item based on the occurrences of the other items in the transaction.
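
Two standard measures for judging such a rule are support (how often the items appear together) and confidence (how often the rule holds when its left-hand side is present). The sketch below computes both for a single candidate rule; the transactions and helper names are made up, and a full rule miner such as Apriori would additionally search over all candidate item sets.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    return count(antecedent | consequent, transactions) / count(antecedent, transactions)

# Made-up shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

# Candidate rule: {diapers} -> {beer}
rule_support = support({"diapers", "beer"}, transactions)
rule_confidence = confidence({"diapers"}, {"beer"}, transactions)
print(rule_support, rule_confidence)  # 0.6 0.75
```

Here the rule appears in 3 of 5 baskets, and 3 of the 4 baskets with diapers also contain beer.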

The above techniques can be grouped into two big categories, Supervised learning (Classification and Regression) and Unsupervised learning (Clustering and Association Rules), which differ in how they process data. Supervised learning algorithms are trained on examples where both the input and the desired output are known, and they learn the mapping function that produces the output from the input. Unsupervised learning algorithms, on the other hand, have only input data and no corresponding output. The goal of unsupervised learning is to model the underlying structure or distribution of the data in order to identify hidden patterns and extract previously unknown knowledge.

Data Mining Tools

1. RapidMiner
2. R
3. Python
4. Weka
5. KNIME
6. Microsoft Analysis Services

and many more...

You can find more interesting definitions every data scientist should know here! If you have any topics or definitions you would like to hear about, just leave a comment below. If you like the blog, don't forget to "Like" the page on Facebook to keep up to date with new posts.


