Learn Data Mining Through Excel: A Step-by-Step Approach for Understanding Machine Learning Methods

Ebook, 306 pages, 2 hours

Name: Learn Data Mining Through Excel: A Step-by-Step Approach for Understanding Machine Learning Methods
Author: Hong Zhou
ISBN: 9781484259825

About this ebook

Use popular data mining techniques in Microsoft Excel to better understand machine learning methods.

Software tools and programming language packages take data input and deliver data mining results directly, offering no insight into the working mechanics and creating a chasm between input and output. This is where Excel can help.

Excel allows you to work with data in a transparent manner. When you open an Excel file, data is visible immediately and you can work with it directly. Intermediate results can be examined while you are conducting your mining task, offering a deeper understanding of how data is manipulated and results are obtained. These are critical aspects of the model construction process that are hidden in software tools and programming language packages.

This book teaches you data mining through Excel. You will learn how Excel has an advantage in data mining when the data sets are not too large. It can give you a visual representation of data mining, building confidence in your results. You will go through every step manually, which not only offers an active learning experience but also teaches you how the mining process works and how to find the hidden patterns inside the data.

What You Will Learn

Comprehend data mining using a visual step-by-step approach
Build on a theoretical introduction of a data mining method, followed by an Excel implementation
Unveil the mystery behind machine learning algorithms, making a complex topic accessible to everyone
Become skilled in creative uses of Excel formulas and functions
Obtain hands-on experience with data mining and Excel

Who This Book Is For
Anyone interested in learning data mining or machine learning, especially visual learners of data science and people skilled in Excel who would like to explore data science topics or expand their Excel skills. A basic or beginner-level understanding of Excel is recommended.

Language: English
Publisher: Apress
Release date: Jun 13, 2020
ISBN: 9781484259825

Article
Copilot Pro For Excel
Mar 7, 2024
Unlike the other Copilot Pro tools, Copilot for Excel is labelled prominently as “beta”. But even in this qualified state, it has the promise of being a game-changer for anyone who needs to work with data but doesn’t want to become an expert in writi
2 min read
Covid: How Excel May Have Caused Loss Of 16,000 Test Results In England
The Guardian
Article
Covid: How Excel May Have Caused Loss Of 16,000 Test Results In England
Oct 5, 2020
2 min read
The Race To Exascale Supercomputers
Maximum PC
Article
The Race To Exascale Supercomputers
Jun 21, 2022
9 min read
Chinese Students' Dream Device Defeats Japan's Most Powerful Supercomputer In World Contest
Post Magazine
Article
Chinese Students' Dream Device Defeats Japan's Most Powerful Supercomputer In World Contest
Jun 15, 2022
A small computer developed by Chinese students outperformed Japan's most powerful machine in solving a major complex data problem related to artificial intelligence, according to the latest global ranking. Supercomputer Fugaku in Japan has nearly 4 m
3 min read
“You Don’t Need A Computer, Let Alone One With 75,000 Processor Cores, To Think About The Parts Of A Problem”
PC Pro Magazine
Article
“You Don’t Need A Computer, Let Alone One With 75,000 Processor Cores, To Think About The Parts Of A Problem”
Dec 10, 2020
9 min read
A.i. Coding
Linux Format
Article
A.i. Coding
Aug 22, 2023
16 min read
This PC Does Not Exist
Maximum PC
Article
This PC Does Not Exist
May 23, 2023
7 min read

Related categories

Skip carousel

Reviews for Learn Data Mining Through Excel

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Learn Data Mining Through Excel - Hong Zhou

H. Zhou, Learn Data Mining Through Excel, https://doi.org/10.1007/978-1-4842-5982-5_1

1. Excel and Data Mining

Hong Zhou, University of Saint Joseph, West Hartford, CT, USA

Let’s get right to the topic. Why do we need to learn Excel in our data mining endeavor? It is true that there are quite a few outstanding data mining software tools, such as RapidMiner and Tableau, that make the mining process easy and straightforward. In addition, the programming languages Python and R have a large number of reliable packages dedicated to various data mining tasks. So what is the purpose of studying data mining or machine learning through Excel?

Why Excel?

If you are already an experienced data mining professional, I would say that you are asking the right question and you probably should not read this book. However, if you are a beginner in data mining, a visual learner, an educator, or someone who wants to understand the mathematical background behind some popular data mining techniques, then this book is right for you, and it is probably the first book you should read before you start your data mining journey.

Excel allows you to work with data in a transparent manner: when an Excel file is opened, the data is visible immediately, and so is every step of data processing. Intermediate results are contained in the Excel worksheet and can be examined while you are conducting your mining task. This allows you to obtain a deep and clear understanding of how the data are manipulated and how the results are obtained. Other software tools and programming languages hide critical aspects of the model construction process. For most data mining projects, the goal is to find the hidden patterns inside the data, not to study the algorithm, so hiding the detailed process is beneficial to the users of the tools or packages. But it is not helpful for beginners, visual learners, or those who want to understand how the mining process works. Let me use the k-nearest neighbors (K-NN) method to illustrate the learning differences between RapidMiner, R, and Excel. Before we do that, we need to understand several data mining terms.

There are two types of data mining techniques: supervised and unsupervised. Supervised methods first require a training dataset to train the software programs or algorithms (such programs or algorithms are often referred to as machines). Programs are trained to reach an optimal state called a model, which is why the training process is also called modeling. Data mining methods can also be categorized into parametric and nonparametric methods. For parametric methods, a model is just a set of parameters or rules, obtained through the training process, that are believed to allow the programs to work well with the training dataset. Nonparametric methods do not generate a set of parameters. Instead, they dynamically evaluate the incoming data based on the existing dataset. You may be confused by such definitions at this time. They will make sense soon.

What is a training dataset? In a training dataset, the target variable (also called the label, target, dependent variable, outcome variable, or response), whose value is to be predicted, is given or known. The value of the target variable depends on the values of other variables, which are usually called attributes, predictors, or independent variables. Based on the attribute values, a supervised data mining method computes (that is, predicts) the value of the target variable. Some computed target values might not match the known target values in the training dataset. A good model is a set of parameters or rules that minimizes these mismatches.
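To make the idea of mismatches concrete, here is a minimal Python sketch. The tiny training dataset, the single attribute (income), and the one-parameter threshold rule are all made up for illustration and are not taken from the book; the point is only to show how computed target values are compared with known ones.

```python
# Hypothetical training dataset: each record has one attribute (income, in
# thousands) and a known target value (the response).
training_data = [
    (25, "No"),
    (40, "No"),
    (55, "Yes"),
    (70, "Yes"),
    (45, "Yes"),
]


def predict(income, threshold):
    """A made-up one-parameter model: predict Yes at or above the threshold."""
    return "Yes" if income >= threshold else "No"


def count_mismatches(data, threshold):
    """Count records whose computed target differs from the known target."""
    return sum(1 for income, known in data if predict(income, threshold) != known)


# "Training" here just means trying candidate parameter values and keeping
# the one with the fewest mismatches on the training dataset.
candidates = [30, 45, 60]
best_threshold = min(candidates, key=lambda t: count_mismatches(training_data, t))

for t in candidates:
    print(t, count_mismatches(training_data, t))
print("best threshold:", best_threshold)
```

In this toy setting the "model" is the single threshold value that survives the comparison, which is the sense in which a parametric model is just a set of parameters.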

In a supervised data mining method, a model is usually constructed to work on future datasets with unknown target values. Such future datasets are commonly called scoring datasets. In an unsupervised data mining method, however, there is no training dataset, and the model is an algorithm that can be applied directly to the scoring datasets. The k-nearest neighbors method is a supervised data mining technique.

Suppose we want to predict if a person is likely to accept a credit card offer based on the person’s age, gender, income, and number of credit cards they already have. The target variable is the response to the credit card offer (assume it is either Yes or No), while age, gender, income, and number of existing credit cards are the attributes. In the training dataset, all variables including both the target and attributes are known. In such a scenario, a K-NN model is constructed through the use of the training dataset. Based on the constructed model, we can predict the responses to the credit card offer of people whose information is stored in the scoring dataset.
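The scenario can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the records, the 0/1 coding of gender, the choice of Euclidean distance, and k = 3 are hypothetical and are not the book's data or its Excel procedure.

```python
import math
from collections import Counter

# Hypothetical training dataset: (age, gender coded 0/1, income in thousands,
# number of existing credit cards) paired with the known response.
training = [
    ((25, 0, 30, 1), "No"),
    ((32, 1, 45, 2), "No"),
    ((41, 0, 80, 3), "Yes"),
    ((50, 1, 95, 4), "Yes"),
    ((38, 1, 70, 2), "Yes"),
    ((29, 0, 35, 0), "No"),
]

# Hypothetical scoring dataset: attributes are known, the response is not.
scoring = [(45, 0, 85, 3), (27, 1, 32, 1)]


def distance(a, b):
    """Euclidean distance between two attribute tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(record, training_data, k=3):
    """Predict by majority vote among the k closest training records."""
    nearest = sorted(training_data, key=lambda row: distance(record, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


for person in scoring:
    print(person, "->", knn_predict(person, training))
```

Note that the sketch compares raw attribute values, so income dominates the distance. In practice attributes are usually rescaled to comparable ranges first; that step is left out here to keep the example short.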

In RapidMiner, one of the best data mining tools, the prediction process is as follows: retrieve both the training data and scoring data from the repository ➤ set the role for the training data ➤ apply the K-NN operator on the training data to construct the model ➤ connect the model and the scoring data to the Apply Model operator. That’s it! You can now execute the process and obtain the result. Yes, very straightforward. This is shown in Figure 1-1. Be aware that there is no model validation in this simple process.

Figure 1-1. K-NN model in RapidMiner

Applying the K-NN method is very simple in R, too. After loading the class library, we read the training data and scoring data and call the K-NN function. That finishes the job, and the result is ready to view. This is demonstrated in Figure 1-2. Note that lines starting with # are comments.

Figure 1-2. K-NN in R

The knowledge you have gained from the preceding tasks is just enough to apply the K-NN data mining method. But if you are trying to understand, step by step, why and how K-NN works, you will need a lot more information. Excel gives you the opportunity to go through a step-by-step analysis of a dataset, during which you can develop a solid understanding of the K-NN algorithm. With this solid understanding, you can then become more proficient in using other powerful tools or programming languages. Most importantly, you will have a better understanding of the quality and value of your data mining results. You will see that in later chapters.

Of course, Excel is much more limited in data mining than R, Python, and RapidMiner. Excel can only work with datasets up to a limited size (a single worksheet holds at most 1,048,576 rows and 16,384 columns). Meanwhile, some data mining techniques are too complicated to be practiced in Excel. Nonetheless, Excel gives us a direct and visual understanding of data mining mechanisms. In addition, Excel is naturally suitable for data preparation.

Today, because of the software tools and other packages, most effort in a data mining task is spent on understanding the task (including the business understanding and data understanding), preparing the data, and presenting the results. Less than 10% of the effort is spent on the modeling process. The process of preparing the data for modeling is called data engineering. Excel has an advantage in data engineering when the datasets are not too large because it gives us a visual representation of data engineering, which allows us to be more confident in our data preparation process.

As an experienced educator, I realize that students can better develop a deep understanding of data mining methods if these methods are also explained through step-by-step instructions in Excel. Studying through Excel unveils the mystery behind data mining or machine learning methods and makes students more confident in applying these methods.

Did I just mention machine learning? Yes, I did. Machine learning is another buzz phrase today. What is machine learning? What is the difference between data mining and machine learning?

Most efforts to differentiate data mining from machine learning are not successful because the two cannot be clearly separated per se. At this moment, I would suggest that we treat them as the same. But if I must tell the difference between data mining and machine learning, I would say that machine learning focuses more on supervised methods, while data mining includes both supervised and unsupervised methods.

Prepare Some Excel Skills

There are quite a few Excel skills to learn in this book. I will explain some of them in detail when we need to use them. However, there are several fundamental Excel skills and functions that we need to be familiar with before we start talking about data mining.

Formula

The formula is the most important feature of Excel. Writing a formula is like writing a programming statement. In Excel, a formula always starts with an equal sign (=).

Upon opening an Excel file, we are greeted with a table-like worksheet. Yes, every worksheet is a huge table. One reason why Excel is naturally suitable for data storage, analysis, and mining is that data are automatically arranged in a table format in Excel. Each cell in the big table has a name, or so-called reference. By default, each column is labeled with a letter, while each row is labeled with a number. For example, the very first cell at the top-left corner is cell A1, that is, column A and row 1. The content of a cell, whatever it is, is represented by the cell reference.

Enter number 1 in cell A1. The value of cell A1 is 1 and A1 represents 1 at this moment.

Enter the formula “=A1*10” (without the double quotation marks) in cell B1 and hit the Enter key. Note that the formula starts with =. Be aware that this is the only time a formula is presented inside a pair of double quotation marks in this book. From now on, all formulas are presented directly without quotation marks.

Enter the text A1 * 10 in cell C1. Because the text does not start with =, it is not a formula.

Our worksheet looks like Figure 1-3.

Figure 1-3. Excel formula
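For readers who think in code, the three cells can be mimicked with a small Python sketch. This is only an analogy under stated assumptions: the dictionary named sheet and the helper cell_value are hypothetical, and the helper understands nothing beyond the toy form =REF*NUMBER. It is not how Excel evaluates formulas internally.

```python
# A toy worksheet: cell references map to what was typed into each cell.
sheet = {
    "A1": 1,           # a number
    "B1": "=A1*10",    # a formula, because it starts with =
    "C1": "A1 * 10",   # plain text, because it does not start with =
}


def cell_value(sheet, ref):
    """Return what the cell displays (handles only the toy form =REF*NUMBER)."""
    content = sheet[ref]
    if isinstance(content, str) and content.startswith("="):
        source_ref, factor = content[1:].split("*")
        # The reference stands for whatever the referenced cell contains.
        return cell_value(sheet, source_ref.strip()) * int(factor)
    return content


for ref in ("A1", "B1", "C1"):
    print(ref, cell_value(sheet, ref))
```

Changing sheet["A1"] and calling cell_value(sheet, "B1") again gives a new result, which mirrors the way a reference represents the current content of a cell rather than a fixed number.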

Autofill or Copy

Autofill is another critical feature of Excel which makes Excel capable of working with a relatively large dataset. Autofill is also called copy by many people.

Let’s learn autofill through the following experiment:

Enter 1 in cell A1.

Enter 2 in cell A2.

Select both cells A1 and A2.

Release the left mouse button.

Move the mouse cursor to the lower-right corner of cell A2 until the cursor becomes a black cross (shown in Figure 1-4).

Figure 1-4. Cursor becomes a black cross

Press down the left mouse button and drag down to cell A6.

The cells A1:A6 are automatically filled with numbers 1, 2, 3, 4, 5, and 6. This process is called autofill. Some people call it copy, too. But more precisely, this process is autofill.
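The result of this experiment can be mimicked in Python. The sketch assumes that the fill simply continues the constant step between the two selected numbers, which matches the outcome described above; autofill_series is a hypothetical helper written for this illustration, not an Excel function.

```python
def autofill_series(seed, total):
    """Extend a seed of two numbers to `total` values using their constant step."""
    first, second = seed
    step = second - first
    return [first + step * i for i in range(total)]


# Cells A1 and A2 hold 1 and 2; dragging down to A6 fills six cells in total.
print(autofill_series([1, 2], 6))
```

The same helper with a seed of 10 and 20 would continue in steps of 10, which is why selecting two cells before dragging matters: the pair tells the fill what step to use.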

Let’s conduct another experiment:

Select cell B1 (make sure that B1 still has the formula =A1*10). Release the left mouse button.

Move the mouse cursor to the lower-right corner of cell B1 until the cursor becomes a black cross.

Drag down the mouse cursor to cell B6. Our worksheet looks like
