Overview
According to ABC News, China has the highest concentration of hackers in the world. The ability to discover patterns in data may help prevent future attacks and reveal attacker affiliations. K-means is an unsupervised learning method used to find similarities in unlabeled data. Its goal is to group data points based on commonalities in their structure. The resulting groups (clusters) can then be fed into other models to make predictions. In this lab, we are going to illustrate how to group data into clusters using k-means.
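As a rough illustration of the idea, the sketch below clusters a handful of made-up 2-D points with scikit-learn's KMeans and then assigns a new point to one of the learned clusters. The data values, the choice of two clusters, and the variable names are assumptions for demonstration only and are not taken from the lab data set.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of 2-D points.
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.3],
])

# The number of clusters must be chosen up front (assumed to be 2 here).
# random_state makes the run repeatable.
model = KMeans(n_clusters=2, n_init=10, random_state=0)

# fit_predict learns the centroids and returns one cluster label per row.
labels = model.fit_predict(X)

# Cluster numbers are arbitrary; what matters is which points share a label.
print(labels)
print(model.cluster_centers_)

# Assign a new, unseen point to the nearest learned centroid.
new_label = model.predict(np.array([[7.5, 8.1]]))[0]
print(new_label)
```

Because no labels are supplied, the model only reports which points belong together. Interpreting what each cluster means is left to the analyst.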
Outcomes
In this lab, you will learn to:
- Understand what k-means is used for.
- Understand how to build a model using k-means.
- Understand how to evaluate k-means modeled results.
Key terms and descriptions
pandas
Pandas is an open-source data analysis and manipulation library for Python.
matplotlib
Matplotlib is a Python library for 2-D plotting.
sklearn
Sklearn is the import name of the scikit-learn library, which lets you do machine learning in Python.
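The three libraries above typically work together: pandas holds the data, sklearn builds the clusters, and matplotlib draws them. The sketch below shows one possible way to combine them. The column names and values are hypothetical, and the non-interactive "Agg" backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical records; column names are made up for illustration.
df = pd.DataFrame({
    "requests_per_min": [5, 7, 6, 90, 95, 88],
    "failed_logins": [0, 1, 0, 20, 25, 22],
})

# pandas: select the numeric columns to cluster on.
features = df[["requests_per_min", "failed_logins"]]

# sklearn: fit k-means and store each row's cluster label in the table.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
df["cluster"] = km.fit_predict(features)

# matplotlib: scatter plot with one color per cluster.
fig, ax = plt.subplots()
ax.scatter(df["requests_per_min"], df["failed_logins"], c=df["cluster"])
ax.set_xlabel("requests_per_min")
ax.set_ylabel("failed_logins")
ax.set_title("k-means clusters (toy data)")

print(df)
```

To view the plot, call plt.show() in an interactive session or fig.savefig with a file name of your choice.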