
Introduction to Machine Learning Using Python

As the title suggests, this article is aimed at new developers who want to be part of the digital revolution that is data science but have minimal knowledge of machine learning and Python.

What is Machine Learning?

Machine learning is a field of computer science that often uses statistical techniques to give computers the ability to “learn” from data without being explicitly programmed. It is a subfield of Artificial Intelligence (AI). In practice, this means we feed data into an algorithm and use the result to make predictions about what might happen in the future.


In 1997, Tom Mitchell gave a “well-posed” definition that has proven more useful to engineers: “A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.”

So if you want your program to predict, for example, traffic patterns at a busy intersection (task T), you can run it through a machine learning algorithm with data about past traffic patterns (experience E) and, if it has successfully “learned”, it will then do better at predicting future traffic patterns (performance measure P).
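Mitchell's definition can be made concrete with a toy version of the traffic example. The sketch below is only an illustration: the traffic pattern, the nearest-neighbour predictor, and all names (`hourly_traffic`, `predict`, `mean_abs_error`) are made up here and do not come from any library.

```python
# Task T: predict the number of cars at an intersection for a given hour.
# Experience E: hours for which we have already observed the car count.
# Performance measure P: mean absolute error over all 24 hours (lower is better).

def hourly_traffic(hour):
    # Hypothetical ground truth: rush hours (7-9 and 16-18) are busier.
    return 180 if hour in (7, 8, 9, 16, 17, 18) else 100

def predict(experience, hour):
    # 1-nearest-neighbour: reuse the count observed at the closest known hour.
    nearest = min(experience, key=lambda seen: abs(seen - hour))
    return experience[nearest]

def mean_abs_error(experience):
    errors = [abs(predict(experience, h) - hourly_traffic(h)) for h in range(24)]
    return sum(errors) / len(errors)

# Little experience: only two observed hours.
small_experience = {h: hourly_traffic(h) for h in (0, 12)}
# More experience: every second hour has been observed.
large_experience = {h: hourly_traffic(h) for h in range(0, 24, 2)}

error_small = mean_abs_error(small_experience)
error_large = mean_abs_error(large_experience)
print("error with little experience:", error_small)
print("error with more experience:", error_large)
```

With more experience E, the error measured by P on task T goes down, which is exactly what the definition calls learning.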

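Among the different types of ML tasks, a crucial distinction is drawn between supervised learning, where the algorithm is trained on examples that include the correct answer (a label), and unsupervised learning, where it must find structure in data that has no labels.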
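As a rough illustration of the supervised/unsupervised distinction, the sketch below uses scikit-learn on a handful of made-up points. The data and label names are invented for this example, and the two algorithms (k-nearest neighbours and k-means) are just one possible choice for each category.

```python
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

# Toy data: two well-separated groups of 2-D points (made up for illustration).
points = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]]
labels = ["low", "low", "low", "high", "high", "high"]

# Supervised: the algorithm is given both the inputs and the known labels.
classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(points, labels)
predicted = classifier.predict([[1.1, 0.9], [7.9, 8.0]])
print("predicted labels:", list(predicted))

# Unsupervised: the algorithm is given only the inputs and groups them itself.
clusterer = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(points)
print("cluster ids:", list(cluster_ids))
```

Note that the cluster ids are arbitrary numbers: the unsupervised algorithm finds the two groups but has no notion of which one is "low" or "high".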

 

Machine learning has a vast range of applications across domains such as:

  1. Healthcare (e.g., personalized treatments and medications, drug manufacturing)
  2. Finance (e.g., fraud detection)
  3. Retail (e.g., product recommendations, improved customer service)
  4. Travel (e.g., dynamic pricing, such as how Uber determines the price of your ride, and sentiment analysis, such as TripAdvisor collecting information from the photos and reviews travellers share on social media and trying to improve its service based on them)
  5. Media (e.g., Facebook: from personalizing the news feed to serving targeted ads, machine learning is at the heart of social media platforms, for their own benefit and their users')

Python is a complete language and platform that can be used both for research and for developing production systems. With so many libraries and modules to choose from, getting started can feel overwhelming.

 

So, let’s go through the step-by-step procedure a beginner can follow to start machine learning with Python.

 

Python is a general-purpose, interpreted, interactive, object-oriented, high-level programming language. Guido van Rossum began developing it in the late 1980s and first released it in 1991. The Python source code is available under the Python Software Foundation License, an open-source license that is compatible with the GNU General Public License (GPL).

 

You can use the following resources to build your Python skills:

Google’s Python Class

Google Developer Python Course

 

Python has an amazing ecosystem of libraries that make machine learning easy to get started with. It is one of the most popular and in-demand languages in the job market today, which is why plenty of learning resources are available online. Learners should have little difficulty finding material.

 

https://docs.anaconda.com/anaconda/install/

Follow the installation instructions given on the site. The Anaconda distribution includes the packages required to explore machine learning.

 

If you want an overall idea of machine learning from scratch, you might want to follow this crash course by Google:

Machine Learning Crash Course

Andrew Ng’s Machine Learning course is also a great option for learners.

Machine Learning – Offered by Stanford [by Andrew Ng]

 

Once we are comfortable with Python and machine learning, we can move on to the Python libraries.

 

  1. Pandas: Our first step is to read in the data and produce some quick, relevant summary statistics, for which we shall use the Pandas library. Pandas provides data structures and data analysis tools that make manipulating data in Python much quicker and more effective.
    We’ll read our data from a CSV file into a Pandas dataframe using the read_csv function.
  2. NumPy: A matrix is a two-dimensional data structure with rows and columns. Matrices in Python can be used via the NumPy library. In a matrix, we can’t easily access columns and rows by name, and every element has to have the same datatype.
    For this reason, the most common data structure for analysis is the dataframe, an extension of a matrix that can have a different datatype in each column. It also has a lot of built-in features for analyzing data.
  3. Matplotlib: It is the main plotting infrastructure in Python, and many other plotting libraries, such as seaborn, are built on top of it. We import Matplotlib’s plotting functions with import matplotlib.pyplot as plt. We can then draw and show plots.
  4. Scikit-learn: The library is built upon SciPy (Scientific Python), which must be installed before you can use scikit-learn. This stack includes NumPy, Pandas, and Matplotlib, described above, alongside SciPy itself.

Extensions or modules for SciPy are conventionally named SciKits. As such, the module that provides learning algorithms is named scikit-learn.
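To see how Pandas, NumPy, and Matplotlib fit together, here is a minimal sketch. The CSV content, column names, and values are made up for illustration, and io.StringIO stands in for a real file; with a file on disk you would pass its path to read_csv instead. The plot is saved to an in-memory buffer rather than shown, so the script also runs on a machine without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up CSV content standing in for a real data file.
csv_text = """product,price,units_sold
widget,2.50,120
gadget,10.00,45
gizmo,7.25,80
"""

# Pandas: read the CSV into a dataframe and get quick summary statistics.
df = pd.read_csv(io.StringIO(csv_text))
print(df.describe())

# A dataframe column is accessed by name; columns can have different datatypes.
print(df.dtypes)

# NumPy: the numeric columns as a matrix with a single datatype,
# where columns are accessed by position rather than by name.
matrix = df[["price", "units_sold"]].to_numpy()
revenue = matrix[:, 0] * matrix[:, 1]
print(revenue)

# Matplotlib: draw a simple bar chart of revenue per product.
fig, ax = plt.subplots()
ax.bar(df["product"], revenue)
ax.set_xlabel("product")
ax.set_ylabel("revenue")

# Save the figure as PNG bytes; use a file name here to write it to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```

In an interactive session you would typically leave out the matplotlib.use("Agg") line and call plt.show() to display the chart.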

 

Now that you have a grip on the basics of Python, its libraries, and machine learning algorithms, it’s best to start with a small project. Here are the steps to follow:

  1. Define a Problem
  2. Prepare the Data
  3. Evaluate the Algorithms
  4. Improve the Results
  5. Present the Results
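The five steps above could look roughly like this in scikit-learn. This is only a sketch under assumptions of our own: the iris dataset that ships with scikit-learn stands in for a real problem, and the two candidate algorithms and the tuned parameter are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# 1. Define a problem: predict the iris species from four flower measurements.
X, y = load_iris(return_X_y=True)

# 2. Prepare the data: hold out a test set that is not used for model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 3. Evaluate the algorithms: compare candidates with 5-fold cross-validation.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(),
}
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5).mean()
    for name, model in candidates.items()
}

# 4. Improve the results: tune one hyperparameter of one candidate.
search = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 7, 9]}, cv=5)
search.fit(X_train, y_train)

# 5. Present the results: report cross-validation and held-out accuracy.
test_accuracy = search.score(X_test, y_test)
for name, score in cv_scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
print(f"best n_neighbors = {search.best_params_['n_neighbors']}")
print(f"held-out accuracy = {test_accuracy:.3f}")
```

Keeping the test set out of steps 3 and 4 matters: the final accuracy is only a fair estimate if the model was never tuned on that data.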

 

To start with machine learning using Python, after installing Anaconda as described above, first check the version of Python you are using.
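One way to do this check from inside Python is sketched below. The list of libraries is our own assumption about what a beginner is likely to need; note that scikit-learn is imported under the name sklearn.

```python
import importlib
import sys

# Version of the Python interpreter that is running this script.
print("Python:", sys.version.split()[0])

# Try to import each library and record its version, or note that it is missing.
versions = {}
for module_name in ("numpy", "scipy", "pandas", "matplotlib", "sklearn"):
    try:
        module = importlib.import_module(module_name)
        versions[module_name] = getattr(module, "__version__", "unknown")
    except ImportError:
        versions[module_name] = "not installed"

for module_name, version in versions.items():
    print(f"{module_name}: {version}")
```

If any library is reported as not installed, revisit the Anaconda installation before going further.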

 

 
