Course outline for Data Science and Machine Learning
Pre-requisites for learning Data Science and Machine Learning
- The participants must be comfortable with programming constructs in Python
- Familiarity with basic algebra is required; knowledge of statistics is preferable
Lab Setup
- Hardware Configuration
  - All participants must have a laptop with Internet connectivity
  - At least 4GB of RAM and 20GB of free hard-disk space
- Software Configuration
  - Install Python and R before the start of the training
Duration
- 5 Days
Training Mode
Online training for Data Science and Machine Learning
We provide:
- Instructor-led live training
- Self-paced learning with access to expert coaches
- 24x7 access to cloud labs with end-to-end working examples
All jnaapti sessions are 100% hands-on. All our instructors are engineers at heart. Activities are derived from real-life problems faced by our expert faculty. Self-paced hands-on sessions are delivered via Virtual Coach.
Classroom training for Data Science and Machine Learning
Classroom sessions are conducted at client locations in:
- Bengaluru
- Chennai
- Hyderabad
- Mumbai
- Delhi/Gurgaon/NCR
Note: Classroom training is for corporate clients only
Detailed Course Outline for Data Science and Machine Learning
Some Pre-requisites
- Linear Algebra
- Probability
- Probability Distributions
Overview of Data Science and Machine Learning
- What is Data Science and Data Analysis?
- Cleaning up the data before analysis
- Data Visualization
- What is Machine Learning?
- Supervised vs. unsupervised learning
- Artificial Intelligence
- Deep Learning
Some Basic Machine Learning Concepts
- Features, Labels and Classifiers
- Supervised Learning Algorithms
- Naive Bayes
- Decision Trees
- Support Vector Machines
- Kernel Trick
- Principal Component Analysis
- Unsupervised Learning
- Clustering algorithms
- K-means
- Neural Networks
Applications of Machine Learning
- Spam Detection
- Recommendation System
- Handwriting Recognition
- Face Recognition
Natural Language Processing
- Working with unstructured data
- Working with HTML data, scraping
- Tokens
- Introduction to BNF
- Regular Expression concepts
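
As an illustration of the token and regular expression topics above, here is a minimal sketch of a regex-based tokenizer using only Python's standard `re` module. The pattern and the `tokenize` helper are assumptions chosen for illustration, not the course's own material; real tokenizers (for example in nltk) handle contractions and other cases more carefully.

```python
import re

# Hypothetical token pattern: a run of word characters,
# or a single character that is neither a word character nor whitespace.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("Hello, world!"))
```

Note that this simple pattern splits a contraction such as "isn't" into three tokens, which is one reason dedicated tokenizers exist.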
Technologies used in Machine Learning
- Octave
- MATLAB
- Python
- R
- Julia
Python - Machine Learning Libraries
- numpy
- sklearn
- scipy
- textblob
- nltk
- matplotlib
- pandas
- Jupyter
- TensorFlow
- Keras
Overview of sklearn
- Naive Bayes
- Decision Trees
- Support Vector Machines
- Kernel Trick
- Principal Component Analysis
- Clustering algorithms
- K-means
- Neural Networks
- Persisting the models using pickle
Overview of Pandas
- Data Structures - Series, Data Frames
- Importing, Analysing and Exporting data with Pandas
- Descriptive Statistics
- Aggregation APIs
- Transform APIs
- Iteration on data structures
- Working with text data
Introduction to nltk and textblob
- Tokenizer
- POS tagger
- Text classification
- Vectorizer
- tf-idf
- Sentiment Analysis with textblob - a case study
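
To make the tf-idf topic above concrete, here is a minimal sketch in plain Python. It assumes whitespace tokenization and the unsmoothed weighting tf × log(N / df); the `tf_idf` function name and the sample documents are hypothetical. Library vectorizers such as sklearn's `TfidfVectorizer` use smoothed variants, so their values will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    # Assumption: lowercase whitespace tokenization is good enough for a demo.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample documents.
docs = ["free prize inside", "meeting agenda inside", "free free offer"]
scores = tf_idf(docs)
print(scores[0])
```

A term that appears in every document gets a weight of zero, while a term that is frequent in one document and rare elsewhere gets the highest weight.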
Introduction to R
- Installing R and RStudio
- Atomic Data Types
- Control Structures in R
- Functions in R
- Vectors and Lists
- Matrices
- Data Frames
- Operations on Vectors, Lists, Matrices and Data Frames
- Loading, analysing and storing your data
- Basic statistical functions in R
- Data Transforms in R using dplyr
- Machine Learning in R
- Plotting your analysis
Neural Networks
- Perceptrons
- Sigmoid Neurons
- Gradient Descent
- Backpropagation
- Introduction to Deep Learning
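
The sigmoid neuron and gradient descent topics above can be sketched together in a few lines of plain Python. This is an illustrative example only: the function names, the learning rate, the epoch count and the AND-gate data set are all assumptions, and it trains a single neuron with batch gradient descent on cross-entropy loss rather than a full network with backpropagation.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, learning_rate=0.5, epochs=5000):
    """Fit the weights and bias of one sigmoid neuron with batch gradient descent."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        for inputs, target in samples:
            z = sum(w * x for w, x in zip(weights, inputs)) + bias
            # For cross-entropy loss, the gradient with respect to z is (output - target).
            error = sigmoid(z) - target
            for i, x in enumerate(inputs):
                grad_w[i] += error * x
            grad_b += error
        # Step against the average gradient over the batch.
        weights = [w - learning_rate * g / len(samples)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(samples)
    return weights, bias


def predict(weights, bias, inputs):
    """Classify as 1 when the neuron's output is at least 0.5."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical training data: the logical AND of two binary inputs.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_neuron(AND_GATE)
print([predict(weights, bias, inputs) for inputs, _ in AND_GATE])
```

AND is linearly separable, so a single neuron is enough here; problems such as XOR need at least one hidden layer.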
Jupyter
- Installing Jupyter Notebook
- Running a Notebook server
- Notebook basics
- Using Jupyter with R (R Kernel)
- Securing your Jupyter notebook
Data Visualization
- Need for visualization
- Distribution of one variable - Histogram, Density plot
- Distribution of multiple variables - Heat map, Surface plots
- Distribution summary using Box plot, Violin plot
- Visualization libraries in Python
- matplotlib
- Seaborn
- ggplot and ggplot2
- Altair
- Visualization with R