Predictive Analytics Using R

Paper Code: MBB 323
Credits: 4
Contact Hours: 90
Max. Marks: 100
Course Objectives: This course enables students to:

  • Gain knowledge of descriptive and predictive analysis using R.
  • Apply analysis techniques to different business cases using R libraries.

 

Course Outcomes: 

Course outcome (at course level):

  • CO 101. Install R and run commands and scripts in the RStudio environment for business analytics.
  • CO 102. Apply descriptive and inferential statistics to business problems using R.
  • CO 103. Generate charts and plots for analysis in the R environment and interpret the results.
  • CO 104. Design and analyze regression models for different business problems using R.
  • CO 105. Evaluate the performance of a regression model.

Learning and teaching strategies:

  • Approach in teaching: Interactive lectures, group discussion, tutorials, case study, practical demonstration
  • Learning activities for the students: Self-learning assignments, presentations, R exercises

Assessment strategies:

  • Class test, semester-end examinations, quiz, practical assignments, presentation

Unit I: Introduction to R Programming (18 hours)

R and RStudio, Logical Arguments, Missing Values, Characters, Factors and Numeric, Help in R, Vector to Matrix, Matrix Access, Data Frames, Data Frame Access, Basic Data Manipulation Techniques, Usage of the apply family of functions (apply, lapply, sapply and tapply), Outlier Treatment.
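
As an illustration of the apply family and outlier treatment listed in this unit, a minimal sketch in base R. The `sales` data frame, its column names, and its values are hypothetical, and the 1.5 × IQR rule is only one common convention for flagging outliers.

```r
# Hypothetical sales data, invented for illustration only
sales <- data.frame(
  region = factor(c("North", "South", "North", "South", "East", "East")),
  q1 = c(120, 95, 130, NA, 80, 400),
  q2 = c(110, 100, 125, 90, 85, 95)
)
num_cols <- sales[, c("q1", "q2")]

# sapply: apply a function to each column and simplify the result to a vector
col_means <- sapply(num_cols, mean, na.rm = TRUE)

# lapply: same idea, but the result is always a list
col_ranges <- lapply(num_cols, range, na.rm = TRUE)

# apply: operate over the rows (MARGIN = 1) of a numeric matrix
row_totals <- apply(as.matrix(num_cols), 1, sum, na.rm = TRUE)

# tapply: summarise one variable by the levels of a factor
q2_by_region <- tapply(sales$q2, sales$region, mean)

# Flag values more than 1.5 * IQR outside the quartiles (missing values are not flagged)
q <- quantile(sales$q1, c(0.25, 0.75), na.rm = TRUE)
fence <- 1.5 * (q[2] - q[1])
is_outlier <- !is.na(sales$q1) &
  (sales$q1 < q[1] - fence | sales$q1 > q[2] + fence)
```

Whether a flagged value is removed, capped, or kept depends on the business context; the sketch only identifies candidates.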

Unit II: Descriptive Statistics (18 hours)

Measures of Central Tendency (Mean, Median and Mode), Charts (Bar Chart, Pie Chart, Box Plot, Histogram, Stem-and-Leaf Diagram), Measures of Dispersion (Range, Interquartile Range, Standard Deviation, Skewness and Kurtosis), Standard Error of the Mean and Confidence Intervals.

Discrete Probability Distributions (Binomial, Poisson), Continuous Probability Distributions (Normal Distribution and t-Distribution), Sampling Distribution and Central Limit Theorem
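
As an illustration of the descriptive measures and confidence intervals listed in this unit, a minimal sketch in base R. The sample `x` is hypothetical. Base R has no built-in function for the statistical mode, so the sketch tabulates the values and takes the most frequent one (an assumption that works when there is a single mode).

```r
# Hypothetical sample, invented for illustration only
x <- c(12, 15, 15, 18, 20, 22, 25, 30)
n <- length(x)

# Central tendency
x_mean <- mean(x)
x_median <- median(x)
x_mode <- as.numeric(names(which.max(table(x))))  # most frequent value

# Dispersion
x_range <- diff(range(x))
x_iqr <- IQR(x)
x_sd <- sd(x)

# Standard error of the mean and a 95% confidence interval based on the t-distribution
se <- x_sd / sqrt(n)
ci <- x_mean + c(-1, 1) * qt(0.975, df = n - 1) * se

# Charts covered in the unit (uncomment in an interactive session)
# hist(x); boxplot(x); stem(x)
```

The interval computed by hand here should agree with the one reported by `t.test(x)$conf.int`.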

Unit III: Statistical Inference and Hypothesis Testing (18 hours)

Parametric and non-parametric tests (one sample, independent samples, paired samples, and two or more samples).
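
As an illustration of the test families listed in this unit, a minimal sketch using the `sleep` and `PlantGrowth` datasets that ship with R. The choice of datasets and of the hypothesised mean `mu = 0` is an assumption made for the example; the non-parametric calls may emit warnings about ties, which do not stop execution.

```r
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]

# One sample: is the mean of g1 different from 0?
one_sample <- t.test(g1, mu = 0)

# Two independent samples (Welch t-test) and its rank-based counterpart
welch <- t.test(g1, g2)
mann_whitney <- suppressWarnings(wilcox.test(g1, g2))

# Paired samples: the same ten subjects were measured under both treatments
paired <- t.test(g1, g2, paired = TRUE)
signed_rank <- suppressWarnings(wilcox.test(g1, g2, paired = TRUE))

# More than two samples: one-way ANOVA and the Kruskal-Wallis test
anova_fit <- summary(aov(weight ~ group, data = PlantGrowth))
kruskal <- kruskal.test(weight ~ group, data = PlantGrowth)

# Each test object stores its p-value, for example:
p_values <- c(welch = welch$p.value, paired = paired$p.value)
```

Comparing the independent-samples and paired results on the same data shows how much the choice of test design can change the conclusion.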

Unit IV: Correlation and Regression (18 hours)

Analysis of Relationship, Positive and Negative Correlation, Perfect Correlation, Correlation Matrix, Scatter Plots, Simple Linear Regression, R Square, Adjusted R Square, Testing of Slope, Standard Error of Estimate, Overall Model Fit, Assumptions of Linear Regression, Multiple Regression, Coefficients of Partial Determination, Durbin-Watson Statistic, Variance Inflation Factor.
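
As an illustration of the regression diagnostics listed in this unit, a minimal sketch in base R using the built-in `mtcars` dataset. The choice of `mpg` as the response and `wt` and `hp` as predictors is an assumption made for the example. The Durbin-Watson statistic and the variance inflation factor are computed by hand from their definitions rather than with an add-on package.

```r
# Correlation matrix of the variables used in the model
cor_matrix <- cor(mtcars[, c("mpg", "wt", "hp")])

# Multiple linear regression
fit <- lm(mpg ~ wt + hp, data = mtcars)
s <- summary(fit)

r2 <- s$r.squared          # R Square
adj_r2 <- s$adj.r.squared  # Adjusted R Square
see <- s$sigma             # standard error of estimate
slope_tests <- s$coefficients  # estimates, standard errors, t values, p-values
f_stat <- s$fstatistic     # overall model fit (F test)

# Durbin-Watson statistic: sum of squared successive residual differences
# divided by the residual sum of squares (values near 2 suggest no autocorrelation)
e <- residuals(fit)
dw <- sum(diff(e)^2) / sum(e^2)

# Variance inflation factor for wt: 1 / (1 - R^2) from regressing wt on the other predictors
vif_wt <- 1 / (1 - summary(lm(wt ~ hp, data = mtcars))$r.squared)

# Scatter plot with fitted simple regression line (uncomment in an interactive session)
# plot(mpg ~ wt, data = mtcars); abline(lm(mpg ~ wt, data = mtcars))
```

Note that `mtcars` is not a time series, so the Durbin-Watson value here only demonstrates the calculation.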

Unit V: Logistic Regression (18 hours)

Binary Classification versus Point Estimation, Odds versus Probability, Logit Function, Classification Matrix, Individual Group Classification Efficiency, Overall Classification Efficiency, Nagelkerke R Square, Receiver Operating Characteristic Curve, Sensitivity, Specificity, Area Under ROC Curve, Cut-Offs, True Positive Rate and False Positive Rate.
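
As an illustration of the classification measures listed in this unit, a minimal sketch in base R using the built-in `mtcars` dataset. Modelling `am` (transmission type) from `wt` and using a cut-off of 0.5 are assumptions made for the example. The area under the ROC curve is computed with the rank-based (Mann-Whitney) formula, and Nagelkerke R Square from the model and null deviances, instead of with an add-on package.

```r
# Binary logistic regression: logit(P(am = 1)) as a linear function of weight
fit <- glm(am ~ wt, data = mtcars, family = binomial)
p_hat <- fitted(fit)                 # predicted probabilities
odds_ratio <- exp(coef(fit)["wt"])   # change in odds per unit increase in wt

# Classification matrix at a chosen cut-off
cutoff <- 0.5
actual <- factor(mtcars$am, levels = c(0, 1))
predicted <- factor(as.integer(p_hat >= cutoff), levels = c(0, 1))
conf <- table(actual, predicted)

sensitivity <- conf["1", "1"] / sum(conf["1", ])  # true positive rate
specificity <- conf["0", "0"] / sum(conf["0", ])  # 1 - false positive rate
accuracy <- sum(diag(conf)) / sum(conf)           # overall classification efficiency

# Area under the ROC curve via the rank-sum formula
n1 <- sum(mtcars$am == 1)
n0 <- sum(mtcars$am == 0)
auc <- (sum(rank(p_hat)[mtcars$am == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Nagelkerke R Square: Cox-Snell R Square rescaled by its maximum attainable value
n <- nrow(mtcars)
cox_snell <- 1 - exp((fit$deviance - fit$null.deviance) / n)
nagelkerke <- cox_snell / (1 - exp(-fit$null.deviance / n))
```

Repeating the classification step with different values of `cutoff` shows the trade-off between sensitivity and specificity that the ROC curve summarises.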

Essential Readings: 
  • Maindonald, John and Braun, John, "Data Analysis and Graphics Using R", Cambridge University Press, 2007
  • Gardener, Mark, "Beginning R: The Statistical Programming Language", Wiley India Pvt. Ltd., 2015
  • Srinivasa, K. G., Siddesh, G. M. and Shetty, "Statistical Programming in R", Oxford University Press, 2017
  • Bajpai, Naval, "Business Statistics", Pearson
  • Menard, S., "Applied Logistic Regression Analysis", Sage, Thousand Oaks, CA, 2002
