The R programming language is one of the most powerful tools for statistical computing, visualization and data science. Data scientists and analysts use R to solve many complex problems in their industries. R is widely used at companies such as Bing, Google, Facebook, Twitter and Uber. Because R is used across sectors such as social media companies, banks, insurance companies and car manufacturers, it is one of the most sought-after data analysis skills. R is a powerful statistical programming language used for predictive modeling and other data mining tasks. It can be used for data manipulation, data aggregation, statistical modeling, and creating charts and plots. R programming is becoming an essential skill in the field of analytics, helped by its open-source credibility.

**COURSE CONTENT:**

**Introduction to R**

The R language for statistical programming, the various features of R, an introduction to RStudio, the statistical packages, the different data types and functions and how to apply them in various scenarios, using SQL to apply the ‘join’ function, the components of RStudio such as the code editor, visualization and debugging tools, and the rbind function.

**R-Packages**

R packages, which bundle R functions, compiled code and data in a well-defined format; R package structure, package metadata and testing, CRAN (Comprehensive R Archive Network), vector creation and assigning values to variables.

**Sorting Dataframe**

R functionality, the rep function for generating repeats, sorting, generating factor levels, and the transpose and stack functions.
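
The functions named in this module can be sketched in a few lines of base R. The data frame, column names and values below are made up for illustration and are not course material.

```r
# Build a small data frame; rep() repeats values to fill a column.
df <- data.frame(
  group = rep(c("a", "b"), times = 3),  # "a" "b" "a" "b" "a" "b"
  score = c(5, 2, 9, 1, 7, 3)
)

# Sort the rows by score in ascending order.
sorted_df <- df[order(df$score), ]

# Convert the group column to a factor and inspect its levels.
group_factor <- factor(df$group)
levels(group_factor)

# Transpose a matrix, and stack the columns of a data frame into long form.
m <- matrix(1:6, nrow = 2)
t_m <- t(m)                            # a 3 x 2 matrix
wide <- data.frame(x = 1:2, y = 3:4)
long <- stack(wide)                    # columns named "values" and "ind"
```

Here `order()` returns the row positions that put `score` in ascending order, which is the usual way to sort a data frame by one of its columns.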

**Matrices and Vectors**

Introduction to matrices and vectors in R, understanding functions such as merge and strsplit, matrix manipulation, rowSums, rowMeans, colMeans, colSums, sequencing, repetition, indexing and other functions.
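
As a rough illustration of the functions listed above, the following base R sketch uses a small made-up matrix and two hypothetical data frames (`left` and `right`).

```r
# A 2 x 3 matrix, filled column by column.
m <- matrix(1:6, nrow = 2, ncol = 3)

row_totals <- rowSums(m)    # sum of each row
col_totals <- colSums(m)    # sum of each column
row_avgs   <- rowMeans(m)   # mean of each row
col_avgs   <- colMeans(m)   # mean of each column

# Sequencing and indexing.
s <- seq(2, 10, by = 2)     # 2 4 6 8 10
middle <- s[2:3]            # second and third elements

# Split a string on commas; strsplit() returns a list, so take element 1.
parts <- strsplit("a,b,c", ",")[[1]]

# merge() joins two data frames on a shared key, like an SQL inner join.
left   <- data.frame(id = 1:3, x = c("p", "q", "r"))
right  <- data.frame(id = 2:4, y = c(10, 20, 30))
joined <- merge(left, right, by = "id")   # keeps ids present in both
```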

**Reading data from external files**

Understanding subscripts in plots in R, how to obtain parts of vectors, using subscripts with arrays, as logical variables, with lists, understanding how to read data from external files.
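
A minimal sketch of subscripting and file reading in base R follows. The CSV is written to a temporary file first so the example is self-contained; the file contents and object names are illustrative assumptions.

```r
# Subscripts on a named vector.
v <- c(a = 10, b = 20, c = 30, d = 40)
second   <- v[2]          # by position
by_name  <- v["c"]        # by name
over_15  <- v[v > 15]     # by logical condition

# Subscripts on a 2 x 3 x 4 array: [row, column, layer].
arr <- array(1:24, dim = c(2, 3, 4))
cell <- arr[1, 2, 3]

# Subscripts on a list: [[ ]] or $ extracts a single element.
lst <- list(name = "demo", values = 1:3)
vals <- lst[["values"]]

# Write a small CSV to a temporary path, then read it back.
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, value = c(10.5, 20.1, 30.7)),
          path, row.names = FALSE)
dat <- read.csv(path)
```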

**Generating plots**

Generating plots in R: graphs, bar plots, line plots, histograms, and the components of a pie chart.
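
The plot types in this module all have base R functions. The sketch below uses made-up counts and the built-in `mtcars` dataset; the titles and labels are illustrative.

```r
# When run as a script, send plots to a null device instead of Rplots.pdf.
if (!interactive()) pdf(NULL)

counts <- c(apples = 5, pears = 3, plums = 8)

# Bar plot of the named counts.
barplot(counts, main = "Fruit counts", ylab = "Count")

# Line plot: type = "l" draws lines instead of points.
x <- 1:10
plot(x, x^2, type = "l", main = "Line plot", xlab = "x", ylab = "x squared")

# Histogram of miles per gallon; hist() also returns the bin counts.
h <- hist(mtcars$mpg, main = "MPG distribution", xlab = "Miles per gallon")

# Pie chart: slices come from the values, labels from the names.
pie(counts, main = "Fruit share")

if (!interactive()) invisible(dev.off())
```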

**Analysis of Variance (ANOVA)**

Understanding the **Analysis of Variance** (ANOVA) statistical technique, working with Pie Charts, Histograms, deploying ANOVA with R, one-way ANOVA, two-way ANOVA.
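
One-way and two-way ANOVA can both be run with `aov()` from base R. This sketch uses the built-in `PlantGrowth` and `ToothGrowth` datasets as stand-ins; the course may use different data.

```r
# One-way ANOVA: does mean plant weight differ across the three groups?
fit_one <- aov(weight ~ group, data = PlantGrowth)
summary(fit_one)

# Two-way ANOVA: tooth length by supplement type and dose, with interaction.
tg <- ToothGrowth
tg$dose <- factor(tg$dose)              # treat dose as a categorical factor
fit_two <- aov(len ~ supp * dose, data = tg)
summary(fit_two)
```

The `supp * dose` formula expands to both main effects plus their interaction, so the two-way table has three effect rows and one residual row.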

**K-means Clustering**

K-Means Clustering for Cluster & Affinity Analysis, the clustering algorithm, cohesive subsets of items, solving clustering issues, working with large datasets, association rule mining and affinity analysis for data mining, and learning co-occurrence relationships.
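
A minimal k-means sketch with base R's `kmeans()` is shown below. It uses the built-in `iris` measurements and three clusters, both of which are illustrative choices rather than part of the course outline.

```r
set.seed(42)                          # k-means starts from random centers

# Standardize the four numeric columns so no single feature dominates.
features <- scale(iris[, 1:4])

# Fit 3 clusters; nstart = 20 tries 20 random starts and keeps the best.
km <- kmeans(features, centers = 3, nstart = 20)

# Compare the discovered clusters with the known species labels.
table(cluster = km$cluster, species = iris$Species)

# Total within-cluster sum of squares: lower means tighter clusters.
km$tot.withinss
```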

**Association Rule Mining**

Introduction to Association Rule Mining, the various concepts of Association Rule Mining, various methods to predict relations between variables in large datasets, the algorithm and rules of Association Rule Mining, understanding single cardinality.

**Regression in R**

Understanding Simple Linear Regression, the equation of a line, the slope and y-intercept of the regression line, deploying analysis using regression, the least squares criterion, interpreting the results, the standard error of the estimate and measures of variation.
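
The least squares slope and intercept can be computed by hand and checked against `lm()`. This sketch uses the built-in `cars` dataset (stopping distance versus speed) as an assumed example.

```r
# Fit dist = intercept + slope * speed by least squares.
fit <- lm(dist ~ speed, data = cars)
coef(fit)                             # intercept and slope

# The same estimates from the closed-form least squares formulas.
slope     <- cov(cars$speed, cars$dist) / var(cars$speed)
intercept <- mean(cars$dist) - slope * mean(cars$speed)

# Standard error of the estimate (residual standard error).
std_error <- summary(fit)$sigma
```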

**Analyzing Relationship with Regression**

Scatter plots, two-variable relationships, Simple Linear Regression analysis, line of best fit.

**Advanced Regression**

Deep understanding of the measures of variation, the concept of the coefficient of determination, the F-test, the test statistic with an F-distribution, advanced regression in R, prediction with linear regression.
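
The coefficient of determination and the F statistic both come from splitting the total variation into explained and unexplained parts. The sketch below works this out on the built-in `cars` dataset, an assumed example, and compares the result with what `summary()` reports.

```r
fit <- lm(dist ~ speed, data = cars)
n   <- nrow(cars)

# Measures of variation.
sst <- sum((cars$dist - mean(cars$dist))^2)   # total sum of squares
sse <- sum(residuals(fit)^2)                  # unexplained (error)
ssr <- sst - sse                              # explained by the regression

# Coefficient of determination: share of variation explained.
r_squared <- ssr / sst

# F statistic with 1 and n - 2 degrees of freedom for one predictor.
f_stat  <- (ssr / 1) / (sse / (n - 2))
p_value <- pf(f_stat, df1 = 1, df2 = n - 2, lower.tail = FALSE)

# Predict stopping distance at new speeds, with prediction intervals.
new_speeds <- data.frame(speed = c(10, 20))
preds <- predict(fit, newdata = new_speeds, interval = "prediction")
```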

**Logistic Regression**

Logistic Regression Mean, Logistic Regression in R.

**Advanced Logistic Regression**

Advanced logistic regression, understanding how to do prediction using logistic regression, ensuring the model is accurate, understanding sensitivity and specificity, confusion matrix, what is ROC, a graphical plot illustrating binary classifier system, ROC curve in R for determining sensitivity/specificity trade-offs for a binary classifier.
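
Prediction, the confusion matrix, sensitivity and specificity can be sketched with base R's `glm()`. The model below (transmission type from car weight in the built-in `mtcars` data) and the 0.5 cutoff are illustrative assumptions.

```r
# Logistic regression: probability of a manual transmission (am = 1).
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Predicted probabilities, turned into class labels at a 0.5 cutoff.
prob <- predict(fit, type = "response")
pred <- ifelse(prob > 0.5, 1, 0)

# Confusion matrix: rows are actual classes, columns are predictions.
conf <- table(
  actual    = factor(mtcars$am, levels = c(0, 1)),
  predicted = factor(pred,      levels = c(0, 1))
)

# Sensitivity: share of actual 1s predicted as 1.
sensitivity <- conf["1", "1"] / sum(conf["1", ])
# Specificity: share of actual 0s predicted as 0.
specificity <- conf["0", "0"] / sum(conf["0", ])
accuracy    <- sum(diag(conf)) / sum(conf)
```

Raising the cutoff trades sensitivity for specificity, which is the trade-off an ROC curve traces across all cutoffs.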

**Receiver Operating Characteristic (ROC)**

Detailed understanding of ROC, the area under the ROC curve, converting the variable, data set partitioning, understanding how to check for multicollinearity (two or more variables being highly correlated), building the model, advanced data set partitioning, interpreting the output, predicting the output, the detailed confusion matrix, deploying the Hosmer-Lemeshow test for checking whether the observed event rates match the expected event rates.
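
An ROC curve and its area can be computed without extra packages. This sketch assumes the same illustrative `mtcars` model as a stand-in and computes the area two ways: by the trapezoid rule over the curve and by the rank-based formula.

```r
fit  <- glm(am ~ wt, data = mtcars, family = binomial)
prob <- predict(fit, type = "response")
y    <- mtcars$am

# Sweep the cutoff from 1 down to 0 and record the two rates at each step.
cutoffs <- sort(unique(c(0, prob, 1)), decreasing = TRUE)
tpr <- sapply(cutoffs, function(th) mean(prob[y == 1] >= th))  # sensitivity
fpr <- sapply(cutoffs, function(th) mean(prob[y == 0] >= th))  # 1 - specificity

# Area under the curve by the trapezoid rule.
auc_trapezoid <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)

# The same area from ranks: the chance that a random positive case
# scores higher than a random negative case.
n1 <- sum(y == 1)
n0 <- sum(y == 0)
auc_rank <- (sum(rank(prob)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
```

Plotting `fpr` against `tpr` with `plot(fpr, tpr, type = "l")` draws the curve itself.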

**Kolmogorov Smirnov Chart**

Data analysis with R, understanding the Wald test, McFadden’s pseudo R-squared, the significance of the area under the ROC curve, and the Kolmogorov-Smirnov chart, which is based on a non-parametric test of one-dimensional probability distributions.
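
The three diagnostics named above can be computed directly from a fitted `glm`. The sketch assumes the illustrative `mtcars` model again; the KS statistic here is the largest gap between the score distributions of the two classes.

```r
fit      <- glm(am ~ wt, data = mtcars, family = binomial)
null_fit <- glm(am ~ 1,  data = mtcars, family = binomial)  # intercept only

# Wald z statistic for each coefficient: estimate / standard error.
coefs  <- summary(fit)$coefficients
wald_z <- coefs[, "Estimate"] / coefs[, "Std. Error"]

# McFadden's pseudo R-squared: 1 - logLik(model) / logLik(null model).
mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null_fit))

# Kolmogorov-Smirnov statistic: maximum distance between the empirical
# distribution functions of predicted scores for the two classes.
prob  <- fitted(fit)
y     <- mtcars$am
grid  <- sort(unique(prob))
ks_stat <- max(abs(ecdf(prob[y == 1])(grid) - ecdf(prob[y == 0])(grid)))
```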

**Database connectivity with R**

Connecting to various databases from the R environment, deploying the ODBC tables for reading the data, visualization of the performance of the algorithm using Confusion Matrix.

**Integrating R with Hadoop**

Creating an integrated environment for deploying R on Hadoop platform, working with R Hadoop, RMR package and R Hadoop Integrated Programming Environment, R programming for MapReduce jobs and Hadoop execution.

**GPS INFOTECH (Software Solutions)**

**Url: https://www.gpsinfotech.com**

**Contact person: prakash**

**Num: 919395190232 / 9989787231 (WhatsApp)**

**Main mail id : gpsinfotech.net@gmail.com , prakash_m@gpsinfotech.com**