Introduction to Statistics & Machine Learning with R



  • Introduction to the R language, community and packages
  • Data analysis fundamentals: Mathematical Expectation, Standard Deviation, Estimation, Hypothesis Testing, Linear Regression, Central Limit Theorem
  • Probability fundamentals: Frequency and Histogram, Binomial Distribution, Normal Distribution, Poisson Distribution, Exponential Distribution, Gamma Distribution
  • Using the "rattle" package for data mining


  • Marketing concept example: churn analysis on AT&T
  • Binary classification for customer retention 
  • R packages for supervised classification: rpart, e1071, randomForest, nnet, xgboost
  • Evaluating prediction results: confusion matrix, training-testing regime, cross-validation and ROC curve


Target Participants:

IT professionals, data analysts and passionate data science learners who have a solid programming background and would like a fresh look at emerging programming languages in the big data domain. Students with a science background or strong mathematics skills are also encouraged to take this introductory course.



  • The course will be conducted in Cantonese with course materials in English
  • A certificate will be presented to students with a 100% attendance rate
  • Students are required to bring their own notebooks for the classwork sessions