
Data Mining and Predictive Analytics Learning Plan

In the last post, I mentioned that I would be putting together a learning plan to facilitate a shift in focus from SSAS into the realm of data mining and predictive analytics. Clearly this is not going to be a quick and easy journey – but I think it is going to be a rewarding one and an important move to make. So, without further ado, below is my tentative short-term learning plan for building the necessary foundation:

  • Join local R user group
  • Online Classes
    • Udacity
      (~200 hrs or 20 weeks @ 10 hrs/wk)

      • Intro to Statistics (~50 hrs)
      • Intro to Data Science (~50 hrs)
      • Machine Learning 1 – Supervised Learning (~50 hrs)
      • Machine Learning 2 – Unsupervised Learning (~25 hrs)
      • Machine Learning 3 – Reinforcement Learning (~25 hrs)
    • Coursera – Data Science Specialization
      (~110 hrs or 11 weeks @ 10 hrs/wk)

      • The Data Scientist’s Toolbox (~12 hrs)
      • R Programming (~12 hrs)
      • Getting and Cleaning Data (~12 hrs)
      • Exploratory Data Analysis (~12 hrs)
      • Reproducible Research (~12 hrs)
      • Statistical Inference (~12 hrs)
      • Regression Models (~12 hrs)
      • Practical Machine Learning (~12 hrs)
      • Developing Data Products (~12 hrs)
  • Books

I’m sure there will be some minor adjustments here and there but I think it’s a pretty solid start. If you have any suggestions or adjustments, please reach out 🙂

Note: to be clear, I’m certainly not abandoning my deep dive into SSAS and the semantic layer. I love SSAS and think it is a beautiful product that is here to stay for quite a while. However, one cannot ignore Microsoft’s “cloud first” development strategy, which so far has not included SSAS. So I see this as a nice opportunity to branch out into another area until progress resumes.

The other clarification I’d like to make is that I’m not expecting to come out of this endeavor as a professional statistician. The goal is simply to become a valuable resource for projects focused on data mining and predictive analytics: to be able to communicate effectively with statisticians and to implement solutions with input from those statisticians and subject matter experts.

Update: Andy brings up a great point in the comments about calculating required hours and duration…so I’ve added hours to the courses based on the estimates from the overview page for each class. I’m also expecting a fair number of sidetracks when I feel a deeper dive is necessary – but this will likely come further along once I start trying to build things 🙂
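
For anyone who wants to check the arithmetic behind the hours and weeks, here is a minimal Python sketch. The per-course hours are copied from the list in the plan; the variable names and the `weeks_needed` helper are just mine for illustration, and rounding up to whole weeks is an assumption on my part.

```python
import math

# Per-course hour estimates, copied from the plan above.
udacity = {
    "Intro to Statistics": 50,
    "Intro to Data Science": 50,
    "Machine Learning 1 - Supervised Learning": 50,
    "Machine Learning 2 - Unsupervised Learning": 25,
    "Machine Learning 3 - Reinforcement Learning": 25,
}
coursera = {
    "The Data Scientist's Toolbox": 12,
    "R Programming": 12,
    "Getting and Cleaning Data": 12,
    "Exploratory Data Analysis": 12,
    "Reproducible Research": 12,
    "Statistical Inference": 12,
    "Regression Models": 12,
    "Practical Machine Learning": 12,
    "Developing Data Products": 12,
}

HOURS_PER_WEEK = 10  # study pace used in the plan


def weeks_needed(total_hours, hours_per_week=HOURS_PER_WEEK):
    """Whole weeks needed at the given pace (rounding up is an assumption)."""
    return math.ceil(total_hours / hours_per_week)


udacity_hours = sum(udacity.values())
coursera_hours = sum(coursera.values())
total_hours = udacity_hours + coursera_hours

for name, hours in [("Udacity", udacity_hours),
                    ("Coursera", coursera_hours),
                    ("Total", total_hours)]:
    print(f"{name}: {hours} hrs, about {weeks_needed(hours)} weeks")
```

The Coursera courses sum to 108 hrs, which I rounded to ~110 in the plan. None of this accounts for the sidetracks mentioned above, so treat the totals as a lower bound.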