Step 1: Graduate from a top tier university in a quantitative discipline
Education makes a huge difference to your prospects of starting out in this industry. Most companies that hire freshers pick people directly from the best colleges. So, by getting into a top tier university, you give yourself a very strong chance of entering the data science world.
Ideally, I would take up Computer Science as the subject of
study. If I didn’t get a seat in the Computer Science batch, I would take up a subject
with close ties to the computational field, e.g. computational
neuroscience, computational fluid dynamics etc.
Step 2: Take up a lot of MOOCs on the subject – but do them one at a time
This is probably the biggest change I would make to the
journey if I were graduating now. If you spend even a year studying the
subject through these open courses, you will be in far better shape
than other people vying to enter the industry. It took me 5+ years of experience
to appreciate the power that R or Python brings to the table. You can get there today through
the courses running on various platforms.
One word of caution here: be selective about the courses you
choose. I would focus on learning one stack, either R or Python. I would recommend
Python over R today, but that is a personal choice. You can find my detailed
views about how the eco-systems compare here.
You can choose your path – but this is probably what I would do:
- Python:
  - Introduction to Computer Science and Programming using Python – eDX.org
  - Intro to Data Science – Udacity
  - Workshop videos from Pycon and SciPy – some of them are mentioned here
  - Selectively pick from the vast tutorials available on the net in form of iPython notebooks
- R:
  - The Analytics Edge – eDX.org
  - Pick out a few courses from Data Science specialization to complement Analytics Edge
- Other courses (applicable for both the stacks):
  - Machine Learning from Andrew Ng – Coursera
  - Statistics course on Udacity
  - Introduction to Hadoop and MapReduce on Udacity
Step 3: Take a couple of internships / freelancing jobs
This is to get some real world experience before you actually
venture out. This should also provide you an understanding of the work which happens
in the real world. You would get a lot of exposure to real world challenges on
data collection and cleaning here. 
Step 4: Participate in data science competitions
You should aim to get at least a top 10% finish on Kaggle before
you leave university. This should bring you to the attention of recruiters
quickly and give you a strong launchpad. Beware, this sounds a lot easier
than it actually is. It can take multiple competitions for even the
smartest people to make it to the top 10% on Kaggle.
Here is an additional tip to amplify the results of your
efforts: share your work on GitHub. You never know which employer might find
you through your work!
Step 5: Take up the right job which provides awesome experience
I would take up a job in a start-up that is doing awesome work
in analytics / machine learning. The amount of learning you can gain for the
slight risk can be amazing. There are start-ups working on deep learning and
reinforcement learning; choose the one that fits you right (taking culture
into account).
If you are not the start-up kind, join an analytics consultancy
that works on tools and problems across the spectrum. Ask for projects in
different domains, work on different algorithms, try out new approaches. If you
can’t find a role in a consultancy, take up a role in a captive unit, but seek
a role change every 12 – 18 months. Again, this is a general guideline; adapt
it depending on how much you are learning in the role.
Finally a few bonus tips:
- Try learning new tools once you are comfortable with the ones you are already using. Different tools are good for different types of problem solving. For example, learning Vowpal Wabbit can add a significant advantage to your Python coding.
- You can try your hand at creating a few web apps. This adds significant knowledge about data flow on the web, and I personally enjoy satisfying the hacker in me at times!
A few modifications to these tips, in case you are already out of
college or hold work experience:
- In case you can still go back to college, consider getting a Masters or a Ph.D. Nothing improves your probability of getting the right job more than completing a good programme at a top-notch university.
- In case full time education is not possible, take up a part time programme from a good institute / university. But be prepared to put in extra effort outside these certifications / programmes.
- If you are already in a job and your company has an advanced analytics setup, try to get an internal shift by demonstrating your learning.
- I have kept the focus on R or Python because they are open source. If you have the resources to get access to SAS, you can also get a SAS certification for predictive modeler. Remember, SAS still holds the majority of jobs in analytics!
 