
Learning Python


[Cross-posted from ICanHazDataScience]

Bad news. You’re probably going to have to learn to code. Whilst you can go a very long way with the tools available online, at some point you’re going to have that “if I could just reformat that column and extract this information out of it” moment. That generally means either coding or finding a coder happy to help you with the task (hackathons like RHOK are good places to look, and are always looking for good problem statements; there are also many coding-for-good groups around that might help).

Not so bad news if you’re up for writing your own code. There is *lots* of help available online. The language you choose is up to you: many social-good systems are written in PHP, for example; many open data systems are in Python (which has a lot of good data-wrangling libraries and is the default language of many data science courses); and R (free) and Matlab (not so free) are good for handling large arrays of data too.

I personally write most of my code in Python. This might not be your choice once you look at the other languages available, but it works for me, so that’s what I’m going to write about (interspersed with a little PHP and R where it’s appropriate).
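To make that “reformat that column” moment concrete, here is a minimal sketch of what such a task can look like in Python. Everything in it is made up for illustration: the sample data, the column names, and the helper functions `split_location` and `reformat_rows` are assumptions, not part of any real dataset or library. It uses only Python’s standard library.

```python
import csv
import io

# Hypothetical sample data: a "location" column that mixes a town name
# and a two-letter country code in brackets.
RAW = """name,location
Clinic A,"Nairobi (KE)"
Clinic B,"Port-au-Prince (HT)"
"""


def split_location(value):
    """Split 'Town (CC)' into ('Town', 'CC'); return (value, '') if there is no code."""
    town, sep, rest = value.partition(" (")
    if not sep:
        return value.strip(), ""
    return town.strip(), rest.rstrip(")").strip()


def reformat_rows(text):
    """Read CSV text and replace the 'location' column with 'town' and 'country'."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        town, country = split_location(row["location"])
        rows.append({"name": row["name"], "town": town, "country": country})
    return rows


if __name__ == "__main__":
    for row in reformat_rows(RAW):
        print(row)
```

A real file would be opened with `open(...)` instead of being read from a string, and real columns are usually messier than this, but the shape of the job is the same: read each row, pull apart the column you care about, and write out something tidier.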

So how do you start? When someone sends you a file with a name like “thingy.py”, how do you run it?

You have options here, depending on what you want to do (run a file or code your own?), how much time you want to put in (two hours, a week, two years), and what your learning style is (reading text, watching video, doing tests, having a tutor). Most of these options are currently available free. Here are some of them: