The New York Times had a piece over the weekend discussing how computer science curricula are limited in their capacity to teach distributed computation and data mining:
For the most part, university students have used rather modest computing systems to support their studies. They are learning to collect and manipulate information on personal computers or what are known as clusters, where computer servers are cabled together to form a larger computer. But even these machines fail to churn through enough data to really challenge and train a young mind meant to ponder the mega-scale problems of tomorrow.
Besides being an advertisement for Facebook and Google internships, it does raise the question of how schools can adopt these technologies quickly enough to teach them. There have been lots of industrial partnerships and government grants for research clusters, but these are far from a standard undergraduate class on the topic. I would love to see Cloudera or a similar company partner with a hardware provider to make clusters affordable and easy to configure, while data scientists could make sure the clusters come pre-installed with some interesting data (Wikipedia, Twitter, etc.). With a consistent installation across institutions, professors could write and teach data science courses without the immense overhead of setting up a cluster and getting it running.
5 thoughts on “Teaching data science”
sounds like this is already possible with cloudera + amazon’s ec2, thanks to amazon’s educational grants.
perhaps a group effort to start a shared elastic block store that has interesting, free data that can be easily mounted to an ec2 instance?
i would sign up for that class in a second!