Resource List for Learning Data Science
Data scientist has been called the ‘sexiest job of the 21st century’ and the ‘hottest job of the decade’, and data science is often described as one of the fastest-growing fields in tech. Its impact on today’s world is hard to overstate.
As a discipline, data science involves the collection and study of data to gain insights that organizations can use to devise effective strategies. Due to rapid technological advances, especially in areas like mobile advertising, social media, and website personalization, a massive amount of data is generated every day. These data volumes have forced industries to become data-savvy and adapt to the new landscape.
When I started learning about data science, I was overwhelmed by the ocean of resources available online. Thankfully, I had people such as my professors to guide me in the right direction. Below is a list of the resources that I found most useful. Hopefully, they will kickstart your fascination with data science, as they did for me.
Books/Tutorials/Videos for Learning Data Science:
- Data Science: An Introduction – Wikibook – Beginner
- Disruptive Possibilities: How Big Data Changes Everything – Jeffrey Needham – Beginner
- Real-Time Big Data Analytics: Emerging Architecture – Mike Barlow – Beginner
- The Evolution of Data Products – Mike Loukides – Beginner
- The Promise and Peril of Big Data – David Bollier – Beginner
- Data-Intensive Text Processing with MapReduce – Jimmy Lin and Chris Dyer – Intermediate
- Fundamental Numerical Methods and Data Analysis – George W. Collins – Beginner
- Introduction to Metadata – Murtha Baca – Beginner
- Introduction to R – Notes on R: A Programming Environment for Data Analysis and Graphics – W. N. Venables, D. M. Smith, and the R Core Team – Beginner
- Modeling with Data: Tools and Techniques for Scientific Computing – Ben Klemens – Beginner
- R for Data Science: Import, Tidy, Transform, Visualize, and Model Data – Hadley Wickham & Garrett Grolemund – Beginner
- Advanced R – Hadley Wickham – Intermediate
Tutorials/Videos for Learning Machine Learning:
- Machine Learning-101
- What Is The Difference Between Artificial Intelligence And Machine Learning?
- Is machine learning hard? Not always
- How to Learn Machine Learning – The Self-Starter Way
- Stanford Statistical Learning
- Coursera Stanford by Andrew Ng
- Stanford Lectures on Machine Learning
- Machine Learning Foundations
- Machine Learning Techniques
- CMU 701 by Tom Mitchell
- Introduction to Statistical Learning
- Computer Age Statistical Inference: Algorithms, Evidence, and Data Science
- The Elements of Statistical Learning
- Machine Learning Yearning
WEKA:
WEKA is a tool that provides a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rule mining, and visualization.
- Data Mining in WEKA
- How to create & load data set in Weka
- Weka Tutorial 01: ARFF 101 (Data Preprocessing)
- How to Perform Feature Selection With Machine Learning Data in Weka
- Feature Selection to Improve Accuracy and Decrease Training Time
- Feature Selection with the Caret R Package
- Interpreting Results and Accuracy in Weka
- Class Balancing
- Text Classification with Weka using a J48 Decision Tree
- Weka Tutorial 19: Outliers and Extreme Values (Data Preprocessing)
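Several of the tutorials above start from an ARFF file, the plain-text data set format that Weka loads. As a rough illustration of what such a file looks like, here is a minimal Python sketch that writes one. The helper name `to_arff` and the tiny weather data set are made up for this example, and the sketch assumes attribute names and values contain no spaces or commas (values that do would need quoting).

```python
def to_arff(relation, attributes, rows):
    """Render a small data set as ARFF text.

    attributes: list of (name, kind) pairs, where kind is either the
    string "numeric" or a list of allowed nominal values.
    rows: iterable of tuples, one value per attribute.
    """
    # Header section: the relation name, then one line per attribute.
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, str):
            # e.g. "@attribute temperature numeric"
            lines.append(f"@attribute {name} {kind}")
        else:
            # Nominal attribute: list the allowed values in braces.
            lines.append(f"@attribute {name} {{{','.join(kind)}}}")
    # Data section: one comma-separated line per instance.
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical example data, invented purely for illustration.
arff_text = to_arff(
    "weather",
    [
        ("temperature", "numeric"),
        ("outlook", ["sunny", "rainy"]),
        ("play", ["yes", "no"]),
    ],
    [(30, "sunny", "no"), (18, "rainy", "yes")],
)
print(arff_text)
```

Saving the printed text to a file with an `.arff` extension should give a file that can be opened in Weka. The tutorials listed above cover the format in more detail.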
SPSS:
SPSS is a software package used for statistical analysis of data.
- Normality test using SPSS: How to check whether data are normally distributed
- Nonparametric Tests on SPSS
EndNote:
EndNote is reference management software that integrates with MS Word to cite the resources that have been used in your research.
MS Word:
Needs no introduction: a word processor for writing documents.
APIs:
List of APIs that can help fetch data related to your text.