The Best Introductory Data Science Books
- An Introduction to Statistical Learning
- Numsense! Data Science for the Layman: No Math Added
- Doing Data Science: Straight Talk From the Front Line
- Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie or Die
- Weapons of Math Destruction
Because data science is an interdisciplinary field that spans many separate and seemingly unrelated topics, choosing the best books for getting started can be confusing. Data science fuses statistics, mathematics, probability, machine learning, predictive modeling, and computer programming. The mix can also include a healthy dose of business, economics or politics, depending on the organization the data scientist works for.
Anyone who can grasp these topics and how they fit together can use that combined knowledge to solve some of the most compelling problems that governments, businesses and civic organizations face. The following five data science books are all useful for building this sort of understanding.
Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
This book is at the top of our list of best introductory data science books because it is one of the most intuitive texts available on statistical learning methods and how to apply them in R. Four statistics professors wrote this classic text, which was first published in 2013 and updated in 2017. In-depth knowledge of upper-level mathematics is not necessary to understand this book, so it’s an excellent starting point for readers in a variety of disciplines who need to understand data science.
Kenneth Soo and Annalyn Ng
This book is an introductory text on machine learning and predictive analytics for business. It is ideal for readers who want to understand data science without getting bogged down in math, statistics or probability. Its strong point is the clear examples it gives of commonly used models for business situations. It is not the best choice for readers who need step-by-step tutorials on how to carry out the predictive analysis it discusses.
Cathy O’Neil and Rachel Schutt
This book is based on the Introduction to Data Science class taught at Columbia University. The pages are packed with real-world algorithms and code used by tech giants such as Google, Microsoft, and eBay. Some upper-level mathematical expertise is required to get the most out of this book.
Eric Siegel
This widely read book demystifies the little-understood topic of predictive analytics. Professors at more than 30 universities have recommended it to their students, and the book is available in 12 languages. The author is a former Columbia University professor, a well-known authority on the topic and a founder of the Predictive Analytics World conferences.
Cathy O’Neil
This book is very different from the others listed above: it is a warning about how dangerous the misuse of data science can be. Most laypeople don’t understand math or statistics well enough to comprehend the ways their data is being used, and possibly misused. This book’s value stems mainly from the fact that its author, a mathematics expert with considerable knowledge of the field, has communicated her alarm about what she knows. No prior knowledge of mathematics or data science is required to understand the concepts covered in this book.
Mastering the information in these books will teach readers the core concepts that are central to understanding data science. These topics include mathematics, statistics, linear algebra, computer programming, probability and machine learning. Anyone who’s interested in learning more about data science will benefit from reading these five books.