Software Engineering for Data Scientists: From Notebooks to Scalable Systems

Original price was: $69.99. Current price is: $45.00.

Data science happens in code. The ability to write reproducible, robust, scalable code is key to a data science project’s success, and is absolutely essential for those working with production code. This practical book bridges the gap between data science and software engineering, and clearly explains how to apply the best practices from software engineering to data science.

Examples are provided in Python, drawn from popular packages such as NumPy and pandas. If you want to write better data science code, this guide covers the essential topics that are often missing from introductory data science or coding classes, including how to:

Understand data structures and object-oriented programming
Clearly and skillfully document your code
Package and share your code
Integrate data science code with a larger code base
Learn how to write APIs
Create secure code
Apply best practices to common tasks such as testing, error handling, and logging
Work more effectively with software engineers
Write more efficient, maintainable, and robust code in Python
Put your data science projects into production
And more

From the brand

Sharing the knowledge of experts

O’Reilly’s mission is to change the world by sharing the knowledge of innovators. For over 40 years, we’ve inspired companies and individuals to do new things (and do them better) by providing the skills and understanding that are necessary for success.

Our customers are hungry to build the innovations that propel the world forward. And we help them do just that.

Publisher: O’Reilly Media
Publication date: May 21, 2024
Edition: 1st
Language: English
Print length: 257 pages
ISBN-10: 1098136209
ISBN-13: 978-1098136208
Item Weight: 2.31 pounds
Dimensions: 7 x 0.5 x 9.5 inches
Best Sellers Rank: #164,456 in Books; #22 in Data Mining (Books); #25 in Software Testing; #42 in Computer Systems Analysis & Design (Books)
Customer Reviews: 4.5 out of 5 stars (26)

Reviews

10 reviews for Software Engineering for Data Scientists: From Notebooks to Scalable Systems

  1. Joe Faith

    The Missing Manual for Early Career Data Scientists
    As a professor who regularly teaches courses in AI, computer science, data science, and information systems, I can’t overstate how important this book is for anyone preparing for a career in data science. I work with students from a variety of backgrounds, and I constantly see the same gap: solid technical skills, but limited exposure to practical software engineering best practices. This book fills that gap brilliantly. The author’s approach is refreshingly accessible and practical. The book doesn’t just teach you how to write code; it teaches you how to write reproducible, robust, maintainable code that will actually work in a production environment. The focus on real-world Python examples (using pandas, NumPy, and other core packages) ensures that readers can apply these lessons immediately. What I especially appreciate as an educator is that the book meets students where they are. Whether you’ve just finished your degree, are transitioning from another field, or have taught yourself the basics, this book will make you a more effective and confident practitioner. Even more experienced data scientists will find valuable guidance on working with larger codebases and collaborating with software engineers. In my opinion, every early career data scientist should read this book, preferably before they start their first industry job. It’s packed with practical advice on documentation, packaging, testing, error handling, security, and more. These are the skills that set apart great data scientists from the rest. If you’re a student, a recent graduate, or even a working professional looking to up your game, this book deserves a spot on your desk. I’ll be recommending it to all my students, and I strongly suggest you don’t start your career without it.

  2. Luis

    Practical and highly recommended for growing Data Scientists
    As a machine learning engineer with four years of experience working in a small data science team, I found Software Engineering for Data Scientists to be incredibly relevant and useful. The book addresses the real, day-to-day coding challenges we face and offers practical solutions that can be applied right away. It’s especially helpful for those of us looking to write cleaner, more maintainable code and adopt more structured development practices. I also appreciated how it points you to additional software engineering resources for continued growth. Ideal for anyone in data science or ML looking to elevate their coding practices.

  3. Azathoth

    Decent for a beginner
    I’ve been a data scientist for about 4 years and have worked with a lot of colleagues that are terrible at coding and best practices in DS. I really was hoping this book was more in depth. If someone is new to data science and does not know what a linter is and is a poor programmer, this book is a good one to read. Unfortunately I didn’t learn anything from this book so it’s targeted to beginners or people learning DS.

  4. Naif A. Ganadily, MSEE

    A True Software Engineering Guide for Data Scientists
    I have been meaning to write a review for a while now. To give you an answer if you should or should not buy this book as a Real Guide to Software Engineering as a Data Scientist, my answer is (YES, YOU SHOULD BUY THIS BOOK!). My background is straightforward. I graduated with a BS in Electrical Engineering (Electronics and Telecommunications), where I was mainly on the hardware side of engineering, then transitioned to an MS in Electrical Engineering (Specialized in AI), where all my coursework and projects were in Artificial Intelligence, where I learned how to be a Data Scientist and an AI Engineer, so you can see I skipped Software Engineering. The moment I knew that I struggled with these essential skills was during my first post-master’s job as an AI Consultant, where I was having a lot of issues with production-ready code, large codebases, testing, etc. This book saved my career. It made me more of a software engineer, which, in my opinion, should be the foundational skill before entering the data science field. Now, back to the book: I found it easy to understand and follow. It was insightful about which tools to use, when to use them, and how to use them during the coding and scripting process of production-ready projects. The only issue I have with the book is not with the author but with the publishing company O’Reilly. Some books are colored, and some are not, which I find inconsistent with publishing best practices.

  5. Dylan

    Great book!
    Heard about the book on the MLOps podcast and decided to buy it, which I overall enjoyed!

  6. jeff

    Highly recommended especially for a new ds
    Outstanding book! If you are a new ds you will benefit from this book tremendously.

  7. Travis Strawn

    Covers all the important topics succinctly and to the point. Great book!
    Data analysts and other data professionals should read this book, unless they are well-experienced software engineers as well. It covers a lot of essential ground that’s all very important to know. Highly recommend it!

  8. Rob

    Extremely clear, great way to brush up or learn the fundamentals of SWE for Data Scientists
    This book will give you a very clear exposition of key SWE concepts to improve your coding as a data scientist, and to better understand how to work with other software engineers. It won’t teach you everything you need to know to be a SWE or become an expert Python programmer. The concepts are laid out clearly and intuitively, with a lot of care to explain why, not just what. The examples are also easy to follow and illustrative, and are very DS oriented.

  9. Jéssica Mello

    I came from Electrical Engineering with some Data Science background in academia, then I worked as an ML Engineer for some time. I can say that this book provided me with valuable and practical insights, which can be used in different situations to tackle a bunch of “quality problems” in software. Also, the libraries suggested are a must. The insights range from “security” to “performance”. This book provides objective information without being vague – it follows necessary scientific methodology and good practices. I really recommend it.

  10. Denis Burakov

    Initially came across this book through an early release, yet the full book experience was somewhat lacking. I was expecting this book to be a companion to data scientists transitioning to ML engineering roles, but it was a bit more like a collection of tips for improving code and project quality. To give the author full credit, it is a well structured book with a lot of helpful examples. If you already have extensive applied experience, you would probably want to read the DevOps for Data Science book.
