Python for Data Analysis

See it on Amazon.

I’ve just read Python for Data Analysis by Wes McKinney, the creator of Pandas. As a programmer, I think that data analysis is becoming more and more of a required skill. Python seems well suited to this task, and I had wanted to explore this language for a long time. This book was the perfect occasion.

What’s inside?

The first three chapters help you set up your working environment, along with the required libraries and tools, then walk you through some basic operations such as loading and filtering data. Before reading this book, I had no prior experience with Python at all. However, the simplicity of the language, the samples, and a nice appendix on Python language essentials were the only things required to get started!

While exploring this book, and installing Enthought Canopy (the book was not updated for the new name of EPDFree), I discovered that Enthought also delivers a Python plugin for Excel. I’m curious to test it in conjunction with Excel PowerBI Tools (PowerQuery, etc…).

I didn’t take the time to read the last chapters (Financial & Economic Data, Advanced NumPy), as I don’t need them for now. My next read in this space will be another O’Reilly book, Think Stats. I feel that I need a refresher on statistics, and this book seems great on that topic.

Chapters 6 and 7 focus on getting your data ready for analysis. I was quite surprised to see how easy it is to parse CSV, tab-delimited or JSON data in Python.
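
To give an idea of what this parsing looks like, here is a minimal sketch. It is not taken from the book, which relies on Pandas for this; I use only the standard library here so that it runs without installing anything. The sample data and the `parse_delimited` helper are made up for the illustration.

```python
import csv
import io
import json

# Hypothetical sample data, made up for this example.
CSV_TEXT = "city,temp\nParis,21\nOslo,14\n"
TSV_TEXT = "city\ttemp\nParis\t21\nOslo\t14\n"
JSON_TEXT = '[{"city": "Paris", "temp": 21}, {"city": "Oslo", "temp": 14}]'


def parse_delimited(text, delimiter=","):
    """Return one dict per data row, keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


csv_rows = parse_delimited(CSV_TEXT)
tsv_rows = parse_delimited(TSV_TEXT, delimiter="\t")
json_rows = json.loads(JSON_TEXT)

# The csv module yields strings, so numeric columns need an explicit conversion.
csv_temps = [int(row["temp"]) for row in csv_rows]
```

Only the delimiter changes between the CSV and tab-delimited cases. JSON keeps its types, so `temp` comes back as a number without any conversion.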

Chapter 8 shows you how to visualize your data with matplotlib. Even if the graphic style of the charts generated by the library is not as polished as I would wish, plotting is quite easy. The book references other solutions available within the Python ecosystem, but I didn’t take the time to investigate them.
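
As a rough illustration of how little code a basic chart needs, here is a small sketch of a line plot. It is my own example rather than one from the book, the values are made up, and it assumes matplotlib is installed. I select the non-interactive `Agg` backend so that it also runs on a machine without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Made-up values, for illustration only.
months = [1, 2, 3, 4]
values = [12, 18, 9, 22]

fig, ax = plt.subplots()
ax.plot(months, values, marker="o")
ax.set_title("Hypothetical monthly values")
ax.set_xlabel("Month")
ax.set_ylabel("Value")

# Render to an in-memory PNG; passing a file name to savefig would write a file instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```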

Bottom line

This book is a great way to get started with data analysis in a fairly common language. All the basic operations are quite easy to perform. It also covers how to get your data in and out, which is really useful for reusing your computation results in other software (like Excel). Even if you don’t have prior knowledge of Python, it’s a great way to play with your data!

Disclosure: O’Reilly sent me a free copy of this ebook in exchange for this review.

Written on October 3, 2014