Python Data Visualization Libraries
The two tools I use for data visualisation are Excel and Tableau. They are easy to share with most business users. They have been my go-to tools. But now I want to make a shift. They're not as efficient when exploring data and understanding the data I'm dealing with. Building charts in either of these can be time-consuming. That's why I'm now going to turn to Python for quicker data visualisation.
You might think, "Aren't Excel and Tableau quick for basic charts?" True, for simple visuals, they work fine. But the process is still slower than I'd like. I know using Python will be slower at first, until I get the hang of it. Once I'm familiar with it, I should be able to whip up a chart much more quickly than in Excel or Tableau. And if I have to refresh the data, it only takes a single click to re-run the code. As you can imagine, that's a significant time-saver for repetitive tasks.
So, I've been digging into Python's data visualisation libraries. I wanted to know which ones are worth learning and in which sequence. Hence, in this post, I'll share with you:
- The Six Common Python Data Visualization Libraries
- The Learning Path I've Charted for Myself
The Six Common Python Data Visualization Libraries
There are many data visualisation packages out there. It can be overwhelming, but I've done some of the legwork, so here are six Python libraries that are pretty much the go-to tools for creating visuals:
- Matplotlib
- Seaborn
- Bokeh
- Plotly Express
- Plotnine
- Pandas
I have had exposure to some of these libraries, primarily through the Google Advanced Data Analytics Professional Certificate course I'm taking. The ones I've used there are staples of the data viz scene.
Let's dive into what makes each of these libraries stand out.
Matplotlib
This is like the OG (the original) of Python visualisation tools. It was first released in 2003, so it's pretty old. It's super versatile. It allows users to create just about any static chart one can think of. The learning curve can be steep, but once the user gets the hang of it, they can customise the charts to their heart's content, or so they say!
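To give a feel for the style, here's a minimal sketch of a basic Matplotlib line chart. The sales numbers and the output file name are made up for illustration, and I've set the "Agg" backend only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Made-up monthly sales figures, purely for illustration
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 150, 142]

# Create a figure and a single set of axes, then draw on the axes
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(months, sales, marker="o", color="tab:blue")

# Every element (title, labels, grid) is set explicitly
ax.set_title("Monthly sales (sample data)")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
ax.grid(True, alpha=0.3)

# Save the chart to a PNG file (hypothetical file name)
fig.savefig("sales_line.png", dpi=100)
```

Notice how much is spelled out by hand. That's the trade-off: more typing, but control over every element.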
Seaborn
Seaborn is like Matplotlib's stylish cousin. It builds on Matplotlib but makes users' charts look better with less code. It's also great for statistical data and can make complex visualisations much more digestible. Plus, it integrates nicely with Pandas data structures. Pandas is a Python library for data manipulation and analysis.
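As a rough illustration of the "less code" point, here's a small sketch that goes straight from a Pandas DataFrame to a styled bar chart. The DataFrame and its column names (region, sales) are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd
import seaborn as sns

# Made-up data: two sales observations per region
df = pd.DataFrame({
    "region": ["North", "North", "South", "South", "East", "East"],
    "sales": [120, 135, 90, 110, 150, 142],
})

# One call applies a clean default style to all subsequent charts
sns.set_theme(style="whitegrid")

# barplot aggregates for us: by default it shows the mean of "sales"
# per region, with error bars for the spread
ax = sns.barplot(data=df, x="region", y="sales")
ax.set_title("Average sales by region (sample data)")
```

The column names are passed as strings and Seaborn handles the grouping, the averaging and the axis labels.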
Bokeh
Bokeh is all about interactive and web-ready visuals. It lets you build complex dashboard-type visualisations that users can interact with directly in their web browsers. It's rather code-intensive compared to other libraries.
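For a taste of what that looks like, here's a minimal sketch that builds an interactive line chart and renders it to a standalone HTML string. I haven't worked with Bokeh in depth, so treat this as a sketch; the data and the page title are made up.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Made-up monthly sales figures, purely for illustration
month_numbers = [1, 2, 3, 4]
sales = [120, 135, 150, 142]

# The tools string controls which interactions the viewer gets
p = figure(
    title="Monthly sales (sample data)",
    x_axis_label="Month number",
    y_axis_label="Units sold",
    tools="pan,wheel_zoom,hover,reset",
)

# Each glyph is added separately: a line, then markers on top
p.line(month_numbers, sales, line_width=2)
p.scatter(month_numbers, sales, size=8)

# Render the chart as a self-contained web page (JS loaded from a CDN)
html = file_html(p, CDN, "Sales chart")
```

The resulting html string can be written to a file and opened in any browser.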
Plotly Express
Similar to Bokeh, Plotly Express is great for interactive charts. The good news is that it is more user-friendly and easier to learn than Bokeh. It's part of the larger Plotly ecosystem and lets users create some sleek and dynamic visuals without getting too bogged down in code. Yeah!
Plotnine
This is inspired by R's ggplot2 (the "gg" stands for grammar of graphics). For those who don't know, R is a programming language for statistical computing. It is widely used in academia. R's ggplot2 is a popular data visualisation package. If you come from R, then you might like this package. Its syntax can feel unfamiliar if you're coming from a non-R background. Apparently, it's powerful. It allows you to build a plot piece by piece, much like Lego, where each block is an element of the plot.
Pandas
Yes, for those who know, Pandas is mainly for data manipulation and analysis, but it also has some handy plotting capabilities. It's great for quick and dirty charts when the user is knee-deep in data wrangling and just needs a quick look at the data.
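A quick-look chart can be a single line on a DataFrame that already exists. Here's a minimal sketch with made-up data; behind the scenes, Pandas draws the chart with Matplotlib by default.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import pandas as pd

# Made-up monthly sales figures, purely for illustration
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "sales": [120, 135, 150, 142],
})

# The plot method hangs directly off the DataFrame
ax = df.plot(
    x="month",
    y="sales",
    kind="bar",
    legend=False,
    title="Monthly sales (sample data)",
)
```

Because the chart code lives next to the data wrangling code, refreshing the chart after a data update is just a matter of re-running the script.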
The Learning Path I've Charted for Myself
Alright, let's talk about the learning journey I've set out for myself with these Python libraries. I want to find the right balance. I don't want to try to tackle too much at once. I still have a few other things I'm working on.
Starting off, I'm leaning towards Seaborn. It's friendly with Pandas DataFrames (these are like tables in Excel), which is great since I'm pretty comfortable with Pandas already. Seaborn is a smoother, more straightforward intro to visualisation compared to Matplotlib.
Once I've got a good grip on Seaborn, I plan to dive into Matplotlib. Or maybe even learn a bit of it when I'm learning Seaborn. This is because Seaborn is built on Matplotlib, and I think learning side-by-side can help me understand how they complement each other.
Next up will be Plotly Express. I don't think I need to learn this in the short term for my current purpose, which is to shorten the time required for exploratory data analysis. However, I do like good interactivity. Zooming, hovering, and interacting with charts add a whole new dimension to data visualisation.
I'm not going to touch Bokeh and Plotnine. Bokeh's complexity seems like it might be more than I need at the moment. Plus, I can build interactive charts with Plotly Express much more quickly, so I don't see the need to learn Bokeh. As for Plotnine, since it is closely tied to R's way of doing things, and I have no interest in picking up R, it doesn't make much sense for me to learn this package.
So, here's the plan in a nutshell:
- Start with Seaborn for easy and efficient charting
- Move on to Matplotlib for more in-depth customisation
- Explore Plotly Express to add some interactive flair to my visualisations
Wrapping up, I'm genuinely excited to start cranking out charts with Python. Sure, there's a bit of a learning curve, but the payoff (quickly generating insightful visuals without getting bogged down in the manual work) sounds like it'll be well worth it. Here's to less monotony and more fun in exploratory data analysis!