Having worked for the past 7 years in a role where I present data visually, I have come across a poorly presented chart or two in my time. I still think back to the visuals I presented at the beginning of my career. My goal now is to help others create beautiful visualizations.

This article aims to help you improve the quality of your Python Plotly charts, assuming you are already somewhat familiar with the language. Read on to take your graphics to the next level.

On my own path to better visuals, two books by Cole Nussbaumer Knaflic, ‘Storytelling with Data: A Data Visualization Guide for Business Professionals’ and ‘Storytelling with Data: Let’s Practice!’, have been exceptional sources of information. I would recommend them to any professional who creates charts on a regular basis. The two books complement each other well. Both clearly state what makes a compelling chart and walk through the process step by step, explaining each change.

In this Plotly tutorial, I use the step-by-step iterative method practiced in the books and tailor the guide specifically to Python Plotly. I use a dataset of car features that can be downloaded from my Git repository or from Kaggle. The rest of the code in this tutorial is also in the Git repository.

We will go through six different versions of the same chart, starting with the most rudimentary chart Plotly provides out of the box and ending with a customized, annotated chart. The key changes are explained at each stage.

Plotly offers two interfaces in Python:

Plotly Express – a higher-level graphics package that is easier and faster to use than Graph Objects, but with fewer features.

Plotly Graph Objects (GO) – a lower-level graphics package that usually requires much more code but is much more customizable.

This tutorial focuses only on building a scatter plot with Plotly Graph Objects. This package offers a freedom to change any part of the chart that Plotly Express does not.

The question we want to answer with the data is:

How does the ratio of horsepower to weight vary by origin?

It is worth mentioning how Plotly structures its graph objects. An easy way to think about it is as a chart built up in four layers.

  • Layer 1 – Create an empty chart object.
  • Layer 2 – Add data points.
  • Layer 3 – Customize your visuals.
  • Layer 4 – Label your visuals.

Work through the layers in order, starting with the first.

In layer 1, create an empty chart object with go.Figure().

In layer 2, use add_trace(...) to add a layer of data points to the blank canvas. You make all adjustments to the data points in this command, such as color, size, and transparency. For more advanced charts, you will almost certainly need to add multiple layers of data points to the same chart (more on this later).

Car weight versus horsepower code – iteration 1 Image by the author
Car weight against horsepower – iteration 1 Image by the author

In the chart above, you’ll notice a couple of things. Plotly Graph Objects does not add a chart title or axis titles by default. They need to be defined separately, which is explained in the next step.

One of the most critical concepts in data visualization is clearly highlighting the displayed data points and reducing the noise caused by less essential elements. For example, I usually use a white background for my plots and a light gray color for the axis titles and text.

For this iteration, we make styling changes in layer 3, using update_layout(...) to fully customize the appearance of the chart frame. Here you update everything that is not a data point: titles, axes, tick formatting, gridlines, and so on.

For this iteration, we are going to make the following changes:

  • Add chart title and axis titles
  • Format the y-axis ticks so they are easier to read
  • Make the units of measurement on each axis clear
  • Change the color of the axis line from white to light gray
  • Change the background color to white
  • Change the color of all text on the chart to light gray
  • Change the color of the data points to a darker blue to make them stand out more
Car weight against horsepower – iteration 2 Image by the author

You will also notice that, instead of passing the color value directly to the color argument, I created a new column in the data frame to hold it. We then convert that column to a list and pass only its first element. This may seem like a long way round, but it is much easier when the color column holds multiple categories, as required in the next step.

Visualizing categorical data in Plotly Graph Objects is not as straightforward as passing a variable to a color parameter, as you can in Plotly Express. Instead, we need to create a separate layer of data (or trace) for each value of the variable we want to group by, and overlay them all on the same chart.

To create separate groupings for a car’s origin variable, we need to produce a different layer of data points for each of the following values: USA, Europe, and Asia. This can be easily achieved with a for loop that iterates over each unique value of the Origin column: for area in df["Origin"].unique():. On each pass through the loop, a new data frame is created that contains the data for only one category: df_plot = df.loc[df["Origin"] == area].copy().

Since we are presenting categories in the chart, we need to pass the name of each category to the name argument so that the legend is labeled correctly.

For this iteration, we plan to make the following changes:

  • Divide the data points into categories based on the origin variable
  • Choose a color for each value in the Origin column
Car weight against horsepower – iteration 3 Image by the author

The chart feels pretty busy, especially in the lower left corner. We will address this in the next iteration by displaying each category in its own subplot.

Currently, it is not easy to distinguish between the trends in each category. We will create subplots, which means we have to pass the x- and y-axis settings to update_xaxes(...) and update_yaxes(...) instead of update_layout(...). This is because there are now three x-axes and three y-axes, and the previous code would have updated only the first x and y pair. Alternatively, we could update each axis individually with xaxis2 = dict(...), yaxis2 = dict(...) and so on.

For this iteration, we are going to make the following change:

  • Create a subplot for each origin category
Car weight against horsepower – iteration 4 Image by the author
