2. Design Principles for Visualizations

The data visualization pipeline has four key steps: import the data, filter it, map it to visual representations, and render those representations.

/viz-intro

You’ve spent time with data manipulation, but what is data mapping in the context of visualization? Data mapping, also known as visual encoding, is the assignment of data attributes to visual properties such as position, size, or color.

/viz-intro

Image from “Visualization Analysis and Design” by Tamara Munzner

There are many guidelines for choosing a visual encoding for data attributes. For example, the image above shows that some encodings work better for categorical data (in this case, hue and shape) while others work better for ordered data (in this case, saturation, luminance, size, angle, curvature, and motion).
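The guideline above can be written down as a small lookup. The Python sketch below is only an illustration: the function name `suggest_channels` and the two attribute-type labels are assumptions made for this example, and the channel lists simply restate the ones named in the paragraph above.

```python
# Channels named in the text above, grouped by the kind of data they suit.
CHANNELS_BY_ATTRIBUTE_TYPE = {
    "categorical": ["hue", "shape"],
    "ordered": ["saturation", "luminance", "size", "angle", "curvature", "motion"],
}


def suggest_channels(attribute_type: str) -> list[str]:
    """Return candidate visual channels for a categorical or ordered attribute."""
    key = attribute_type.strip().lower()
    if key not in CHANNELS_BY_ATTRIBUTE_TYPE:
        raise ValueError(f"unknown attribute type: {attribute_type!r}")
    # Return a copy so callers cannot modify the shared lookup table.
    return list(CHANNELS_BY_ATTRIBUTE_TYPE[key])


print(suggest_channels("categorical"))
print(suggest_channels("ordered"))
```

A lookup like this is a starting point rather than a rule: it narrows the options, and the final choice still depends on the purpose of the chart.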

If you are creating a visualization and are not sure what kind of encoding to use, many resources are available, including charts that help you choose an encoding based on the type of data you have and the purpose of the visualization. Look, for example, at the Abela Chart Chooser. It starts with the question “what do you want to show?” and offers four options: comparison, relationship, distribution, and composition. Once you select one of these directions, it provides example charts based on the nature of the data.

/viz-intro
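The first decision in a chart chooser can be sketched as a lookup from purpose to candidate charts. In the Python sketch below, the four purposes come from the text above, while the function name `suggest_charts` and the chart lists are assumptions: they are an illustrative subset picked for this example, not a reproduction of the Abela Chart Chooser.

```python
# Purposes are taken from the text; the chart lists are an illustrative
# subset assumed for this sketch, not the full chooser.
CHART_SUGGESTIONS = {
    "comparison": ["bar chart", "column chart", "line chart"],
    "relationship": ["scatter plot", "bubble chart"],
    "distribution": ["histogram", "scatter plot"],
    "composition": ["stacked column chart", "pie chart"],
}


def suggest_charts(purpose: str) -> list[str]:
    """Return example chart types for one of the four purposes."""
    key = purpose.strip().lower()
    if key not in CHART_SUGGESTIONS:
        options = ", ".join(sorted(CHART_SUGGESTIONS))
        raise ValueError(f"unknown purpose {purpose!r}; expected one of: {options}")
    return list(CHART_SUGGESTIONS[key])


print(suggest_charts("comparison"))
```

The real chooser goes on to ask about the nature of the data (for example, how many variables or categories there are), which this sketch leaves out.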

You can easily find other resources online that provide galleries of existing charts.

/viz-intro
/viz-intro

(See, for example, data-to-viz and chart guide)

Note, however, that even when a visual encoding suits your data and purpose, it is not always the best choice. This is often the case with pie charts. People use pie charts to show data as percentages of a whole, but pie charts only work well for comparing 2 to 3 data points whose values are very different. They are poor for comparing arbitrary segments.

Consider the following example. Suppose the following 3 pie charts show the votes for 5 candidates at 3 polling stations. Notice how the differing orientation of the pie wedges makes the 5 candidates difficult to compare.

/viz-intro

But when the same data is plotted as simple bar charts, the differences between the candidates and polling stations are instantly apparent.

/viz-intro

Likewise, if you look at the following chart, how much bigger is Pinot Grigio than Tempranillo?

/viz-intro

Notice that to answer the question, your eyes have to move to the legend and then find the corresponding pie slices in the chart. Then you have to compare the sizes of the slices while accounting for their differing orientations.

The estimation is easier with a simple bar chart (Pinot Grigio is about 3 times bigger).

/viz-intro
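To see why bar length makes the ratio easy to read, here is a minimal Python sketch that draws a text bar chart. The function name `text_bar_chart` and the numeric values are hypothetical: the values are made up so that the ratio is 3, matching the estimate above, and are not taken from the chart.

```python
def text_bar_chart(values: dict[str, float], width: int = 30) -> str:
    """Draw one row of '#' per item, with length proportional to its value."""
    largest = max(values.values())
    label_width = max(len(name) for name in values)
    lines = []
    for name, value in values.items():
        bar = "#" * round(width * value / largest)
        lines.append(f"{name:<{label_width}} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical values chosen only so that the ratio is 3.
wine = {"Pinot Grigio": 30, "Tempranillo": 10}
print(text_bar_chart(wine))
print("ratio:", wine["Pinot Grigio"] / wine["Tempranillo"])
```

Because both bars start at the same baseline and the labels sit next to them, the comparison needs no legend lookup and no mental rotation of wedges.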

The ideas behind selecting a visual encoding are based on experiments in human perception. In this kind of research, scientists measure the processes at work in the human visual system and in human memory. So let’s look at that briefly.

The figure below (from Alberto Cairo’s “The Functional Art”) depicts how your brain processes visual information.

/viz-intro

When you look at a visualization, the image is first held in “Iconic Memory”. Iconic memory is a short-term buffer and processor that maintains a coherent picture of the world at all times. It also perceives basic visual attributes such as shape, edges, relative size, and patches of color. These are referred to as pre-attentive attributes, meaning you don’t have to think hard to see them. If you know what the brain processes pre-attentively, you can use that to make important data in your visualizations stand out to the user.

Information from iconic memory is passed to visual working memory, which is also short-term storage (it holds about 5 +/- 2 things at a time). Lastly, long-term memory kicks in and associates the contents of short-term memory with what you already know, which enables comprehension of what you are seeing.

Therefore, a good visualization is likely to:

  1. take advantage of pre-attentive visual attributes to make important features in the data stand out.
  2. not require visual working memory to hold more than 5 +/- 2 things at a time (e.g. you don’t want the user to have to refer to more than about 5 encodings to make sense of something).
  3. associate the data with something the user is likely to be familiar with from past experience (i.e. relate the data to something in their long-term memory).
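The second point can be turned into a quick sanity check. The Python sketch below counts the distinct encodings a reader must keep track of (for example, legend entries) and returns a warning when there are too many. The function name `check_encoding_count` and the default limit of 5 are assumptions for this example; the limit reflects the rough 5 +/- 2 figure above and is not a hard threshold.

```python
from typing import Optional


def check_encoding_count(encodings: list[str], limit: int = 5) -> Optional[str]:
    """Return a warning if readers must track more than `limit` encodings."""
    distinct = set(encodings)
    if len(distinct) <= limit:
        return None
    return (
        f"{len(distinct)} encodings in use; "
        f"readers may struggle beyond about {limit}"
    )


legend = ["red", "blue", "green", "orange", "purple", "brown", "pink"]
print(check_encoding_count(legend))
print(check_encoding_count(legend[:3]))
```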

For example, consider this visualization meant to show that a wind turbine is 853 ft tall.

/viz-intro

It uses the pre-attentive color attribute to draw your attention to the wind turbine, and it compares the turbine to other known structures such as the Statue of Liberty and the Empire State Building. You only need to grasp the turbine in comparison to a handful of the other items to “get” the story (you only look at about 5 of these at a time).

Pre-attentive visual attributes are those that are processed in sensory memory without conscious thought. It takes our brain less than half a second to process a pre-attentive property of an image. Four basic visual properties can be defined as pre-attentive: Form, Movement, Spatial Positioning, and Color.

Examples of Form include: orientation, curvature, length, width, added marks, numerosity, shapes, size, and spatial grouping.

/viz-intro

Examples of Movement include: flicker, velocity, and direction.

/viz-intro
/viz-intro
/viz-intro

Examples of Spatial Positioning include:

/viz-intro

We often use the word Color to mean a combination of: hue, saturation and value.

Hue refers to the base color we perceive (red, green, blue, etc.). Saturation describes the purity or vividness of a hue. Value describes the lightness or darkness of a hue.

/viz-intro
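The three components can be explored with Python’s standard `colorsys` module, which converts between RGB and HSV with every component in the range 0 to 1. The sketch below starts from pure red and lowers saturation and value separately; the variable names are chosen only for this example.

```python
import colorsys

# Pure red: hue 0.0, fully saturated, full value.
hue, saturation, value = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
print(hue, saturation, value)  # 0.0 1.0 1.0

# Same hue and value, half the saturation: a paler, pinkish red.
pale_red = colorsys.hsv_to_rgb(hue, 0.5, value)
print(pale_red)  # (1.0, 0.5, 0.5)

# Same hue and saturation, half the value: a darker red.
dark_red = colorsys.hsv_to_rgb(hue, saturation, 0.5)
print(dark_red)  # (0.5, 0.0, 0.0)
```

Lowering saturation moves the color toward white or gray, while lowering value moves it toward black; the hue stays the same in both cases.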

Color is a particularly tricky pre-attentive attribute. How well do people identify specific shades of colors? This is a question of resolving power: how well can we tell apart the data values associated with two very similar colors? One study compares the mean error when users view data visualizations using a variety of color mapping schemes [Bujack2017]. As summarized in the figure below, the results of the study suggest that Blue-Orange Divergent provides the most resolving power.

/viz-intro
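For readers unfamiliar with the term, a divergent (diverging) scheme maps values below a midpoint toward one hue and values above it toward another. The Python sketch below builds such a mapping by linear interpolation in RGB. The function name `diverging_color` and the three RGB anchor colors are assumptions picked for illustration; they are not the colormap evaluated in [Bujack2017], and interpolating in RGB is a simplification.

```python
RGB = tuple[float, float, float]

# Hypothetical anchor colors (RGB components in 0..1).
BLUE: RGB = (0.20, 0.35, 0.75)
NEUTRAL: RGB = (0.95, 0.95, 0.95)
ORANGE: RGB = (0.90, 0.50, 0.10)


def _mix(a: RGB, b: RGB, weight: float) -> RGB:
    """Blend a toward b; weight 0 gives a, weight 1 gives b."""
    return tuple(x * (1 - weight) + y * weight for x, y in zip(a, b))


def diverging_color(t: float) -> RGB:
    """Map t in [-1, 1] to a color: blue at -1, neutral at 0, orange at +1."""
    t = max(-1.0, min(1.0, t))  # clamp out-of-range values
    if t < 0:
        return _mix(NEUTRAL, BLUE, -t)
    return _mix(NEUTRAL, ORANGE, t)


for t in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(t, diverging_color(t))
```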

In addition to guidelines based on research into visualization and perception, we should also look at graphic design guidelines and principles, which are relevant whenever you design anything with a visual component.

/viz-intro

Rules of graphic design guide people’s attention in specific ways that help you tell a story.

/viz-intro

Gestalt principles describe how we group and associate items together (e.g. a data point and its label), so we should study them to create better visualizations.

Be careful when you choose fonts for your visualizations…

/viz-intro

Some simple rules of thumb to help you choose fonts:

  1. Sans-serif fonts (e.g. Helvetica, Arial) are generally good for on-screen text.
  2. Serif fonts (e.g. Times, Georgia) are good for printed text (because print often has more resolution than computer screens).
  3. Monospace fonts (e.g. Courier) are good when you need exact alignment of text characters (e.g. source code or numbers in tables).
  4. Stylish fonts are best left for party invitations.
  5. Words in lower case are read faster than words in upper case.
  6. Individual letters and nonsense words like UA1138 are read faster in upper case.

Following graphic design concepts improves readability. Consider the following data tables (a table is still a visual representation). What differences between these tables can you identify?

/viz-intro
/viz-intro

In summary, for most simple charts, follow these guidelines:

  1. Choose an appropriate visual encoding
  2. Make sure all axes are well labeled, both with what each axis represents and with its units / categories
  3. Make sure there is a meaningful title
  4. Make sure there is a legend explaining what the various lines / points / colors mean
  5. Make sure all text labels are positioned near the items they describe
  6. Avoid 3D in simple charts
  7. Avoid fully saturated colors (they are usually unpleasant to look at)
  8. Choose the background color of the chart carefully (it should fit within its context and not clash with the colors you use for encoding; also consider color-blind viewers)
  9. Make sure items (lines, marks, etc.) can be differentiated from each other: look at the chart from afar and check whether you can tell them apart
  10. If you place values next to their data points, make sure they are readable (in color and size)
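Several of these guidelines can be checked mechanically. The Python sketch below is a minimal, hypothetical linter: the `ChartSpec` class and `review_chart` function are invented for this example and do not belong to any plotting library. It covers only the items that are easy to test (title, axis labels, legend, 3D, fully saturated colors); the others still need a human eye.

```python
import colorsys
from dataclasses import dataclass, field


@dataclass
class ChartSpec:
    """Hypothetical description of a chart, used only for this sketch."""
    title: str = ""
    x_label: str = ""
    y_label: str = ""
    has_legend: bool = False
    is_3d: bool = False
    # Encoding colors as RGB tuples with components in 0..1.
    colors: list[tuple[float, float, float]] = field(default_factory=list)


def review_chart(spec: ChartSpec) -> list[str]:
    """Return a list of guideline violations; an empty list means none found."""
    problems = []
    if not spec.title.strip():
        problems.append("missing title")
    if not spec.x_label.strip() or not spec.y_label.strip():
        problems.append("unlabeled axis")
    if not spec.has_legend:
        problems.append("missing legend")
    if spec.is_3d:
        problems.append("3D used in a simple chart")
    for r, g, b in spec.colors:
        _, saturation, _ = colorsys.rgb_to_hsv(r, g, b)
        if saturation >= 1.0:
            problems.append(f"fully saturated color {(r, g, b)}")
    return problems


draft = ChartSpec(title="Votes per candidate", is_3d=True, colors=[(1.0, 0.0, 0.0)])
for problem in review_chart(draft):
    print("-", problem)
```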