This critique of a visualization published by The Economist was written by responding to a series of questions about static visualization strengths and weaknesses.
1. List the source of the visualization. Include the creator and what you know of their background.
This visualization was published by The Economist on March 4, 2014 in an article entitled Of price and place – The cost of living around the world. It uses data from the Economist Intelligence Unit's Worldwide Cost of Living Index 2014 report. Its authors are listed as P.J.W. and A.C.M., but additional background information about these individuals was unavailable. The Economist itself regularly publishes visualizations that are generally considered credible and effective. The visualization was accompanied by the following description:
SAYONARA, Tokyo. Singapore is now the world's most expensive city, according to the bi-annual cost of living index from the Economist Intelligence Unit, our corporate sibling. The Singapore dollar's appreciation and high transport costs have propelled it to top spot. Tokyo and Osaka, which ranked first and second last year, have seen the biggest falls in costs because of a cheaper yen. The index is a weighted average of the prices of 160 products and services, with New York's figure set to 100 to provide a base for comparisons. Paris rose six places from last year, reflecting a recovery in European prices. Strikingly, Tehran experienced a steep rise in costs over the past five years as economic sanctions began to bite. Mumbai offers the best value for money. As for the seeming anomaly of why Caracas should be one of the priciest cities, it is because the Venezuelan bolívar is pegged to the dollar: if black market rates were applied, Caracas would comfortably become the world's cheapest city in which to live.
2. Who is the intended audience? What is its intended goal or purpose?
The intended audience is The Economist's web and print readers, who tend to follow world affairs and likely have an idea of the conditions in each city (or at least each city's region). Based upon its most apparent visual message and the accompanying text, the primary goal of the visualization is to highlight the cities that are the most expensive to live in. A secondary goal appears to be providing insight into which cities have moved up or down in the rankings based upon changes in their relative cost of living (normalized against New York for each of three years).
3. What information does this visualization represent?
The index reflected in the visualization was calculated by surveying the prices of 400 items in each city, dividing each city's total by New York's total for the same year, and multiplying by 100; a value of 130 means that the cost of living is 30% higher than in New York for that year. The values for three separate years (2003, 2008, and 2013) are shown, each normalized against New York within its own year but not across years. The visualization also shows each city's rank out of the 131 cities surveyed.
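The normalization described above can be sketched in a few lines of Python. The basket totals below are made-up illustrative numbers, not EIU data, and the function name cost_of_living_index is hypothetical. The sketch also assumes a simple unweighted total, as this answer describes, whereas the article's own description mentions a weighted average.

```python
def cost_of_living_index(city_total, new_york_total):
    """Return a city's index with New York's total for the same year set to 100."""
    return city_total / new_york_total * 100


# Hypothetical basket totals for one survey year (illustrative only, not EIU data).
totals = {"New York": 2000.0, "City A": 2600.0, "City B": 1500.0}

index = {
    city: cost_of_living_index(total, totals["New York"])
    for city, total in totals.items()
}

# An index of 130 reads as "30% more expensive than New York in that year";
# an index of 75 reads as "25% cheaper than New York in that year".
for city, value in index.items():
    print(f"{city}: {value:.1f}")
```

Because the index is a ratio to a common base, two cities can also be compared directly by dividing one index by the other.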
4. How many data dimensions does it encode? Are the encoding mappings appropriate?
The visualization encodes three dimensions (variables). The city name is a categorical (nominal) variable, represented appropriately by a standard text label and evenly spaced (and sorted) along the y-axis; its 2D position does not imply quantitative values. The cost of living index for each city is a quantitative (ratio) variable and is represented by 2D position in the horizontal direction. This treatment is appropriate since 2D position is a pre-attentive attribute that affords easy comparisons, especially vertically and horizontally. Finally, the year for each data point is a quantitative (interval) variable represented by color rather than by a more traditional 2D (horizontal) position. While the colors are distinct enough to be visually distinguished, they do not have inherent meaning (their association must be memorized), nor do they give a sense of the size of the interval as 2D position would.
5. List several tasks, comparisons or evaluations it enables.
The primary task that the visualization affords is the ability to compare relative cost of living between two cities. Comparing 2013 values is easiest due to the visualization sort order, but comparing cities during the other years is also relatively easy due to the strength of the 2D position encoding; a reader would find two cities, then compare the distance of the appropriately colored dot for each. Not only can the reader determine order between two cities, but they can determine the ratio between their values (e.g. "30% more expensive" rather than just "more expensive"), making this more powerful than simply an ordered list.
A reader can also determine how a city's cost of living has changed over time relative to that of New York by comparing the horizontal positioning of that city's own three dots. This is unintuitive, however, and only makes sense if the reader notices the legend.
6. What principles of excellence best describe why it is good?
The visualization does a good job of showing the data. Apart from the excessive gridlines, it does a fair job of maximizing the data-ink ratio and clearly showing each individual piece of data in a reasonable amount of space. It also provides a visual means for quantitative comparisons in a way that a table alone would not; the visualization therefore reveals meaning and insights in the data.
The visualization also successfully avoids distorting what the data have to say. None of the data points are misrepresented; in Tufte's terms, the graphic has a "lie factor" (Tufte 1983) of 1. A reader may not immediately notice that cities are omitted (in comparison to the full 131-city data set), but the inclusion of each city's rank indicates where cities are missing.
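For reference, Tufte's lie factor is the size of the effect shown in the graphic divided by the size of the effect in the data, where each effect is a relative change. The following is a minimal sketch with hypothetical index values and pixel positions; none of these numbers are measured from the Economist chart, and the function names are illustrative.

```python
def relative_change(first, second):
    """Size of an effect, expressed as the relative change from first to second."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Tufte's lie factor: effect shown in the graphic / effect in the data."""
    return relative_change(graphic_first, graphic_second) / relative_change(
        data_first, data_second
    )


# Hypothetical case: horizontal position is proportional to the index value,
# so index 100 is drawn at 200 px and index 130 at 260 px.
factor = lie_factor(200, 260, 100, 130)
print(factor)
```

When position is proportional to the value, as assumed here, the graphic and the data show the same relative change and the factor comes out at 1.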
It also encourages the eye to compare different pieces of data. The alignment and ordering of the cities invites the user to follow the data points down the visualization, especially the 2013 data points, in order to make comparisons. Within a single city, it is also easy to compare the three years to one another because of their 2D position.
7. List at least three strengths and three weaknesses of the visualization.
One strength of this visualization is the ease with which it allows comparisons of single years across cities. Once the reader chooses a color (year) to focus on, they can easily scan down the visualization and compare values due to the effective encoding as 2D position. Another strength is that it doesn't use any 3D or perspective effects that introduce data distortions; it is clear that the creators opted for simplicity over flashiness. Also, while its use of color to encode survey year is misguided in my opinion, it is at least effective in the sense that the colors are distinct and can be visually distinguished with ease; it's easy to mentally attend to the blue dots to follow the path of the 2008 data.
Despite its strengths, the graph has multiple faults. The encoding of time as color instead of the traditional positioning along a horizontal axis is suboptimal, especially given that time has an inherent linearity. This encoding is not obvious to the reader, who must understand the legend before understanding the data. A second weakness, which follows from the encoding of time, is the difficulty a reader would have in comparing the trends of multiple cities; at a glance, it's difficult to determine which cities continuously decreased or increased, or which cities increased and then decreased, for example. A third weakness is the frequency of the gridlines, particularly the vertical gridlines; many could be removed while still preserving the ability to compare cities. Finally, while not a distortion, it's worth noting that each year is normalized against New York for that same year. A flat trend for a city might be misinterpreted as stagnation (i.e. neither inflation nor deflation), but it simply means that the city's costs changed at the same rate as New York's; determining the absolute change is impossible without insight into New York's absolute changes.
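The normalization caveat can be made concrete with a small sketch. The absolute basket totals below are hypothetical (the chart does not publish them); they are chosen so that a city's costs rise by 50% over the period while its index stays flat.

```python
# Hypothetical absolute basket totals by survey year (illustrative only, not EIU data).
new_york = {2003: 1000.0, 2008: 1200.0, 2013: 1500.0}
city_a = {2003: 800.0, 2008: 960.0, 2013: 1200.0}

# Index for each year, normalized against New York's total for that same year.
index_by_year = {
    year: city_a[year] / new_york[year] * 100 for year in sorted(new_york)
}

# Absolute growth of the city's own basket over the whole period.
absolute_growth = city_a[2013] / city_a[2003] - 1

for year, value in index_by_year.items():
    print(f"{year}: index {value:.1f}")
print(f"Absolute change 2003 to 2013: {absolute_growth:.0%}")
```

The index reads 80 in every year even though the city's basket became 50% more expensive, because New York's basket rose by the same proportion.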
8. Does the visualization serve its intended purpose, in your opinion?
The visualization does serve its intended purpose of showing an ordered comparison of the most expensive and inexpensive cities in 2013. It also accomplishes the goal of allowing readers to understand how each city changed across survey years, though not as effectively or elegantly as another design might allow.
9. Can you suggest any improvements?
While the visualization is strong at allowing comparisons of single years across multiple cities, it fails to enable easy visualization of each city's relative changes over time. The encoding of time as a color rather than a traditional horizontal axis is unexpected and non-intuitive, and doesn't support reading the data as a linear story. A visualization that uses small multiples (Tufte 1990) would preserve the 2D position encoding of cost of living index while allowing time to also be encoded via 2D position. This approach would also better leverage the brain's ability to chunk information, as the entire trend can be stored as a single chunk (Few 2009).
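As a rough, text-only stand-in for a plotted version, the small-multiples idea can be sketched as one tiny panel per city, with the three survey years running left to right and a shared vertical scale. The index values are hypothetical, and the helper name sparkline is illustrative.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values, low, high):
    """Map each value to one of eight block heights on a shared [low, high] scale."""
    span = (high - low) or 1  # avoid division by zero when all values are equal
    return "".join(
        BLOCKS[min(int((v - low) / span * len(BLOCKS)), len(BLOCKS) - 1)]
        for v in values
    )


# Hypothetical index values for 2003, 2008 and 2013 (illustrative only, not EIU data).
cities = {
    "City A": [90, 110, 130],
    "City B": [120, 100, 80],
    "City C": [70, 125, 95],
}

# One shared scale across all panels so heights are comparable between cities.
low = min(min(values) for values in cities.values())
high = max(max(values) for values in cities.values())

panels = {name: sparkline(values, low, high) for name, values in cities.items()}
for name, panel in panels.items():
    print(f"{name}  {panel}")
```

Each panel can be read as a single shape (rising, falling, or peaking in 2008), which is the kind of at-a-glance trend comparison the color-coded dots make difficult.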
10. Why do you like this visualization?
I like the visualization because it still ultimately accomplishes its goal of illustrating relative cost of living across several cities, a task made even easier by its ordering. I also found it very interesting to be able to see the changes in each city over time and relate the trends to their regions' recent history.
"Of Price and Place." The Economist. The Economist Newspaper, 04 Mar. 2014. Web. 14 Apr. 2014.
Few, Stephen. Now You See It: Simple Visualization Techniques for Quantitative Analysis. Oakland, CA: Analytics Press, 2009. Print.
Tufte, Edward R. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 1983. Print.
Tufte, Edward R. Envisioning Information. Cheshire, CT: Graphics Press, 1990. Print.