
Cartograms

July 14, 2016

They say, "a picture is worth a thousand words." As a scientist, I need to scan many journal articles for relevance, and the best way to do this is by looking at the included graphs and other images. This is especially useful for papers not written in English, since my limited foreign language skills are not overly taxed. That's why it's important for scientists to render their images clearly.

In our computer age, information graphics (infographics) and scientific visualization have become research areas in themselves, but they have a long history. When discussing graphical representation of data, it's traditional to mention Charles Joseph Minard's famous cartograph, a 62 x 30 centimeter lithographic depiction of Napoleon's 1812 Russian campaign that shows troop numbers, troop movement, and temperature (see figure).

Charles Joseph Minard's cartograph of Napoleon's Russian campaign, 1812. The primary information conveyed, the continued troop loss during both the advance and the retreat, is easily seen. The temperature is given on the Réaumur scale, in which the freezing point of water is 0 degrees and its boiling point is 80 degrees. (Via Wikimedia Commons.)
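Since the Réaumur scale puts freezing at 0 degrees and boiling at 80 degrees, converting a reading from the chart to Celsius is a single multiplication. Here is a minimal Python sketch of that arithmetic; the function names are my own, and the sample readings are illustrative rather than values read from the figure.

```python
def reaumur_to_celsius(deg_re: float) -> float:
    """Convert degrees Reaumur to degrees Celsius.

    Both scales put freezing at 0, and 80 Reaumur equals 100 Celsius,
    so one Reaumur degree spans 100/80 = 1.25 Celsius degrees.
    """
    return deg_re * 100.0 / 80.0


def reaumur_to_fahrenheit(deg_re: float) -> float:
    """Convert degrees Reaumur to Fahrenheit (32 at freezing, 212 at boiling)."""
    return deg_re * 180.0 / 80.0 + 32.0


if __name__ == "__main__":
    # Illustrative readings only, not values taken from Minard's chart.
    for reading in (0.0, -10.0, -30.0):
        celsius = reaumur_to_celsius(reading)
        print(f"{reading:6.1f} Re = {celsius:6.1f} C")
```

A reading of -30 degrees Réaumur, for example, works out to -37.5 degrees Celsius.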

While in elementary school, my daughter had an assignment to produce charts and graphs of data. Having a scientist father was a burden most of the time, but it helped her in creating a novel bar chart of the numbers of bathroom fixtures in the United States. Instead of bars, she produced rows of sinks, toilets, and bathtubs whose lengths matched their numbers. She was able to do this using a primitive Photoshop-style program that ran on the Windows operating system of that time. This technique, as applied to a chart of membership in the major science and engineering professional associations, is shown below.

Membership in science and engineering societies
Membership in the four major science and engineering professional associations, the American Physical Society (APS), the Federation of American Societies for Experimental Biology (FASEB), the American Chemical Society (ACS), and the Institute of Electrical and Electronics Engineers (IEEE). I'm a member of the APS and the IEEE. (Created using Inkscape.)

Popular among bloggers is the word cloud, a representation of the frequency of keyword use in which the words themselves are sized according to how often they appear (see figure). While there are several online word cloud generators, R, the free and open-source statistical programming language, has a toolset that allows detailed development of word cloud representations.
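The sizing step behind a word cloud is simple enough to sketch. The following Python fragment is only an illustration, not the R toolset mentioned above or the method of any particular generator. The function name word_sizes, the linear mapping from count to font size, and the 10 to 48 point range are all assumptions.

```python
import re
from collections import Counter


def word_sizes(text, min_pt=10.0, max_pt=48.0, stopwords=frozenset()):
    """Map each word in `text` to a font size that grows linearly with its count."""
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in stopwords]
    counts = Counter(words)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for word, n in counts.items():
        # If every word is equally frequent, give them all the largest size.
        frac = (n - lowest) / span if span else 1.0
        sizes[word] = min_pt + frac * (max_pt - min_pt)
    return sizes


if __name__ == "__main__":
    sample = "web web web blog blog wiki"
    for word, size in sorted(word_sizes(sample).items()):
        print(f"{word}: {size:.0f} pt")
```

Actual generators also need to pack the words on the page without collisions, which is the harder part and is not attempted here.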

A word cloud of Web 2.0 buzzwords. (Modified Wikimedia Commons image.)

One of the more interesting data representations is the cartogram, which is typically a map in which the map elements have been distorted so that their areas scale with a variable, such as population, income, or carbon emissions. A simple cartogram of population is shown in the figure. A review article on cartograms has been posted to arXiv by doctoral candidate Sabrina Nusrat and professor Stephen Kobourov of the University of Arizona Department of Computer Science.[1]

Cartogram of the US population in 2010 (US Census data)
A simple cartogram of the US population in 2010, as derived from US Census data. The huge size of New Jersey demonstrates the area scaling of such maps. (Via US Census Web Site.)
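The area scaling can be put in numbers. If each region's area on the cartogram is to be proportional to its share of the variable, then its linear dimensions change by the square root of the ratio of its new area to its original area. The Python sketch below computes that factor under the assumption that the total map area is held fixed; the function name and the two toy regions are invented for illustration and are not census figures.

```python
import math


def linear_scale_factors(areas, values):
    """Return the factor by which each region's lengths must grow or shrink.

    Each region's target area is its share of the total value times the
    total map area, so the overall map size is preserved. Lengths scale
    as the square root of the area ratio.
    """
    total_area = sum(areas.values())
    total_value = sum(values.values())
    factors = {}
    for name, area in areas.items():
        target_area = total_area * values[name] / total_value
        factors[name] = math.sqrt(target_area / area)
    return factors


if __name__ == "__main__":
    # Toy numbers: a small, populous region and a large, sparse one.
    areas = {"small": 100.0, "large": 400.0}
    people = {"small": 4.0, "large": 1.0}
    for name, factor in linear_scale_factors(areas, people).items():
        print(f"{name}: lengths scaled by {factor:.2f}")
```

This gives only the target sizes. Deforming the actual boundaries so that neighboring regions still fit together is the difficult part of a contiguous cartogram.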

As can be imagined, area cartograms can be created in a panoply of pleasing forms. Some attempt to reproduce the spatial arrangement of the original map while distorting its elements to scale with the variable of interest; others just place shapes of appropriate area on the plane. In the following example from Nusrat and Kobourov, circles of appropriate area have been placed at the relative locations of the US states to indicate the concentration of Starbucks and McDonald's locations.

Concentration of Starbucks and McDonald's locations in the United States. The circle size is indicative of the number of Starbucks coffee shops, while the intensity of the shading indicates the number of McDonald's restaurants. The correlation is easily seen through this representation. (Figure 16 from ref. 2, used with permission.)[1-2]

Nusrat and Kobourov list quite a few cartogram techniques and historical examples in their paper.[1-2] The example in the figure above is a Dorling-type cartogram, and it inspired me to attempt a similar cartogram program. My C language program (source code, in my typical amateur coding style, can be found here) accepts as input a CSV file of states and some data attribute of each state, and it outputs an SVG image, as shown below.
A Dorling-type cartogram of the number of Representatives of each state. The radius of each circle, not its area, is scaled to the data. (Author's program.)
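The CSV-to-SVG idea can be sketched in a few lines. What follows is a Python illustration and not the C program itself. The function name, the three hypothetical map positions, the headerless two-column CSV layout, and the by_area switch are all assumptions. With by_area left off, the radius is scaled with the data, as in the figure; turning it on scales the area instead, so the radius goes as the square root of the value.

```python
import csv
import io
import math

# Hypothetical SVG pixel positions for three states; a full program would
# need a coordinate for every state that appears in the input file.
POSITIONS = {"CA": (60, 170), "TX": (240, 250), "NY": (420, 90)}


def cartogram_svg(csv_text, max_radius=40.0, by_area=False):
    """Return an SVG string with one circle per `state,value` CSV row."""
    rows = [(row[0].strip(), float(row[1]))
            for row in csv.reader(io.StringIO(csv_text)) if row]
    largest = max(value for _, value in rows)
    parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="500" height="320">']
    for state, value in rows:
        x, y = POSITIONS[state]
        share = value / largest
        # Radius scaling by default; area scaling takes the square root.
        radius = max_radius * (math.sqrt(share) if by_area else share)
        parts.append(f'<circle cx="{x}" cy="{y}" r="{radius:.1f}" '
                     'fill="steelblue" fill-opacity="0.5"/>')
        parts.append(f'<text x="{x}" y="{y}" text-anchor="middle">{state}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


if __name__ == "__main__":
    # House seats for three states under the 2010 apportionment.
    print(cartogram_svg("CA,53\nTX,36\nNY,27\n"))
```

Radius scaling exaggerates differences, since a state with twice the value gets a circle of four times the area. Area scaling is the more common choice when the eye is meant to compare circle sizes directly.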

Unlike a true Dorling cartogram, this program overlaps circles, thereby preserving the map shape. While this might be a valid reason, the actual reason is that this was easier to code. Some example data files are the number of representatives in the House of Representatives, and population. A skilled programmer should be able to modify the program to give a true Dorling representation, or experiment with other shapes.
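As a starting point for removing the overlaps, here is a rough Python sketch of a pairwise push-apart relaxation. It is not Dorling's published algorithm and not part of the C program. It only separates overlapping circles, with no attempt to keep neighboring states adjacent, and the function name, the list-of-lists data layout, and the iteration limit are assumptions.

```python
import math


def push_apart(circles, iterations=200):
    """Nudge overlapping circles apart, in place.

    `circles` is a list of [x, y, r] lists. Each overlapping pair is moved
    apart along the line joining the centers, half the overlap each.
    Returns True once a full pass finds no overlaps, or False if the
    iteration limit is reached first.
    """
    for _ in range(iterations):
        moved = False
        for i in range(len(circles)):
            for j in range(i + 1, len(circles)):
                xi, yi, ri = circles[i]
                xj, yj, rj = circles[j]
                dx, dy = xj - xi, yj - yi
                dist = math.hypot(dx, dy)
                overlap = ri + rj - dist
                if overlap > 1e-9:
                    if dist == 0.0:
                        # Coincident centers: pick an arbitrary direction.
                        dx, dy, dist = 1.0, 0.0, 1.0
                    ux, uy = dx / dist, dy / dist
                    shift = overlap / 2.0
                    circles[i][0] -= ux * shift
                    circles[i][1] -= uy * shift
                    circles[j][0] += ux * shift
                    circles[j][1] += uy * shift
                    moved = True
        if not moved:
            return True
    return False


if __name__ == "__main__":
    pair = [[0.0, 0.0, 10.0], [5.0, 0.0, 10.0]]
    print(push_apart(pair), pair)
```

With many circles, separating one pair can create a new overlap elsewhere, which is why the passes repeat. The circles also drift away from their map positions, so the result trades map shape for legibility.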

References:

  1. Sabrina Nusrat and Stephen Kobourov, "The State of the Art in Cartograms," EuroVis 2016 - the 18th EG/VGTC Conference on Visualization, Groningen, the Netherlands (June 6-10, 2016). To appear, Computer Graphics Forum, vol. 35, no. 3 (2016).
  2. Sabrina Nusrat and Stephen Kobourov, "The State of the Art in Cartograms," arXiv, May 30, 2016.
  3. Some interesting maps, including a few cartograms, can be seen at the David Rumsey Map Collection, Cartography Associates.