Text Topography
May 16, 2022
Most of you likely read the title of this article as "Text Typography." Typography is choosing fonts and arranging their displayed characters to make writing legible, readable and aesthetic. After choosing Comic Sans as your font, you strive to make your PowerPoint slides into works of art by adjusting word placement, line lengths, indentation, and line-spacing. However, the topic of this article is topography, the study of surface features, usually applied to land surfaces; i.e., the lay of the land.
This is not a pipe, but the font is Comic Sans. Comic Sans is a typeface released by Microsoft in 1994 and distributed with Windows since Windows 95. It was inspired by comic book lettering, but its frequent use for formal communication has been widely ridiculed. When the Higgs boson was discovered at CERN in July 2012, one of the presentations of Higgs data used Comic Sans.[1] CERN must have liked the publicity, since it announced on April 1, 2014 (April Fools' Day), that it would use the Comic Sans typeface in all its publications.[2] (Modified Wikimedia Commons image by Torsten Bätge.)
In the dark days BC (Before Computers) I would enjoy exploring the older archives at our public library. Many of the books I would sample were large format tomes with ornate leather binding that screamed, "This stuff is important, since we spared no expense in its presentation." It was there that I discovered that viewing book pages obliquely revealed rivers of whitespace, the blank area between words, that flowed down the page. This effect was likely accentuated by the somewhat larger spacing between words that justification demands when lines are stretched to fill the width of a page.
I've carried this idea with me through five decades, a period in which computers and computer graphics have advanced considerably. It's come to the point at which generating aesthetic images based on this idea of text topography has become very easy. The term, computer art, covers a wide range of topics, so a better term for text topography art might be low-complexity art. Naming these abstract art images is easy, since they're based on particular texts, and the title or theme of the texts supplies the name.
Generating such art is aided by the fact that files of public domain classics exist on the Internet in places such as the Internet Archive and Project Gutenberg. Such free and open resources should be encouraged in an Internet age in which everyone seeks to perpetually monetize their intellectual property, and I've donated to both of these organizations.
Johannes Gutenberg (c.1400-1468) and Tim Berners-Lee (b. 1955). Although not the inventor of movable type, Gutenberg was the first European to use it, thereby creating a way to make books less expensive and available to more people. Berners-Lee, a computer scientist at CERN, was the originator of the Internet's World Wide Web, thereby making vast quantities of information instantly available. (Left, a Wikimedia Commons image of Johannes Gutenberg. Right, portion of a Wikimedia Commons image by Paul Clarke of Tim Berners-Lee at the November, 2014, Open Data Institute Summit.)
I created a PHP program to transform text files into images, and these images can be subsequently modified by an image manipulation program, such as the GNU Image Manipulation Program, to add color and other effects. A zip file containing the PHP source code and some example text files can be found here. The input text file should be ASCII text, but a Linux utility can convert the common UTF-8 files to ASCII, as follows:
uni2ascii -e input.txt >output.txt
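For readers without uni2ascii, a rough stand-in can be sketched in Python. This is only an approximation under stated assumptions: the function name to_ascii, the small punctuation table, and the "?" placeholder are all hypothetical, and the output will not match uni2ascii in every case.

```python
import unicodedata


def to_ascii(text: str, placeholder: str = "?") -> str:
    """Fold text to 7-bit ASCII (a sketch, not a uni2ascii replacement)."""
    # Hypothetical table for punctuation that is common in e-texts.
    punctuation = {
        "\u2018": "'", "\u2019": "'",    # curly single quotes
        "\u201c": '"', "\u201d": '"',    # curly double quotes
        "\u2013": "-", "\u2014": "--",   # en dash and em dash
        "\u2026": "...",                 # horizontal ellipsis
    }
    out = []
    for ch in text:
        if ch in punctuation:
            out.append(punctuation[ch])
            continue
        # NFKD splits an accented letter into a base letter plus
        # combining marks; the marks are dropped, the base is kept.
        for part in unicodedata.normalize("NFKD", ch):
            if unicodedata.combining(part):
                continue
            # Anything still outside ASCII becomes the placeholder.
            out.append(part if ord(part) < 128 else placeholder)
    return "".join(out)
```

Characters with no ASCII decomposition, such as "æ", come out as the placeholder, which makes them easy to find and edit by hand.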
The input text file is read as a long string, and regular expressions are used to replace whitespace and letter characters with ones and zeros to create an image string. There are checks along the way to ensure that the regular expression patterns haven't missed anything. An output image file is opened, and a header is written for a portable bitmap format (*.pbm) file, a type of file that's just black and white pixels represented by ones and zeros. The vertical axis is stretched by a factor of two to give a more aesthetic image, and the *.pbm file is written. The file is crude at this point, as the following image illustrates.
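The steps just described can be sketched in a few lines of Python. This is not the PHP program itself, just a minimal illustration under some assumptions of mine: the function name text_to_pbm is hypothetical, every visible character becomes a black pixel (1) and every whitespace character a white pixel (0), short lines are padded with white, and the plain-text (P1) variant of the PBM format is used.

```python
import re


def text_to_pbm(text: str, stretch: int = 2) -> str:
    """Return a plain (P1) PBM image: visible characters 1, whitespace 0."""
    lines = [line.expandtabs() for line in text.splitlines()]
    width = max((len(line) for line in lines), default=0)
    if width == 0:
        raise ValueError("no text to render")
    rows = []
    for line in lines:
        padded = line.ljust(width)           # short lines end in white pixels
        bits = re.sub(r"\S", "1", padded)    # any visible character -> black
        bits = re.sub(r"\s", "0", bits)      # any whitespace -> white
        # Sanity check that the patterns have not missed anything.
        if not re.fullmatch(r"[01]+", bits):
            raise ValueError(f"unexpected character in row: {bits!r}")
        rows.extend([bits] * stretch)        # repeat each row to stretch vertically
    header = f"P1\n{width} {len(rows)}\n"
    # The PBM specification asks for lines of at most 70 characters.
    # Whitespace in the raster is ignored, so long rows can be wrapped.
    body = "\n".join(
        row[i:i + 70] for row in rows for i in range(0, len(row), 70)
    )
    return header + body + "\n"
```

Writing the returned string to a file with a .pbm extension should give an image that most image editors can open. Since 1 means black in PBM, the rivers of whitespace appear as white channels running through black text.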
Portion of a *.pbm image created by the text topography program. This basic image is subsequently processed to add color and other image effects to produce a finished artwork. One cause of an invalid *.pbm file is writing characters other than zeros and ones. My text topography program tries to prevent such errors, but some texts (Moby Dick) needed manual editing to ensure ASCII encoding before they would work. The program flags most errors.
Left, image processed output for Pliny's Natural History, book XXXVII, chapters 1-6, in English translation.[3] Right, image processed and rotated output for Homer's Iliad, book I, in English translation.[4] Edge detection was used to create this effect.
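The caption above does not say which edge-detection filter was applied, and it was presumably done in an image manipulation program rather than in code. As a rough illustration of the idea only, here is a very simple edge detector for a one-bit image held as a list of strings of "0" and "1". The function name edge_detect and the 4-neighbour rule are my assumptions, not the filter used for the images.

```python
def edge_detect(rows: list[str]) -> list[str]:
    """Keep only the black pixels that touch a white pixel."""
    height = len(rows)
    width = len(rows[0]) if rows else 0

    def pixel(r: int, c: int) -> str:
        # Pixels outside the image are treated as white (0).
        if 0 <= r < height and 0 <= c < width:
            return rows[r][c]
        return "0"

    out = []
    for r in range(height):
        line = []
        for c in range(width):
            # Look at the pixels above, below, left, and right.
            neighbours = (
                pixel(r - 1, c), pixel(r + 1, c),
                pixel(r, c - 1), pixel(r, c + 1),
            )
            is_edge = rows[r][c] == "1" and "0" in neighbours
            line.append("1" if is_edge else "0")
        out.append("".join(line))
    return out
```

Applied to a solid block of black pixels, this leaves only the outline of the block, so blocks of text become hollow shapes.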
References:
1. Patrick Kingsley, "Higgs boson and Comic Sans: the perfect fusion," The Guardian, July 4, 2012.
2. Cian O'Luanaigh, "CERN to switch to Comic Sans," CERN Website, April 1, 2014.
3. Pliny the Elder, "The Natural History," John Bostock, Trans., Taylor and Francis (London: 1855), via the Tufts University Perseus Digital Library Project.
4. Homer, "Iliad," Samuel Butler, Trans., via The Internet Classics Archive.