"The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve. We should be grateful for it and hope that it will remain valid in future research and that it will extend, for better or for worse, to our pleasure, even though perhaps also to our bafflement, to wide branches of learning."[1]Many non-science disciplines have adopted some principles of mathematics, since they're so useful. Statistics, of course, is preeminent, but some researchers have applied other mathematical techniques that are common in the sciences. This is illustrated in a recent paper by Maurizio Serva of the Dipartimento di Matematica, Università dell'Aquila (L'Aquila, Italy) that attempts to uncover the degree of relationship between languages.[2] The ultimate objective of this paper, and similar papers on this topic, is the explication of language phylogeny; that is, the evolutionary path of language and development of a language "family tree." Comparative linguistics has been an active topic for many years, and students who have taken a foreign language course have noticed similarities between words in their native language and the one they're learning. Languages in particular language families, such as the Romance languages, have many similar words. The common examples are words relating to "mother;" for example, we have mater (mother) in Latin, and the corresponding maternal in English. Going farther back, it seems that the "ma" vocalization has meant mother since the dawn of humanity. The usual technique for language comparison is to examine the shared cognates between languages, something that I wrote about in relation to the Ugaritic language in a previous article (Ugaritic, July 16, 2010). Cognates are words in different languages that have a common etymological origin. 
Examples are the English "silver" and German "silber," or the Latin "argentum," French "argent," and Italian "argento."
This might look like mathematical notation, but it's actually the word, Inuktitut, in Canadian Aboriginal syllabics (via Wikimedia Commons).
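Since the word-to-word distance defined just below is the Levenshtein distance divided by the length of the longer word, it's easy to try out. Here's a minimal Python sketch of that calculation. The function names `levenshtein` and `normalized_distance` are my own, and treating two empty strings as having zero distance is an assumption made only to avoid dividing by zero; this is not code from Serva's paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def normalized_distance(w1: str, w2: str) -> float:
    """Levenshtein distance divided by the length of the longer word."""
    longest = max(len(w1), len(w2))
    if longest == 0:
        return 0.0  # assumption: two empty strings count as identical
    return levenshtein(w1, w2) / longest


if __name__ == "__main__":
    # Cognate pairs taken from the examples in the text.
    for pair in [("silver", "silber"), ("argentum", "argent"), ("argent", "argento")]:
        print(pair, round(normalized_distance(*pair), 3))
```

The cognates "silver" and "silber" differ by one substitution in six letters, so their distance is 1/6, while "argentum" and "argent" differ by two deletions in eight letters, giving 0.25. Identical words score zero, and words with nothing in common score one.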
The normalized lexical distance between two words is

$$d(\omega_1,\omega_2) = \frac{d_L(\omega_1,\omega_2)}{l(\omega_1,\omega_2)},$$

where $\omega_1$ and $\omega_2$ are the two words being compared, $d_L$ is the Levenshtein distance, and $l$ is the length of the longer of the two words. The normalized lexical distance takes values between zero and one.

So, what's Serva's method of language comparison? It's easily understood by computer and physical scientists. You construct a list of words with the same meaning in each language (Serva takes this number, M, to be a reasonable 200), and then do a normalized sum over all these words. For example, comparing languages α and β, the lexical distance D between these would be