
ArXiv at Twenty

August 24, 2011

The science and mathematics preprint server, arXiv, celebrated its twentieth anniversary on August 14, 2011.[1-2] It was started as an email reflector by Paul Ginsparg at Los Alamos National Laboratory to disseminate preprints of papers in high energy physics. The original server was Ginsparg's 25 MHz NeXTstation, which had, for the time, a phenomenal 105 megabytes of hard drive data storage.[2]

As any scientist knows, publication of research in peer-reviewed journals is a long, tedious process. A cornerstone of science is that your research must pass muster with your scientific peers. Journal editors need to send your manuscript for review by others in your field, and these peer reviewers decide which weak points in your manuscript need to be strengthened, or whether your paper should be rejected.

Often, these rejections are not because of some fundamental error. They are merely because the particular journal is too prestigious for your somewhat ordinary paper. Most rejected papers are published in other journals that are lower in the prestige pecking order. I was fortunate early in my career to publish in one of the highly regarded physics journals,[3] but now that I'm retired, I really don't care where I'm published, as long as the paper is accessible to any who are interested. This blog might be the prime example of my philosophy.

Preprints serve the purpose of claiming priority in a research area long before the research is published. Sure, there's a little egotism involved, but it actually eliminates wasted effort by others who might contemplate doing exactly what you're doing. It also allows for peer review outside the two or three people chosen by the journal editor, and this could prevent considerable embarrassment when a published paper is contradicted by other evidence. I published a note that demolished a theory proposed in one such published paper. I would have happily just contacted the author before his publication if I had known about his work.[4]

Preprints were especially important in high energy physics. Participants in this field feed from the same data sources, so duplication of effort would be considerable if preprints didn't give notice of who is doing what. Preprints in high energy physics existed long before Ginsparg's email reflector. Before the Internet, preprints were sent as photocopies in the mail. Perhaps one reason for the near-demise of the US Postal Service is the removal of all these weighty tomes from the mail stream.

Ginsparg was one of the recipients of the MacArthur Foundation "Genius Grants" in 2002. I wrote about the 2010 Genius Class in a previous article (MacArthur Fellows 2010, October 5, 2010). The MacArthur Foundation prefers the term "MacArthur Fellow" to "Genius." In any case, the MacArthur Fellows receive an unrestricted grant of half a million dollars. That might not be a lot of money for a CEO, but it's a lot of money for a physicist.


Beards required? Paul Ginsparg (left) and Richard Stallman (right). Richard Stallman was a 1990 MacArthur Fellow. (Images via Wikimedia Commons.) In the interest of full disclosure, I had a beard for one year when I was an undergraduate. Perhaps I shouldn't have shaved!


Ginsparg's email reflector evolved into an FTP server within a few months. He had estimated a potential database of about a hundred submissions per year, but there were 400 preprints posted in the first half year.[2] The original Internet address was a host name under the Los Alamos National Laboratory domain (xxx.lanl.gov), but arXiv got its arXiv.org domain name in 1998.[2] Of course, letter case is irrelevant in domain names, but the X is capitalized, both as a reminder that Don Knuth's TeX program was an integral part of the presentation of readable papers, and as a remnant of the original xxx.LANL.gov domain.[2] Archive.org had already been taken.
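The remark about letter case can be checked with a few lines of Python. This is only a minimal sketch; the helper name `same_domain` is my own invention, and it assumes plain ASCII host names, which is all that's needed here.

```python
def same_domain(a: str, b: str) -> bool:
    """Compare two ASCII domain names while ignoring letter case.

    A trailing dot (the fully qualified form) is also ignored.
    This helper is hypothetical; it is not part of any library.
    """
    return a.rstrip(".").lower() == b.rstrip(".").lower()


if __name__ == "__main__":
    # The capital X is decoration; both spellings name the same domain.
    print(same_domain("arXiv.org", "ARXIV.ORG"))
    # Archive.org is a different name altogether.
    print(same_domain("arXiv.org", "archive.org"))
```

So, the capital X is purely for the benefit of human readers.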

Ginsparg and ArXiv now reside at Cornell University, and Ginsparg will relinquish control of arXiv to Cornell's library staff in September.[1] ArXiv has had about 700,000 total submissions through mid-August, 2011. The breakdown by field of study can be seen in the figure.[1] ArXiv receives about 75,000 submissions per year, and it serves about a million full text downloads each week to about 400,000 distinct users. At least ten of those weekly downloads are to me.
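For those who like to see such numbers reduced to rates, here's a back-of-the-envelope sketch in Python. The input figures are the rounded values quoted above; the function name `usage_rates`, the 52-week and 365-day year, and the derived ratios are my own assumptions, so the results are rough estimates, not figures reported by Ginsparg.

```python
def usage_rates(submissions_per_year: float = 75_000,
                downloads_per_week: float = 1_000_000,
                users_per_week: float = 400_000) -> dict:
    """Derive rough usage rates from the rounded figures in the article.

    Assumes 52 weeks and 365 days per year (an approximation).
    """
    downloads_per_year = downloads_per_week * 52
    return {
        # Average full text downloads per distinct user each week
        "downloads_per_user_per_week": downloads_per_week / users_per_week,
        # Average number of new submissions per calendar day
        "submissions_per_day": submissions_per_year / 365,
        # Annual downloads divided by annual new submissions; this is a
        # ratio of two rates, not a count of downloads for any one paper
        "downloads_per_new_submission": downloads_per_year / submissions_per_year,
    }


if __name__ == "__main__":
    for name, value in usage_rates().items():
        print(f"{name}: {value:.1f}")
```

The average user downloads two or three papers a week, so my ten-a-week habit puts me somewhat above the mean.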


Total submissions to arXiv for various fields through mid-August, 2011. (Graph via Gnumeric).


Ginsparg notes that the frequency of "problem submissions" is less than one percent, and they're concentrated in the expected areas:
•  General Relativity
•  Quantum Mechanics
•  Unified Theories in Physics
•  Proofs of the Riemann Hypothesis
•  Proofs of Goldbach's Conjecture
•  New Proofs of Fermat's Last Theorem
•  Proofs of P ≡ NP

I need to wrap up this article, since I'm working on my next submission to arXiv. It describes how my proof of Riemann's hypothesis has facilitated a novel proof of Fermat's last theorem. However, I will leave you with this anecdote from Ginsparg.[2] In the days when preprints were still photocopied for distribution, he helped Hans Bethe clear a paper jam in a photocopier one weekend. So, Bethe and Ginsparg were both working on a weekend. I understand, now, why they're both famous.

References:

  1. Paul Ginsparg, "ArXiv at 20," Nature, vol. 476, no. 7359 (August 11, 2011), pp. 145-147.
  2. Paul Ginsparg, "It was twenty years ago today ...," arXiv Preprint Server, August 14, 2011.
  3. P. Duffer, D.M. Gualtieri, and V.U.S. Rao, "Pronounced Isotope Effect in the Superconductivity of HfV₂ Containing Hydrogen (Deuterium)," Phys. Rev. Lett., vol. 37, no. 21 (November 22, 1976), pp. 1410-1413.
  4. I met the author unexpectedly at a meeting many years later. He was a very likable fellow.

