Wikipedia:Size comparisons

This article attempts to compare the size of Wikipedia with other encyclopedias and information collections.

See Wikipedia:Size of Wikipedia and Wikipedia:Statistics for estimates of Wikipedia's article count and article statistics, from which the following snapshot was taken:

Snapshots of Wikipedia's size:

  • As of September 2002, Wikipedia had approximately 42,000 "articles", using very crude criteria for what constitutes an article. Of those, perhaps half were "encyclopedia size" articles. The mean article size was about 1,997 bytes, or roughly 332 words; the median article size was smaller, at about 980 bytes, or roughly 163 words. Multiplying the mean article size by the article count gives a very approximate total of 83.9 megabytes, or 14 million words. (Beware: this combines different measures taken on different days in September, and assumes the average word to be 5 letters plus a space.)
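The arithmetic behind this snapshot can be sketched as follows. This is only an illustration: the 6 bytes per word comes from the assumption stated above (5 letters plus a space), and the function and variable names are invented for the example.

```python
# Assumption from the text: an average word is 5 letters plus one space.
BYTES_PER_WORD = 6


def words_from_bytes(n_bytes: float) -> float:
    """Convert a byte count into an approximate word count."""
    return n_bytes / BYTES_PER_WORD


article_count = 42_000       # September 2002, very crude article criteria
mean_article_bytes = 1997    # mean article size
median_article_bytes = 980   # median article size

# Total size = number of articles x mean article size.
total_bytes = article_count * mean_article_bytes
total_words = words_from_bytes(total_bytes)

print(f"total size:   {total_bytes / 1e6:.1f} MB")
print(f"total words:  {total_words / 1e6:.1f} million")
print(f"mean words:   {words_from_bytes(mean_article_bytes):.1f}")
print(f"median words: {words_from_bytes(median_article_bytes):.1f}")
```

The mean works out to about 332.8 words per article, which the text rounds down to 332.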

So, by estimated word count as of September 2002, Wikipedia is about a quarter of the size of Britannica 2002, and by "encyclopedia-adjusted" article count it is also about a quarter of the size. Measured by raw topic count, however, Wikipedia already has about half as many topics as Britannica.
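As a rough check, the three ratios can be reproduced from the figures quoted on this page. The Britannica numbers are its advertised claims, the 0.5 share stands in for "perhaps half", and all variable names are chosen for this sketch only.

```python
# Wikipedia, September 2002 (estimates from the snapshot above).
wikipedia_words = 14e6
wikipedia_articles = 42_000
encyclopedia_size_share = 0.5   # "perhaps half" are encyclopedia-size articles

# Britannica 2002 (claimed figures).
britannica_words = 55e6
britannica_articles = 85_000

word_ratio = wikipedia_words / britannica_words
adjusted_ratio = (wikipedia_articles * encyclopedia_size_share) / britannica_articles
raw_ratio = wikipedia_articles / britannica_articles

print(f"by word count:             {word_ratio:.0%}")
print(f"by adjusted article count: {adjusted_ratio:.0%}")
print(f"by raw topic count:        {raw_ratio:.0%}")
```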

Update: as of early March 2003, Wikipedia has roughly 108,000 articles. Of those, perhaps 36,000 are "data dumped" gazetteer entries about towns and cities in the USA. Ignoring these for the moment, and assuming that the mean article size is still the same, this leaves approximately 72,000 non-gazetteer articles of an estimated average of 332 words each, or 23.9 million words in total. That is a little under half the size of the Encyclopædia Britannica's 2002 edition (about 43% of its claimed 55 million words).
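The March 2003 estimate follows the same pattern. A minimal sketch, assuming (as the text does) that the mean article length is unchanged at 332 words; the variable names are illustrative.

```python
total_articles = 108_000        # early March 2003
gazetteer_articles = 36_000     # "data dumped" US town and city entries
mean_words_per_article = 332    # assumed unchanged since September 2002
britannica_words = 55e6         # claimed word count, 2002 edition

non_gazetteer_articles = total_articles - gazetteer_articles
estimated_words = non_gazetteer_articles * mean_words_per_article
share_of_britannica = estimated_words / britannica_words

print(f"non-gazetteer articles: {non_gazetteer_articles:,}")
print(f"estimated words:        {estimated_words / 1e6:.1f} million")
print(f"share of Britannica:    {share_of_britannica:.0%}")
```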

Not bad for an encyclopedia that is only two years old. But we must do better! Many of the articles are still of poor quality. As Wikipedia grows more comprehensive, effort is expected to shift towards improving the quality and scope of existing articles rather than creating new ones. It is also anticipated that Wikipedia may grow to include a global gazetteer as part of its function.

See Wikipedia:Modelling Wikipedia's growth for more educated guesses about the potential growth of Wikipedia.

Comparison figures:

  • The advertisements for Encyclopædia Britannica's 2002 edition proudly proclaim that it has over 85,000 articles. A claimed word count of 55 million words, at an assumed average of 5 letters per word plus a space, gives an estimated character count of 330 million characters, or a crudely estimated mean article length of 3,882 characters.
  • The Columbia Encyclopedia, Sixth Edition, is cited as having 51,000 articles and 6.5 million words. Assuming an average word length of five characters, plus one space character per word, this gives a mean article length of very roughly 765 characters for the Columbia Encyclopedia.
  • Microsoft's Encarta Encyclopedia 2002 is cited as having 26 million words.
  • Microsoft Encarta Deluxe 2002 is cited as having "over 60,000 articles, 10,000 historical archives, and over 40 million words".
  • Grolier Multimedia Encyclopedia Online claims 11 million words and 39,200 articles.
  • American Jurisprudence, 1st ed., is an 83-volume collection of American common law; the 2nd ed. runs to 231 volumes!
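The mean article lengths quoted above come from one small formula: words times 6 characters, divided by the article count. The sketch below applies it to the entries that give both figures. The Grolier result is not stated in the list and is simply the same crude assumption applied to its claimed numbers; the function name and dictionary are illustrative.

```python
# Assumption from the text: 5 letters per word plus one space.
CHARS_PER_WORD = 6


def mean_article_chars(words: float, articles: int) -> float:
    """Crude mean article length in characters."""
    return words * CHARS_PER_WORD / articles


# (claimed word count, claimed article count)
encyclopedias = {
    "Britannica 2002": (55e6, 85_000),
    "Columbia, Sixth Edition": (6.5e6, 51_000),
    "Grolier Multimedia Online": (11e6, 39_200),
}

for name, (words, articles) in encyclopedias.items():
    length = mean_article_chars(words, articles)
    print(f"{name}: about {length:.0f} characters per article")
```

The Encarta entries are left out because they give only lower bounds ("over") or no article count.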

How many things are there to describe?

Sizes of other non-encyclopedia information collections, for comparison. Note that Wikipedia is neither a dictionary, nor a web index: these figures are just for order-of-magnitude comparison.



All Wikipedia text is available under the terms of the GNU Free Documentation License

 