Distribution of title lengths in Wikipedias

This site offers charts of the lengths of titles in the Wikipedias.

The first chart (titled *) shows the sum over all Wikipedias. The others are titled with the language code of the respective site (e.g. de is for de.wikipedia.org). Next to each title you see the number of titles and the 90th and 98th percentiles, i.e. the lengths below which 90% and 98% of all titles fall.
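To make the percentile figures concrete, here is a minimal sketch of how such a value could be computed from a list of title lengths. It uses the nearest-rank definition (the smallest length that at least the given share of titles do not exceed), which is an assumption; the code behind the charts may use a slightly different convention. The function name and the sample titles are made up for illustration.

```python
def length_percentile(lengths, pct):
    """Nearest-rank percentile: the smallest length L such that at least
    pct percent of the given lengths are <= L. pct is an integer in 1..100."""
    if not lengths:
        raise ValueError("need at least one title length")
    ordered = sorted(lengths)
    # Integer ceiling of pct * n / 100; avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical sample; the real charts are based on all titles of a Wikipedia.
sample_titles = ["Berlin", "Wikidata", "Douglas Adams", "Rhine",
                 "List of sovereign states"]
lengths = [len(title) for title in sample_titles]

print("titles:", len(lengths))
print("90th percentile:", length_percentile(lengths, 90))
print("98th percentile:", length_percentile(lengths, 98))
```

With only a handful of titles both percentiles collapse onto the longest title; the numbers become meaningful with the millions of titles in a real Wikipedia.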

We created these charts because they helped us with some design decisions in Wikidata. It is always nice to make design decisions based on data :)

There is also a version showing the length in bytes. It started out as an early bug, but I found the charts looked funny, so I kept them.
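The difference between the two versions comes down to counting characters versus counting encoded bytes. The following sketch illustrates this, assuming the titles are encoded as UTF-8 (an assumption about the byte version, not something stated here); the helper name is made up for illustration.

```python
def title_lengths(title):
    """Return (length in characters, length in UTF-8 bytes) for a title.
    Characters are counted as Unicode code points, which is what len() does."""
    return len(title), len(title.encode("utf-8"))


for title in ["Berlin", "Zürich", "東京"]:
    chars, size = title_lengths(title)
    print(f"{title}: {chars} characters, {size} bytes")
```

Titles in plain ASCII have the same length either way, while titles in non-Latin scripts take several bytes per character, so their byte-based distributions shift towards longer lengths.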

Acknowledgements

The code to create the page is available under an open source license. The charts are created using gRaphaël, a free JavaScript graphing library written by Dmitry Baranovskiy. The charts are based on data from the Wikimedia dumps provided by the Wikimedia Foundation.

Links

Top ten languages
Full list of languages (takes a while to load and runs a long-running JavaScript)
Length of titles in bytes (takes a while to load and runs a long-running JavaScript)

Written by Denny Vrandečić, 10 May 2012