Measuring programming language popularity
It is difficult to determine which programming languages are most widely used, and what usage means varies by context. One language may account for the most programmer hours, another for the most lines of code, and a third for the most CPU time. Some languages are very popular for particular kinds of applications. For example, COBOL is still strong in the corporate data center, often on large mainframes; FORTRAN in engineering applications; and C in embedded applications and operating systems. Other languages are regularly used to write many different kinds of applications.
Various methods of measuring language popularity have been proposed, each biased by what it actually measures:
- counting the number of times the language name is mentioned in web searches (see Google Trends)
- counting the number of books sold that teach or describe the language[3]
- estimating the number of existing lines of code written in the language, which may underestimate languages not often found in public searches[4]
- counting language references (i.e., to the name of the language) found using a web search engine[5]
- counting the number of projects in that language on SourceForge and FreshMeat[6]
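The name-counting methods above can be illustrated with a minimal sketch. It assumes a small, made-up list of text snippets standing in for web pages or project descriptions; the function name `count_mentions` and the sample data are hypothetical and do not come from any of the cited surveys. It also shows one source of bias: short names such as "C" must be matched as whole tokens, or they will be counted inside "C++" or "COBOL".

```python
import re

def count_mentions(documents, languages):
    """Count how many documents mention each language name at least once."""
    counts = {lang: 0 for lang in languages}
    for text in documents:
        for lang in languages:
            # Require that the name is not glued to letters, digits, '+' or '#',
            # so "C" does not match inside "C++", "C#" or "COBOL".
            pattern = r"(?<![\w+#])" + re.escape(lang) + r"(?![\w+#])"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[lang] += 1
    return counts

# Made-up snippets used only for illustration.
documents = [
    "Senior COBOL developer for mainframe batch systems",
    "Embedded C and C++ engineer",
    "Python or Java backend developer",
    "Fortran numerical modelling, some Python",
]
languages = ["C", "C++", "COBOL", "Fortran", "Java", "Python"]

counts = count_mentions(documents, languages)
for lang, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{lang}: {n}")
```

With this sample data, Python is counted in two snippets and every other language in one. A real survey would still have to deal with ambiguous names, for example "Java" the island or "Go" the verb, which simple token matching cannot resolve.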
Several indices have been published:
- The monthly TIOBE Programming Community Index has been published since 2001, and shows the top 10 languages' popularity graphically, the top 20 languages with a rating and delta, and the top 50 languages' ratings.[7] The numbers are based on searching the Web with certain phrases that include language names and counting the numbers of hits returned.
- The Language Popularity Index[8] is based on a similar approach, but is more transparent: counts for all {search engine, language} pairs are published. An open source tool for grabbing counts from search engines is provided as well, so the rankings can be reproduced and verified. It does not show historical trends.
- The PYPL PopularitY of Programming Language[9] is based on Google Trends. It thus reflects what developers actually search for on the web, rather than which pages are available. It shows popularity trends since 2004.
- The "RedMonk Programming Language Rankings"[10] are derived from a correlation of programming traction on GitHub (usage) and Stack Overflow (discussion).
- The "Trendy Skills"[11] searches popular job advertising websites, extracts the skills and technologies that employers are looking for, and classifies the skills sought into categories, one of which is the Programming Languages category. It allows the user to see the trends for one or more skills or categories over specified time ranges. Data is also accessible via a public API, so anyone can generate their own statistics.
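As a rough illustration of how published {search engine, language} hit counts could be turned into a ranking, the sketch below normalizes each engine's counts to shares and averages the shares across engines. This is an assumed, simplified scheme, not the formula used by TIOBE or the Language Popularity Index; the engine names and hit counts are invented.

```python
def popularity_ratings(hits):
    """Turn {engine: {language: hit_count}} into {language: rating in percent}.

    Each engine's counts are normalized to shares first, so an engine that
    reports larger numbers overall does not dominate the average.
    """
    languages = sorted({lang for per_engine in hits.values() for lang in per_engine})
    share_sums = {lang: 0.0 for lang in languages}
    engines_used = 0
    for per_engine in hits.values():
        total = sum(per_engine.values())
        if total == 0:
            continue  # skip engines that returned no hits at all
        engines_used += 1
        for lang, count in per_engine.items():
            share_sums[lang] += count / total
    if engines_used == 0:
        return {lang: 0.0 for lang in languages}
    return {lang: 100.0 * s / engines_used for lang, s in share_sums.items()}

# Invented hit counts for two hypothetical search engines.
hits = {
    "engine_a": {"C": 600, "Java": 300, "Python": 100},
    "engine_b": {"C": 200, "Java": 500, "Python": 300},
}

ratings = popularity_ratings(hits)
for lang, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{lang}: {rating:.1f}%")
```

Because the raw per-engine counts are kept as input, anyone with the same counts can recompute the ratings, which is the kind of reproducibility that publishing all the pairs is meant to allow.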
References
- Survey of Job advertisements mentioning a given language
- http://jobstractor.com/monthly-stats
- Counting programming languages by book sales
- Bieman, J.M.; Murdock, V., Finding code on the World Wide Web: a preliminary investigation, Proceedings First IEEE International Workshop on Source Code Analysis and Manipulation, 2001
- "Tiobe Index Definition". TIOBE Software. Retrieved April 10, 2012.
- Eric S. Raymond, The Art of Unix Programming, Chapter 14. Languages, http://www.catb.org/~esr/writings/taoup/html/ch14s05.html
- Tiobe Software Index
- Language Popularity Index
- PYPL PopularitY of Programming Language index
- RedMonk Programming Language Rankings
- Trendy Skills