It'd be nice to have someone from GitHub comment on how these rankings are calculated. Raw LOC? The percentage of projects with at least one file in language X? The percentage of projects in which language X has a plurality of the source code?
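To make the difference between those three candidate metrics concrete, here is a rough Python sketch. The repository data and function names are made up for illustration, and nothing here is claimed to be how GitHub actually computes its rankings.

```python
from collections import Counter

# Hypothetical sample data: lines of code per language in each repository.
repos = {
    "repo_a": {"Perl": 9000, "JavaScript": 100},
    "repo_b": {"Perl": 500, "JavaScript": 4000},
    "repo_c": {"Python": 2000},
}

def raw_loc(repos):
    """Metric 1: total lines of code per language across all repositories."""
    totals = Counter()
    for langs in repos.values():
        totals.update(langs)  # adds each language's line count
    return dict(totals)

def share_with_any_file(repos):
    """Metric 2: fraction of repositories containing any code in a language."""
    counts = Counter()
    for langs in repos.values():
        counts.update(langs.keys())  # each language counts once per repository
    return {lang: n / len(repos) for lang, n in counts.items()}

def share_with_plurality(repos):
    """Metric 3: fraction of repositories where a language has the most lines."""
    counts = Counter(max(langs, key=langs.get) for langs in repos.values())
    return {lang: n / len(repos) for lang, n in counts.items()}

print(raw_loc(repos))
print(share_with_any_file(repos))
print(share_with_plurality(repos))
```

On this toy data, Perl leads JavaScript by raw LOC, yet the two are tied under both per-repository metrics, so the choice of metric can change the ordering.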
I second this request. I've been wondering about this process ever since I noticed, sometime in January, that Rust (a nascent language with very little use yet) was ranked in the low 20s for language popularity. The problem likely had to do with how the Rust devs registered their language with GitHub's Linguist[1] project: Rust's units of compilation ("crates") have the .rc extension, and GitHub must have taken this to mean that every project with a .rc file (notably many Android-specific projects) contained Rust code. The Rust devs have since removed that association from Linguist[2], but since GitHub takes a while to reevaluate a project's language breakdown, the language's popularity ranking on GitHub is still artificially inflated (sitting at #31 at the moment).
Note the language breakdown, under the Graphs tab.
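For anyone wondering how a single extension mapping could inflate a ranking, here is a minimal sketch of purely extension-based detection. The table, the function name, and the file list are all hypothetical, and this is not Linguist's actual data or algorithm, just the general failure mode I suspect.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical extension table; NOT Linguist's real mapping.
EXTENSION_TO_LANGUAGE = {".rc": "Rust", ".java": "Java", ".py": "Python"}

def detect_languages(paths, table=EXTENSION_TO_LANGUAGE):
    """Count files per language using only the file extension."""
    counts = Counter()
    for path in paths:
        lang = table.get(PurePosixPath(path).suffix)
        if lang is not None:
            counts[lang] += 1
    return dict(counts)

# Hypothetical Android-style project: the .rc files here are not Rust code.
android_files = ["init.rc", "init.goldfish.rc", "src/Main.java"]

# With the .rc mapping in place, the project appears to contain Rust.
print(detect_languages(android_files))

# With the mapping removed, only the Java file is attributed.
without_rc = {ext: lang for ext, lang in EXTENSION_TO_LANGUAGE.items() if ext != ".rc"}
print(detect_languages(android_files, without_rc))
```

Even after the mapping is removed, any project whose stored breakdown hasn't been recomputed would keep its old attribution, which would explain the lag.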
But yes, it all hinges on how GitHub measures popularity. For one, I'm curious how forked repositories are counted; if I have a 100 kLOC Blub project and it gets forked 100 times, does GitHub count that as 100,000 or 10,000,000 lines of Blub?
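The arithmetic behind those two figures, as a quick sketch. The function name and the `count_forks` flag are made up for illustration; I have no idea which behavior GitHub actually uses.

```python
def counted_lines(project_loc, forks, count_forks):
    """Lines attributed to a language under two hypothetical fork policies.

    If forks are counted, every fork contributes a full copy of the code;
    otherwise the project is counted once. Counting the original alongside
    its forks would add one more copy (101 copies instead of 100).
    """
    return project_loc * forks if count_forks else project_loc

blub_loc = 100_000  # the 100 kLOC Blub project
blub_forks = 100

print(counted_lines(blub_loc, blub_forks, count_forks=False))  # 100000
print(counted_lines(blub_loc, blub_forks, count_forks=True))   # 10000000
```

That is a 100x difference from a single popular project, so the fork policy alone could move a small language a long way up or down the list.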
defunkt has confirmed that, at least in the past, a repo with 100 Perl files and one JavaScript file would count equally towards both languages.