
First of all, I dislike blog posts that lack a comment section where readers can ask questions, criticize or praise.

Second, I dislike texts about data that lack information on where the data comes from.

I can think of ways to mine present-day play counts from Spotify (without working there), but I wonder where he got the daily counts he used in the last chart. Any ideas?

Furthermore, I doubt that Spotify is necessarily a good indicator of how songs are perceived in the long run, especially because there are local, platform-specific attractor dynamics at play.



Sorry that there's no comments section.

The data is pretty clear in terms of source...Spotify in 2014...Billboard data via Whitburn.

The data came directly from one of Spotify's data partners.

Yea, Spotify isn't a perfect indicator. This is the best proxy for present-day popularity that I can think of. I could have created an index that abstracted several data sources, but that would have killed the readability of the article.


> Sorry that there's no comments section.

Just switch it on ... it's your site, isn't it?

> The data is pretty clear in terms of source...Spotify in 2014

That's not the "source"; that's just a value of the time dimension.

> one of Spotify's data partners

Well, you could have given that information in the text. If you talk about data, you gotta talk about where you got the data from.

Nonetheless the statement is still pretty obscure. Who is that "partner" - is it a secret?

Why don't you just dump the data on GitHub?

> I could have created an index that abstracted several data sources, but that would have killed the readability of the article.

I'm not sure that is the true reason why you chose not to do it, but if so, then it is necessary to be transparent about assumptions, abstractions and simplifications, right?


Yea, there are lots of other charts, sources, and notes that I could have included in the piece. The challenge is that this is not an academic study; it's Internet catnip. All the things that make studies too dense and boring to read (i.e., assumptions, abstractions and simplifications) are purposefully excluded.

I know that this undermines the credibility of the article, but I optimized for readability and storytelling, not for building a foolproof argument for timelessness. There are a million rabbit holes that I could have gone down to make a much more solid case, but I decided to present the data and let the reader draw conclusions (kinda like I did with the hip hop/vocab piece: http://poly-graph.com/vocabulary.html).

I also realize that one could argue that this is a terrible way to approach a writing/data-analysis project. Assumptions and simplifications are important to highlight. But I weighed the options and decided to focus on accessibility.

Happy to discuss the pros/cons of this further :)



