Martin Robbins has for several years billed himself as the ‘Lay Scientist’ for his posts on the Guardian’s science blogs. Although you’d need to change the grammar in Robbins’ online handle for it to be taken that way, it amuses me to note that ‘lay’ can be used as slang for sex or a sex partner in US and Canadian English; I’m not sure about UK English slang. This comment is tangentially relevant only because of Robbins’ April 30, 2015 posting titled: Porn data: visualising fetish space,
Porn is one of the biggest yet worst-covered topics in popular discourse. It’s a multi-billion dollar industry that sits at the heart of human sexuality in the 21st century. Many people watch it, though few talk about it, and for better or worse it exerts a major influence over our culture; but we know relatively little about it.
What if we could find some big source of data? The website Clips4Sale.com is one of the leading commercial porn sites on the web. It’s home to thousands of studios selling millions of clips. All the clips are indexed with metadata about their price, file size, fetish category, length, title, description and so on, and the site’s permissive robots policy allows web crawlers to trawl the content. How much useful information could you dig out from it? What interesting things could you find?
The other weekend I wrote a script to find out. It crawled the site gathering data on 4,814,732 clips, which is rather a lot of porn and probably means I’m on some BT [formerly British Telecommunications] blacklist now. The earliest clips date back to late 2003, which makes the Clips4Sale corpus a 12-year history of paid porn on the Internet. Each month’s data is like a ring in a tree trunk, telling us what the market was like at that time. It’s not perfect – older clips may have disappeared or been deleted – but it’s enough to give us a rough picture.
…
… There’s really only one concrete conclusion I want to make, and it’s this: there is a vast ocean of data on the web about human sexuality, far more than I think people realise, and it could be an enormously valuable tool in developing our understanding of a really important topic.
In the space of a couple of weekends I was able to cobble together some code and come up with some interesting findings. A good researcher in the field can do far better, and I hope they do. Watch this space…
I encourage you to read Robbins’ piece in its entirety as he has provided some fascinating detail along with some data visualizations about sexuality in the UK, so far as it can be intuited from the porn metadata of one company.
Anyway, the ‘lay scientist’ situation (more correctly, it would be ‘laid scientist’) put me in mind of a Feb. 6, 2015 piece by John Brownlee for Fast Company about a periodic table of sexual terminology,
First created by Russian chemist Dmitri Mendeleev in 1869, the Periodic Table of the Elements are ordered by their atomic number. It starts with hydrogen, and then the elements continue to get denser until they’re downright radioactive. Then there’s the Periodic Table of Sexual Terminology. It starts with a BJ—although let’s face it, that should probably be an HJ—and from there, the petting just gets heavier and heavier until it, too, is sexually radioactive.
The Periodic Table of Sexual Terminology is designed by Dorothy, the London-based design company …
That’s enough sexy stuff for today.