For want of a better term, I’m going to call the company Marvel. It is an entertainment conglomerate rooted in the world of comic books and comic book heroes, and it has a data problem, as David Lumb points out in his Nov. 27, 2013 article for Fast Company (Note: A link has been removed),
After 70 years of publishing, Marvel Entertainment has built up an incredible universe of heroes, villains, and super teams–a sea of data that no mere wiki can organize. At long last, Marvel has embarked on a mighty quest of its own: to create an entirely new graph database and search system to conquer continuity malaise by visualizing each character across the Marvel Universe.
Here’s how Lumb describes the problem,
The problem, like with any massive chunk of data, lies in getting the right data pieces in front of users–but for Marvel, the question becomes a semantic exercise. Just who is the character Hawkeye?
Well, he’s Clint Barton, except when he’s not; erstwhile sidekick Kate Bishop and villain Bullseye have taken the Hawkeye identity. He’s a member of the Avengers, except when he’s not; he’s also been part of the Thunderbolts and West Coast Avengers. He got his skills performing trick shots in a circus, except when he didn’t: He got them as an agent redeeming a murder conviction in the Ultimate universe and as a Black Ops SHIELD agent in the Marvel films.
According to the article, Peter Olson, Marvel’s VP of Web and Application Development, is in charge of developing the new database (Note: A link has been removed),
“We want an uberframework–the words ‘ontology’ and ‘taxonomy’ get thrown around a lot,” Olson said. “We want characters to appear as close to as possible from all their stories and iterations but, overall, we want the characters to bubble up to archetypes.”
Most databases are relational, most easily visualized as tables of rows and columns: … Marvel still has use for this kind of database that returns queries with solid, irrefutable answers, like listing all the issues they’ve sold for the above prices.
The new database, however, will run on graph theory, looking for relationships between characters, teams, and events. The graph above [image removed] displays relationships between characters, which would be extremely difficult for a relational database that might look for superheroes but leave out villains instead of showing more abstract values, like how popular/visible a character is across Marvel’s comic titles.
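To make the Hawkeye example concrete, here is a minimal sketch of my own showing how those "except when he's not" facts can be written down as relationships rather than table rows. The relationship labels (USES_IDENTITY, MEMBER_OF) and the helper functions are my assumptions for illustration; nothing here reflects Marvel's actual schema or software.

```python
from collections import defaultdict

# Hypothetical edge list: (source, relationship, target).
# The facts come from the Hawkeye example quoted earlier;
# the relationship labels are invented for this sketch.
EDGES = [
    ("Clint Barton", "USES_IDENTITY", "Hawkeye"),
    ("Kate Bishop", "USES_IDENTITY", "Hawkeye"),
    ("Bullseye", "USES_IDENTITY", "Hawkeye"),
    ("Hawkeye", "MEMBER_OF", "Avengers"),
    ("Hawkeye", "MEMBER_OF", "Thunderbolts"),
    ("Hawkeye", "MEMBER_OF", "West Coast Avengers"),
]


def build_graph(edges):
    """Build outgoing and incoming adjacency lists keyed by node name."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for source, label, target in edges:
        outgoing[source].append((label, target))
        incoming[target].append((label, source))
    return outgoing, incoming


def neighbours(adjacency, node, label):
    """Return the sorted names linked to `node` by edges carrying `label`."""
    return sorted(
        other for edge_label, other in adjacency.get(node, []) if edge_label == label
    )


outgoing, incoming = build_graph(EDGES)

# Who has taken the Hawkeye identity?
print(neighbours(incoming, "Hawkeye", "USES_IDENTITY"))
# Which teams has Hawkeye belonged to?
print(neighbours(outgoing, "Hawkeye", "MEMBER_OF"))
```

Both questions are answered by following edges out of (or into) a single node, which is the kind of query the article says the new database is meant to serve.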
There is, of course, a business case for this new approach (Note: Links have been removed from the article excerpt),
… The Marvel graph database will find an answer based not only on book similarities but nuanced metadata, like writer or artist style. Better still, it’ll do what the venerable ComicBookDatabase cannot: confidently propose a list of essential story arcs for the new fan.
And lo, Marvel’s multimedia empire strikes again: Aside from Sony’s death grip on Spider-Man, Marvel holds the rights to all its major characters, so their recommendations aren’t limited to subscriber-only comics on its Marvel Unlimited service. Let loose the hounds of suggested merchandise! Of course, this also means those ultra-streamlined character pages will become the most seamless portals to every character’s stories that the Internet has ever seen.
This is a good article, which I recommend reading in its entirety, although I do have one suggestion for David Lumb and/or his editor,
… the company now depicts the same characters across multiple mediums … [emphasis mine]
The plural in this context (mass communications) should have been media. Mediums are a group of people who communicate with the dead, as one of my professors informed us all in an undergraduate communications course. He was a bit of a hardliner on the topic. I found out later that there is a group that uses both; artists use either media or mediums (Grammarist).
Graph databases are covered in a Wikipedia essay, which has this to say about them (Note: Links and footnotes have been removed),
A graph database is a database that uses graph structures with nodes, edges, and properties to represent and store data. By definition, a graph database is any storage system that provides index-free adjacency. This means that every element contains a direct pointer to its adjacent element and no index lookups are necessary.
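As a rough in-memory analogy for "index-free adjacency" (my own sketch, and not a description of how any real graph database stores data on disk), each node below holds direct references to its neighbouring nodes, so moving from one element to the next means following a pointer rather than looking a key up in a separate index. The Node class, its method names, and the relationship labels are all hypothetical.

```python
class Node:
    """A node with properties and direct references to adjacent nodes."""

    def __init__(self, name, **properties):
        self.name = name
        self.properties = properties
        # Each edge is a (label, Node) pair: a direct pointer to the
        # neighbouring node, with no lookup table in between.
        self.edges = []

    def connect(self, label, other):
        """Add a labelled, directed edge from this node to `other`."""
        self.edges.append((label, other))

    def adjacent(self, label):
        """Return the nodes reachable over edges carrying `label`."""
        return [node for edge_label, node in self.edges if edge_label == label]


clint = Node("Clint Barton", kind="person")
hawkeye = Node("Hawkeye", kind="identity")
avengers = Node("Avengers", kind="team")

clint.connect("USES_IDENTITY", hawkeye)
hawkeye.connect("MEMBER_OF", avengers)


def teams_of(person):
    """Walk person -> identity -> team by following references only."""
    teams = []
    for identity in person.adjacent("USES_IDENTITY"):
        for team in identity.adjacent("MEMBER_OF"):
            teams.append(team.name)
    return teams


print(teams_of(clint))
```

The two-hop walk in teams_of never consults a name-to-node index; every step starts from an object already in hand, which is the property the Wikipedia definition is describing.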
The essay also offers an illustration [image removed].
Peter Olson, the VP who’s managing this new database for Marvel, gave a talk at the Nov. 5 – 6, 2013 GraphConnect New York (City) meeting. His talk is described this way on the website (from the GraphConnect 2013 videos webpage),
Graphing the Marvel Universe – Peter Olson @GraphConnect NY 2013
This talk will give an overview of why graphs are such a powerful conceptual framework for modeling intellectual property and how Marvel uses them to represent the 70 years of fictional content from many different media that makes up the Marvel Universe.
The talk is approximately 40 minutes long, and you can also find it here on Vimeo.
You can find more information (speakers, agenda, etc.) here about the Nov. 5 – 6, 2013 meeting in New York City and you can find out more about GraphConnect 2013 meetings in Boston, San Francisco, London (UK), and elsewhere by going here.