Accessing data the Marvel (comic book) way: uberframeworks and graphs

For want of a better term, I’m going to call the company Marvel. It is an entertainment conglomerate based in the world of comic books and comic book heroes, and it has a data problem, as David Lumb points out in his Nov. 27, 2013 article for Fast Company (Note: A link has been removed),

After 70 years of publishing, Marvel Entertainment has built up an incredible universe of heroes, villains, and super teams–a sea of data that no mere wiki can organize. At long last, Marvel has embarked on a mighty quest of its own: to create an entirely new graph database and search system to conquer continuity malaise by visualizing each character across the Marvel Universe.

Here’s how Lumb describes the problem,

The problem, like with any massive chunk of data, lies in getting the right data pieces in front of users–but for Marvel, the question becomes a semantic exercise. Just who is the character Hawkeye?

Well, he’s Clint Barton, except when he’s not; erstwhile sidekick Kate Bishop and villain Bullseye have taken the Hawkeye identity. He’s a member of the Avengers, except when he’s not; he’s also been part of the Thunderbolts and West Coast Avengers. He got his skills performing trick shots in a circus, except when he didn’t: He got them as an agent redeeming a murder conviction in the Ultimate universe and as a Black Ops SHIELD agent in the Marvel films.

According to the article, Peter Olson, Marvel’s VP of Web and Application Development, is in charge of developing the new database (Note: A link has been removed),

“We want an uberframework–the words ‘ontology’ and ‘taxonomy’ get thrown around a lot,” Olson said. “We want characters to appear as close to as possible from all their stories and iterations but, overall, we want the characters to bubble up to archetypes.”

Most databases are relational, most easily visualized as tables of rows and columns: … Marvel still has use for this kind of database that returns queries with solid, irrefutable answers, like listing all the issues they’ve sold for the above prices.

The new database, however, will run on graph theory, looking for relationships between characters, teams, and events. The graph above [image removed] displays relationships between characters, which would be extremely difficult for a relational database that might look for superheroes but leave out villains instead of showing more abstract values, like how popular/visible a character is across Marvel’s comic titles.
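
To make the graph idea a little more concrete, here is a minimal sketch in Python. It is my own illustration, not Marvel's system, and it uses only the Hawkeye facts from Lumb's excerpt above. The edge labels (`USES_IDENTITY`, `MEMBER_OF`) and the helper names are hypothetical. Attaching team membership to the Hawkeye identity rather than to a specific person is a simplification.

```python
from collections import defaultdict

# Hypothetical edge labels; the facts come from the Hawkeye excerpt.
EDGES = [
    ("Clint Barton", "USES_IDENTITY", "Hawkeye"),
    ("Kate Bishop", "USES_IDENTITY", "Hawkeye"),
    ("Bullseye", "USES_IDENTITY", "Hawkeye"),
    ("Hawkeye", "MEMBER_OF", "Avengers"),
    ("Hawkeye", "MEMBER_OF", "Thunderbolts"),
    ("Hawkeye", "MEMBER_OF", "West Coast Avengers"),
]


def build_graph(edges):
    """Store each edge under both endpoints so either end can start a query."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for source, label, target in edges:
        outgoing[source].append((label, target))
        incoming[target].append((label, source))
    return outgoing, incoming


def who_has_been(identity, incoming):
    """Everyone with a USES_IDENTITY edge pointing at the identity."""
    return sorted(src for label, src in incoming[identity] if label == "USES_IDENTITY")


def teams_of(identity, outgoing):
    """Every team the identity points at with a MEMBER_OF edge."""
    return sorted(team for label, team in outgoing[identity] if label == "MEMBER_OF")


def connection_count(node, outgoing, incoming):
    """A crude stand-in for 'visibility': the number of edges touching a node."""
    return len(outgoing[node]) + len(incoming[node])


outgoing, incoming = build_graph(EDGES)
print(who_has_been("Hawkeye", incoming))
print(teams_of("Hawkeye", outgoing))
print(connection_count("Hawkeye", outgoing, incoming))
```

The point of the sketch is that "Who is Hawkeye?" becomes a question about which edges touch a node rather than a single row in a table, so "except when he's not" simply means more edges.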

There is, of course, a business case for this new approach (Note: Links have been removed from the article excerpt),

… The Marvel graph database will find an answer based not only on book similarities but nuanced metadata, like writer or artist style. Better still, it’ll do what the venerable ComicBookDatabase cannot: confidently propose a list of essential story arcs for the new fan.
And lo, Marvel’s multimedia empire strikes again: Aside from Sony’s death grip on Spider-Man, Marvel holds the rights to all its major characters, so their recommendations aren’t limited to subscriber-only comics on its Marvel Unlimited service. Let loose the hounds of suggested merchandise! Of course, this also means those ultra-streamlined character pages will become the most seamless portals to every character’s stories that the Internet has ever seen.

This is a good article, which I recommend reading in its entirety, although I do have one suggestion for David Lumb and/or his editor,

… the company now depicts the same characters across multiple mediums … [emphasis mine]

The plural in this context (mass communications) should have been media. Mediums are people who communicate with the dead, as one of my professors informed us all in an undergraduate communications course. He was a bit of a hardliner on the topic. I found out later that there is a group that uses both: artists use either media or mediums (Grammarist).

Graph databases are covered in a Wikipedia essay, which has this to say about them (Note: Links and footnotes have been removed),

A graph database is a database that uses graph structures with nodes, edges, and properties to represent and store data. By definition,[according to whom?] a graph database is any storage system that provides index-free adjacency. This means that every element contains a direct pointer to its adjacent element and no index lookups are necessary.
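
As a rough illustration of what "index-free adjacency" means, here is a small Python sketch of my own. It is an assumption-laden toy, not how any particular graph database is implemented: ordinary Python object references stand in for the "direct pointer" in the definition, and the `Node` class and `reachable_names` helper are hypothetical names.

```python
class Node:
    """A graph node that keeps direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # object references, not keys into a lookup table

    def connect(self, other):
        """Add an undirected edge by storing a reference on both nodes."""
        self.neighbours.append(other)
        other.neighbours.append(self)


def reachable_names(start):
    """Walk the graph by following references only; no global index is consulted."""
    seen = {id(start)}
    stack = [start]
    names = []
    while stack:
        node = stack.pop()
        names.append(node.name)
        for neighbour in node.neighbours:
            if id(neighbour) not in seen:
                seen.add(id(neighbour))
                stack.append(neighbour)
    return sorted(names)


a, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")
a.connect(b)
b.connect(c)
print(reachable_names(a))  # D has no edges, so it is never visited
```

In a relational layout, each hop from one row to a related row would typically go through a key lookup or join. Here each hop just follows a reference the node already holds.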

The essay also offers this illustration,

[Downloaded from: https://en.wikipedia.org/wiki/Graph_database]


Peter Olson, the VP who’s managing this new database for Marvel, gave a talk at the Nov. 5 – 6, 2013 GraphConnect New York (City) meeting. His talk is described this way on the website (from the GraphConnect 2013 videos webpage),

Graphing the Marvel Universe – Peter Olson @GraphConnect NY 2013

This talk will give an overview of why graphs are such a powerful conceptual framework for modeling intellectual property and how Marvel uses them to represent the 70 years of fictional content from many different media that makes up the Marvel Universe.

The talk is approximately 40 mins. long and you can also find it here on Vimeo.

You can find more information (speakers, agenda, etc.) here about the Nov. 5 – 6, 2013 meeting in New York City and you can find out more about GraphConnect 2013 meetings in Boston, San Francisco, London (UK), and elsewhere by going here.

Reading multilingual science articles

I have long wanted to access science materials in languages I don’t read and, while I don’t think this particular tool is going to satisfy that need, I am delighted to hear of a new tool that reaches across linguistic barriers to aid science understanding. From the Dec. 2, 2010 news item on Nanowerk,

A new set of tools released today by Science-Metrix Inc. seeks to improve the way we talk about and understand science – from the classroom to the boardroom. The US/Canada-based research evaluation firm has developed a new, multi-lingual classification of scientific journals, which is accompanied by an interactive web tool.

The interactive ‘Scientific Journals Ontology Explorer’ allows users to visualise the links between 175 scientific specialties in 18 languages, from Arabic to Swedish. The journal classification, which covers 15,000 peer-reviewed scientific journals, was translated by more than 22 international experts who volunteered their time and expertise, making the tools available to a worldwide audience.

This set of tools has applications beyond academia, such as for governments and firms tracking their performance in specific fields, as well as in science outreach and education.

“We hope this visualization tool will be used by teachers to show students how science spans a broad universe and how interlinked scientific research really is” says Eric Archambault, president of Science-Metrix. “By sharing this tool with the wider community, we also hope to foster discussion and research on the contemporary scientific system, and promote a greater understanding of science dynamics.”

This classification provides a timely representation of the structure of modern science by including not only the traditional areas of scientific inquiry, but also the more contemporary areas of research such as biotechnology and nanotechnology and enabling fields such as epidemiology.

I’d be willing to bet that Science-Metrix was founded in Canada and/or by Canadians. From the About Us/Management page,

Éric Archambault, Ph.D.
President and CEO | Chief Executive Officer

Jean-François Bergeron, M.B.A.
COO and CFO | Chief Operation and Financial Officer

Frédéric Bertrand, M.Sc.
Vice-President, Evaluation

Grégoire Côté, B.Sc.
Vice-President, Bibliometrics

You’ll note the names and that all the diacritical marks (accents) are included. Also, one of the head offices is in Montréal, Québec.

As for the Scientific Journals Ontology Explorer, you can find it here.