Nature.com ontologies portal available online


The Nature ontologies portal is a new section of the nature.com site that describes our involvement with semantic technologies and makes several models and datasets available to the wider public as RDF linked data.

We launched the portal nearly a month ago, with the aim of sharing our experiences with semantic technologies and, more generally, contributing our data models and datasets to the wider linked data community.

This April 2015 release doubles the number and size of our published data models. These now cover more completely the various things that our world contains, from publication things – articles, figures, etc. – to classification things – article-types, subjects, etc. – and additional things used to manage our content publishing operation – assets, events, etc. The portal also includes a release page for the latest data release and a separate page for archival data releases.

[Figure: NPG models hierarchy]

Background

Is this the first time you've heard about the semantic web and ontologies?

Then you should know that, even though XML remains the main technology used internally at Macmillan Science and Education to represent and store the things we publish, the metadata about these documents (e.g. publication details, subject categories etc.) is normally also encoded using a more abstract, graph-oriented information model.

This model is called RDF and has two key characteristics:

- it encodes all information in the form of triples, i.e. subject-predicate-object statements
- it was built with the web in mind: broadly speaking, each of the items in a triple can be accessed via the internet, i.e. it is a URI (a generalised notion of a URL).

So why use RDF?

The RDF model makes it easier to maintain a shared yet scalable schema (aka an 'ontology') of the data types in use within our organization. It works a bit like a common language that is spoken by an increasing number of data stores, and thus allows us to join things up more easily whenever needed.

At the same time, since the RDF model is native to the web, it facilitates the 'semantic' integration of our data with that of the increasing number of other organisations that publish their data using compatible models.

For example, the BBC, Elsevier and, more recently, Springer are among the many organisations that contribute to the Linked Data Cloud.
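To make the two ideas above a bit more concrete (information as triples, and joining data through shared URIs), here is a minimal sketch in plain Python. Everything in it is made up for illustration: the `example.org` URIs, the `Triple` type and the helper functions are assumptions, not part of our actual ontologies or tooling. A real setup would rely on an RDF library and a triplestore rather than Python sets.

```python
from typing import NamedTuple, Set, List


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs, invented for this example only.
ARTICLE = "http://example.org/articles/a1"
PHYSICS = "http://example.org/subjects/physics"
TITLE = "http://example.org/terms/title"
HAS_SUBJECT = "http://example.org/terms/hasSubject"
LABEL = "http://example.org/terms/label"

# Two independent "data stores" that happen to use the same URIs.
articles_store = {
    Triple(ARTICLE, TITLE, "An example article"),
    Triple(ARTICLE, HAS_SUBJECT, PHYSICS),
}
subjects_store = {
    Triple(PHYSICS, LABEL, "Physics"),
}


def merge(*stores: Set[Triple]) -> Set[Triple]:
    """Merging graphs is just a set union of their triples."""
    merged: Set[Triple] = set()
    for store in stores:
        merged |= store
    return merged


def subject_labels(graph: Set[Triple], article_uri: str) -> List[str]:
    """Follow article -> subject -> label across the merged graph."""
    labels = []
    for t in graph:
        if t.subject == article_uri and t.predicate == HAS_SUBJECT:
            for u in graph:
                if u.subject == t.obj and u.predicate == LABEL:
                    labels.append(u.obj)
    return sorted(labels)


graph = merge(articles_store, subjects_store)
print(subject_labels(graph, ARTICLE))
```

Because both stores identify the subject with the same URI, no mapping step is needed before joining them: the union of the two sets is already a connected graph.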

What's next

We'll continue improving these ontologies and releasing new ones as they are created. But probably most interesting for many people: we're working on a new release of the whole NPG articles dataset (~1M articles).

So stay tuned for more!

Cite this blog post:


Michele Pasin. Nature.com ontologies portal available online. Blog post on www.michelepasin.org. Published on April 30, 2015.
