Big data publishing gets the royal treatment

The MarkLogic NoSQL database opens Royal Society of Chemistry to public view

The Royal Society of Chemistry (RSC) in the United Kingdom is Europe's largest organization for advancing the chemical sciences. The RSC is also 170 years old, so it's no surprise that the assets it has accumulated since the 1840s are unwieldy to manage, publish, and otherwise make more broadly available. The recent explosive growth of digital assets has only exacerbated the problem.

A NoSQL database from MarkLogic offered the solution RSC was looking for, unlocking a treasure trove of assets and enabling the RSC to publish three times as many journals and four times as many articles. It also gave the Society the ability to develop new educational applications to make chemistry accessible to a wider audience.

Supported by an international publishing business and worldwide members, the RSC's activities span education, conferences, science policy, and the promotion of chemistry to the public. Its history is rooted in four societies that merged in 1980: the Chemical Society, the Society for Analytical Chemistry, the Royal Institute of Chemistry, and the Faraday Society. The accumulated content includes more than 1 million images, millions of science data files, and hundreds of thousands of articles from more than 200,000 authors. On top of that comes the recent capture of social media, video, and other digital content.

[Figure: Searching RSC publications with MarkLogic]

RSC determined that the MarkLogic document database was the right solution for creating one integrated repository and making it easily accessible to anyone online, from entrepreneurs to researchers to educators around the world. The key to MarkLogic is how it stores content as XML documents: information that should not or cannot be expressed in a straightforward fashion as rows and columns -- such as contracts, manuals, books, emails, tweets, and metadata -- is well suited to MarkLogic's XML-based, document-centric model.

David Leeming, manager at the projects office for the RSC, commented, "A book chapter is very different to a journal article whereas in a relational model you couldn't work that out, and you couldn't put those two together. We can just fill out all our XML into MarkLogic and actually then bring it out together as a single integrated delivery mechanism."

A well-known trait of many NoSQL systems is their schema-less nature: the structure of the data does not have to be fixed before the application is built, whereas applications built on relational databases typically require a rigid schema up front. With MarkLogic, information can be loaded as is, which is especially useful for indexing and querying information with poorly defined, changing, or unknowable schemas.
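
To make the "load as is" idea concrete, here is a minimal sketch in Python. It is not MarkLogic's API; it uses only the standard library's xml.etree.ElementTree to show how two XML documents with unrelated structures (a journal article and a book chapter) could be indexed and searched together without first agreeing on a shared schema. The sample documents, element names, and the helper functions build_index and search are all hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample documents; element names are invented for illustration
# and deliberately do not share a common structure.
JOURNAL_ARTICLE = """<article>
  <title>Catalysis in aqueous media</title>
  <journal>Example Journal</journal>
  <abstract>We study catalysis of ester hydrolysis.</abstract>
</article>"""

BOOK_CHAPTER = """<chapter number="3">
  <book>Example Handbook</book>
  <heading>Principles of catalysis</heading>
  <section><para>Enzymes lower activation energy.</para></section>
</chapter>"""


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        # itertext() visits every text node regardless of element names,
        # so the documents are indexed as loaded, with no schema declared.
        for text in root.itertext():
            for word in text.lower().split():
                index[word.strip(".,;:")].add(doc_id)
    return index


def search(index, word):
    """Return the sorted ids of all documents containing the word."""
    return sorted(index.get(word.lower(), set()))


documents = {"article-1": JOURNAL_ARTICLE, "chapter-3": BOOK_CHAPTER}
index = build_index(documents)

print(search(index, "catalysis"))  # ['article-1', 'chapter-3']
print(search(index, "enzymes"))    # ['chapter-3']
```

A single query returns both the article and the chapter even though neither document knows about the other's structure, which is the property Leeming describes. A production document database would add persistence, structure-aware indexes, and a query language on top of this idea.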

Each piece of content is automatically tagged, which allows users to discover content quickly and understand the context around it, connecting the dots between different pieces of research, video, journal articles, or images. RSC's platform has also added new applications for children, journals for researchers, social features, and mobile capabilities, all powered by MarkLogic.
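
The article does not say how RSC's tagging works, so the following is only an illustrative sketch under a simple assumption: tags are assigned by matching each item's text against a controlled vocabulary. The vocabulary, the item ids, and the functions tag_item and group_by_tag are hypothetical. The point is to show how shared tags let a platform link an article, a video, and an image that were never explicitly related.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary of subject terms (an assumption).
VOCABULARY = {"graphene", "catalysis", "polymer"}

# Hypothetical content items of different media types, keyed by id.
ITEMS = {
    "article-17": "Graphene supports for catalysis",
    "video-4": "Lecture: introduction to polymer chemistry",
    "image-88": "Micrograph of a graphene sheet",
}


def tag_item(text):
    """Return the vocabulary terms that appear in the text."""
    words = {word.strip(".,;:").lower() for word in text.split()}
    return words & VOCABULARY


def group_by_tag(items):
    """Map each tag to the set of item ids that carry it."""
    by_tag = defaultdict(set)
    for item_id, text in items.items():
        for term in tag_item(text):
            by_tag[term].add(item_id)
    return by_tag


by_tag = group_by_tag(ITEMS)

print(sorted(by_tag["graphene"]))   # ['article-17', 'image-88']
print(sorted(by_tag["catalysis"]))  # ['article-17']
print(sorted(by_tag["polymer"]))    # ['video-4']
```

Here the article and the micrograph are connected through the shared "graphene" tag, which is the kind of cross-media link the paragraph above describes.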

