Well, we may not be the first, but we wanted to report anyway that Nature has now embedded metadata (HTML meta tags) in all its newly published pages, including full text, abstracts and landing pages (all bar four titles, which are currently being worked on). Metadata coverage extends back through the Nature archives, although depth of coverage varies by title. This conforms to Checkpoint 13.2 of the W3C’s Web Content Accessibility Guidelines 1.0, which exhorts content publishers to “provide metadata to add semantic information to pages and sites”.
Metadata is provided in both DC and PRISM formats as well as in Google’s own bespoke metadata format. This generally follows the DCMI recommendation “Expressing Dublin Core metadata using HTML/XHTML meta and link elements”, and the earlier RFC 2731 “Encoding Dublin Core Metadata in HTML”. (Note that schema name is normalized to lowercase.) Some notes:
- The DOI is included in the “dc.identifier” term in URI form, which is the Crossref recommendation for citing DOIs.
- We could also consider adding “prism.doi” to disclose the native DOI form. This requires the PRISM namespace declaration to be bumped to v2.0. We might consider synchronizing this change with our RSS feeds, which are currently pegged at v1.2, although note that the RSS module mod_prism currently applies only to PRISM v1.2.
- We could then also add a “prism.url” term to link back (through the DOI proxy server) to the content site. The namespace issue listed above still holds.
- The “citation_” terms are not anchored in any published namespace, which makes this term set problematic for application reuse. It would be useful to be able to reference a namespace (e.g. “rel="schema.gs" href="..."”) for these terms and to cite them as e.g. “gs.citation_title”.
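To make the notes above concrete, here is a minimal Python sketch of how a consuming application might read these term sets from a page. It is an illustration only, not Nature's tooling: the names `MetaCollector`, `term_set` and `native_doi` are hypothetical, the sample markup and the DOI `10.1000/xyz123` are placeholders, and the list of DOI URI prefixes stripped is an assumption, since the exact URI form is not spelled out here. It groups terms by their prefix, records any `schema.*` namespace declarations, and recovers the native DOI from “dc.identifier”.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Assumed URI-style prefixes a DOI might carry; adjust to the form actually used.
DOI_PREFIXES = ("doi:", "info:doi/", "http://dx.doi.org/", "https://doi.org/")


def term_set(name):
    """Return the term set a meta name belongs to, e.g. 'dc', 'prism', 'citation'."""
    lowered = name.lower()
    if lowered.startswith("citation_"):
        return "citation"  # these terms carry no namespace prefix
    prefix, sep, _ = lowered.partition(".")
    return prefix if sep else "other"


def native_doi(identifier):
    """Strip a known URI-style prefix to leave the native DOI (10.xxxx/...)."""
    for prefix in DOI_PREFIXES:
        if identifier.lower().startswith(prefix):
            return identifier[len(prefix):]
    return identifier


class MetaCollector(HTMLParser):
    """Collect <meta name/content> pairs and <link rel="schema.*"> declarations."""

    def __init__(self):
        super().__init__()
        self.terms = defaultdict(list)  # term set -> [(name, content), ...]
        self.schemas = {}               # schema prefix -> namespace URI

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta":
            name = attributes.get("name")
            content = attributes.get("content")
            if name and content is not None:
                self.terms[term_set(name)].append((name, content))
        elif tag == "link":
            rel = (attributes.get("rel") or "").lower()
            if rel.startswith("schema."):
                self.schemas[rel.split(".", 1)[1]] = attributes.get("href")


# Placeholder markup, not copied from a real landing page.
SAMPLE = """
<head>
<link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" />
<meta name="dc.title" content="An example title" />
<meta name="dc.identifier" content="doi:10.1000/xyz123" />
<meta name="citation_title" content="An example title" />
</head>
"""

collector = MetaCollector()
collector.feed(SAMPLE)

for group, pairs in collector.terms.items():
    # A term set with no schema.* declaration cannot be tied to a namespace.
    namespace = collector.schemas.get(group, "no namespace declared")
    print(group, "->", namespace)
    for name, content in pairs:
        print("   ", name, "=", content)

identifier = dict(collector.terms["dc"])["dc.identifier"]
print("native DOI:", native_doi(identifier))
```

In this sketch the “citation_” terms come out with no namespace to report, which is the reuse problem described in the last note: a consumer has to recognise them by name alone.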
The HTML metadata sets from an example landing page are presented below.