FAIR Data and Semantic Publishing

Explicit, Executable, Reusable, and Automatically-Disseminated Scientific Publications in the form of FAIR Data

Projections suggest that the delay between a scientific discovery and the dissemination and implementation of the knowledge embodied in that discovery will soon vanish. At that point, all knowledge resulting from an investigation will be instantly interpreted and disseminated, influencing other researchers' experiments, and their results, immediately and transparently. This clearly requires that research results be of extremely high quality and reliability, and that research processes – from hypothesis to publication – become tightly integrated into the Web. Though the technologies necessary to achieve this kind of “Web Science” do not yet exist, our recently-published studies of automated in silico investigation demonstrate that we are enticingly close, and a path toward next-generation Web Science is now clear.

FAIR Data - Findable, Accessible, Interoperable, and Reusable

Our lab is a lead participant in the FAIR Data initiative. In addition to being lead authors of the FAIR Principles, we are lead authors on the first end-to-end implementation of those principles over an agriculturally relevant data source and on a set of objective Metrics for measuring the FAIRness of a resource. We are also the lead laboratory creating software capable of autonomously executing FAIR evaluations and scoring resources based on the level of FAIRness they have achieved. In addition, we are exploring how these principles can be used to make science more transparent. When data and knowledge are FAIR, they become easier to find, and therefore easier to validate against prior biological knowledge and data. We examine how FAIR publication of scientific assertions might be automatically compared to similar assertions in the scholarly literature, providing a means both to explore the likelihood of truth of a given assertion and to provide a richer collection of citations, ensuring that all relevant scholars are properly credited.

Beyond the natural complexity of biological data themselves, the tools we use to analyze those data must also be FAIR. Our two technologies, SADI and SHARE, are the 'bridge' between FAIR and traditional bioinformatics analysis tools and pipelines.