What does the I in FAIR data stand for?
Welcome to my 5-minute non-technical explanation of Interoperability
Nothing could describe the joy on John's face as he found and accessed the datasets for his analysis. As he navigated the portal, his eyes lit up. Now he could proceed with his analysis and complete his research work. The long search for relevant datasets had taken a toll on his usually positive demeanour, but he shrugged it off now that he had access. However, the datasets were scattered across different domains, and they still needed to be combined and processed in a meaningful and helpful way.
The portal offered downloads only in a format that was not machine-actionable. Hence, John had to transform the datasets into a format that the software he planned to use could read. He also had to aggregate the data, which took time, and de-identify it to protect privacy. These are some of the issues data enthusiasts face in their work, and they are the reason interoperability matters. This article gives a brief, non-technical explanation of the interoperability component of the FAIR principles.
How does FAIR data help with this problem?
One of the key aspects of the FAIR data principles is interoperability, which can be defined as the ability of data to be integrated with other datasets and systems. Interoperability is essential to unlocking the full potential of data. Integrating data with other datasets has immense benefits: it allows for comprehensive analysis, which aids decision-making; it enriches the data, giving more depth and context for verification and consistency checks; and it can lead to greater efficiency, streamlined processes, improved data models, and novel insights and discoveries. Interoperability has also been characterised as one of the most complicated of the FAIR data principles to implement. Its sub-principles are described in simple terms below.
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
Interoperability focuses more on metadata than on the data itself, because metadata gives comprehensive information and context about the data. This guideline emphasises using standardised and widely accepted languages within specific domains, which offers clarity and ease of use across different contexts. For instance, the term resonance means different things in introductory physics and chemistry. In physics, resonance is a situation in which a vibrating body sets another body into vibration, both having the same natural frequency. In chemistry, resonance is a way of describing bonding and structures in specific molecules and compounds. There is, therefore, a need to distinguish between such meanings and to adopt a formal, globally understood language for knowledge representation. The main goal is to ensure proper comprehension and to limit ambiguity in how data is represented.
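For readers who would like to see the resonance example in a more concrete form, here is a minimal Python sketch. It assumes a tiny made-up vocabulary: the example.org identifiers and the function names resolve_term and find_identifiers are hypothetical and serve only as illustration. A real project would use the identifiers published by an established vocabulary in its domain.

```python
# Hypothetical shared vocabulary: each concept gets its own unique identifier.
# The example.org URIs are placeholders, not real vocabulary entries.
VOCABULARY = {
    "https://example.org/vocab/physics/resonance": {
        "label": "resonance",
        "domain": "physics",
        "definition": "A vibrating body sets another body with the same "
                      "natural frequency into vibration.",
    },
    "https://example.org/vocab/chemistry/resonance": {
        "label": "resonance",
        "domain": "chemistry",
        "definition": "A way of describing bonding and structures in "
                      "specific molecules and compounds.",
    },
}


def find_identifiers(label):
    """Return every identifier that shares the given plain-text label."""
    return sorted(uri for uri, entry in VOCABULARY.items()
                  if entry["label"] == label)


def resolve_term(identifier):
    """Return the (domain, definition) pair for one unambiguous identifier."""
    entry = VOCABULARY[identifier]
    return entry["domain"], entry["definition"]


# The bare word is ambiguous: it matches more than one concept.
print(find_identifiers("resonance"))

# An identifier is not ambiguous: it points to exactly one meaning.
print(resolve_term("https://example.org/vocab/chemistry/resonance"))
```

The point of the sketch is that software cannot tell which meaning the plain word "resonance" carries, whereas an identifier taken from a shared vocabulary leaves no room for doubt.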
I2. (Meta)data use vocabularies that follow FAIR principles
A vocabulary is a set of standardised terms used to describe data and metadata. Using a vocabulary helps ensure that people stick to specific terms, which eases sharing, understanding and integration. Using such an unambiguous representation is not enough on its own, however: the vocabulary itself must conform to the FAIR data principles by being findable, accessible, interoperable, and reusable. To satisfy the findable criterion, vocabularies must be easy to discover through search engines or repositories. To be accessible and interoperable, they must be richly described so that both machines and humans can retrieve and interpret them.
I3. (Meta)data include qualified references to other (meta)data
This specifies that the metadata, the comprehensive description of the data, should include references or links to other data. A reference is qualified when it also states how the two are related, for example that one dataset builds on another. When metadata is appropriately connected to other data, it provides context and sheds light on how different data relate. It also makes it easier to interpret and use the data correctly, facilitating the integration of various datasets as well as validation and traceability.
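As a rough illustration, a metadata record with links that name their relationship could look like the Python sketch below. Everything in it is an assumption made for the example: the example.org identifiers, the relation names, the record layout and the helper targets_for are hypothetical and do not come from a particular metadata standard.

```python
# Hypothetical metadata record. The identifiers and relation names are
# made up for illustration and do not point to real datasets.
record = {
    "id": "https://example.org/dataset/42",
    "title": "Cleaned survey responses",
    "references": [
        # Each link says HOW this dataset relates to the target,
        # not merely that some link exists.
        {"relation": "isDerivedFrom",
         "target": "https://example.org/dataset/7"},
        {"relation": "isDescribedBy",
         "target": "https://example.org/docs/codebook"},
    ],
}


def targets_for(metadata, relation):
    """Return the targets linked to the record by the given relation."""
    return [ref["target"] for ref in metadata["references"]
            if ref["relation"] == relation]


# Which dataset was this one built from?
print(targets_for(record, "isDerivedFrom"))
```

Because each link names its relationship, a program can answer a question such as "which dataset was this one built from?" without a person having to read free-text notes.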
Summary
We live in an era in which data exists across many domains and in disparate forms. To make the most of these data, they must be integrated, and this is what makes interoperability so important. Interoperability has been characterised as one of the hardest FAIR data principles to implement, and it is vital for good data management. This article has given a non-technical explanation of interoperability by describing its sub-principles in simple terms.