1 Answer

FAIR is the acronym for Findable, Accessible, Interoperable, Reusable [https://doi.org/10.1038/sdata.2016.18]. Its overall goals are transparency, reproducibility, and reusability of scientific digital objects.

Data is required to train Machine Learning (ML) and Artificial Intelligence (AI) models, and the more data you have, the better the ML/AI performance typically is (*). To reuse data for training, you first have to find suitable datasets, access them, and integrate them with other datasets (interoperability). Several FAIR principles contribute to this, among others: unique identifiers for reference (principle F1), clear lineage for traceability (principle R1.2), metadata for context and interpretation (principle R1), and a standardized way of access (principle A1).

(*) This is a very general statement: your data foundation does not only need to be 'big', it also needs to be representative and unbiased.
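
To make the principles named above a bit more concrete, here is a minimal Python sketch that checks whether a dataset's metadata record carries the fields those principles ask for. The class `DatasetRecord`, the field names, the `CHECKS` mapping, and the DOI and URL are all hypothetical and only serve as an illustration; real FAIR assessments are far more detailed than simple presence checks.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical, heavily simplified metadata record for one dataset."""
    identifier: str = ""   # F1: globally unique, persistent identifier (e.g. a DOI)
    access_url: str = ""   # A1: retrievable via a standardized protocol
    description: str = ""  # R1: metadata for context and interpretation
    provenance: str = ""   # R1.2: lineage / provenance information


# Assumed mapping from principle to a very rough presence check.
CHECKS = {
    "F1": lambda r: bool(r.identifier.strip()),
    "A1": lambda r: r.access_url.startswith(("http://", "https://")),
    "R1": lambda r: bool(r.description.strip()),
    "R1.2": lambda r: bool(r.provenance.strip()),
}


def missing_principles(record: DatasetRecord) -> list[str]:
    """Return the principles whose rough check fails for this record."""
    return [name for name, check in CHECKS.items() if not check(record)]


# Placeholder values, not a real dataset.
record = DatasetRecord(
    identifier="doi:10.1234/example",
    access_url="https://example.org/data.csv",
    description="Hourly air temperature measurements in kelvin",
    provenance="",  # lineage was not documented
)
print(missing_principles(record))
```

In this sketch the record would be flagged for R1.2 only, because its provenance field is empty. Before assembling training data, such a check could help you spot datasets that will be hard to trace or integrate later.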

I'm referring here to data in the narrow sense, in formats like tables, images, and texts. Nevertheless, if you count AI as software algorithms, FAIR for research software also helps you provide and discover such algorithms.

Please keep in mind that FAIR itself does not contribute to data quality (or, better said, information quality). One can even provide data whose content is completely wrong in a highly FAIR way.
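
The difference between FAIRness and quality can be illustrated with a small Python sketch. All names and values below are made up for illustration: the record has complete metadata, yet one of its measurements is physically impossible. The metadata check and the plausibility check are deliberately independent of each other.

```python
# Hypothetical record: complete metadata, but partly wrong content.
record = {
    "identifier": "doi:10.1234/example",  # placeholder, not a real DOI
    "access_url": "https://example.org/temperatures.csv",
    "description": "Hourly air temperature in kelvin",
    "provenance": "Exported from a hypothetical sensor S1",
    "values_kelvin": [293.2, -15.0, 291.8],  # a negative kelvin value is impossible
}


def has_fair_metadata(rec: dict) -> bool:
    """Rough check: are the assumed FAIR-related metadata fields filled in?"""
    required = ("identifier", "access_url", "description", "provenance")
    return all(rec.get(key) for key in required)


def implausible_values(values_kelvin: list[float]) -> list[float]:
    """Content check: temperatures below 0 K cannot be correct."""
    return [v for v in values_kelvin if v < 0]


metadata_ok = has_fair_metadata(record)
bad_values = implausible_values(record["values_kelvin"])
print(metadata_ok, bad_values)
```

The metadata check passes while the content check still finds a wrong value, so data intended for ML training needs a separate quality assessment in addition to being FAIR.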

NFDI4Ing is supported by DFG under project number 442146713