NOMAD Laboratory

Pillar B: Experimental materials science

Spokesperson: Christof Wöll (Karlsruhe Institute of Technology)
Deputy: Christoph Koch (Humboldt-Universität Berlin)

Future information-based materials development will rely on increasingly complex, multidimensional and multiscale data, obtained with a variety of characterization methods, which must be made available in a knowledge-oriented manner. Only in this way will it be possible to create models of materials science systems that span scales and processes. Today, material data are described by differently structured data sets, for which only insufficient and equally heterogeneous, individual metadata sets are provided. This prevents the generic use of the data sets for future material developments and severely restricts the use of data analysis techniques such as data mining or machine learning. Only data platforms that overcome these shortcomings will make it possible to create predictive material models that describe material behavior under different operating conditions. System simulations or digital tools based on these models can then help reduce production costs and conserve resources.

Materials are characterized with a multitude of different experimental probes at many sites worldwide. Today, these provide different, uncorrelated data sets, which are based on different metadata sets owing to the diversity of users and their research objectives, and which therefore cannot currently be related to each other. As a result, the data are of interest to the individual user and can be evaluated with respect to an individual research question, but further use of the data is severely limited. With FAIRDI, we aim to generate generally accessible data that are archived according to strict quality rules.

To use the data reliably, comprehensive information about their origin, the history of the material, and the influencing variables during their creation must be linked to the data as metadata in uniform formats or schemata. Major efficiency gains could be achieved if material data from different domains were available together with cross-domain, ontology-based structuring and metadata schemes that enable sustainable access to and use of the data.
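As a purely illustrative sketch of what "metadata in a uniform schema" could look like, the following Python snippet attaches origin, sample history and measurement conditions to one data set and serializes them to JSON. All class names, field names and values are hypothetical and chosen for this example only; they do not represent an actual FAIRDI or NOMAD schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ProcessingStep:
    """One step in the history of the sample (hypothetical fields)."""
    description: str
    parameters: dict = field(default_factory=dict)


@dataclass
class MeasurementMetadata:
    """Metadata linked to one data set; all field names are illustrative."""
    dataset_id: str
    method: str          # characterization method, e.g. "XRD"
    sample_id: str
    origin: str          # laboratory or instrument that produced the data
    sample_history: list = field(default_factory=list)
    conditions: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # asdict() also converts the nested ProcessingStep objects to dicts.
        return json.dumps(asdict(self), sort_keys=True)


record = MeasurementMetadata(
    dataset_id="ds-001",
    method="XRD",
    sample_id="sample-42",
    origin="example-lab",
    sample_history=[ProcessingStep("annealing", {"temperature_K": 673})],
    conditions={"temperature_K": 295},
)

serialized = record.to_json()
restored = json.loads(serialized)
```

Because every record carries the same fields, records produced by different groups can be compared and searched without knowing how each group organizes its raw files.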

Material data are used in product design and testing, so it is particularly important that they are generated reliably and verifiably. Curation and quality assurance are therefore of the greatest practical importance in materials science for the analysis, evaluation and optimisation of materials, manufacturing processes and components. They are a prerequisite for the creation of material models and are necessary to considerably accelerate developments in materials science and materials engineering through virtual material development. For this purpose, a holistic approach with an open, connectable basis and domain-specific forms must be pursued. Data generation must be transparent, comprehensible and repeatable so that quality assurance, traceability and reusability of the data become possible.

In this effort, we aim to generate data sets that are stored in a congruent form, together with meaningful metadata, on a data platform. The following points will be considered:

  • Data retrieval and congruent data storage
  • Creation and storage of metadata
  • Analysis of the value of the data
  • Data provision in a repository, incl. efficient searchability according to certain search criteria
  • Linking data sets from different characterization methods
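To make the last two points more concrete, here is a minimal in-memory sketch of a repository that supports searching by metadata criteria and linking data sets that were measured on the same sample with different characterization methods. The `Repository` class, its methods and the use of a shared `sample_id` as the linking key are assumptions made for this illustration, not a description of an existing platform.

```python
from collections import defaultdict


class Repository:
    """Toy repository: stores data with metadata, searches, and links by sample."""

    def __init__(self):
        self._records = {}
        self._by_sample = defaultdict(list)

    def add(self, dataset_id, metadata, data):
        if dataset_id in self._records:
            raise ValueError(f"duplicate dataset id: {dataset_id}")
        self._records[dataset_id] = {"metadata": dict(metadata), "data": data}
        self._by_sample[metadata["sample_id"]].append(dataset_id)

    def search(self, **criteria):
        """Return ids of all data sets whose metadata match every criterion."""
        return sorted(
            ds_id
            for ds_id, rec in self._records.items()
            if all(rec["metadata"].get(k) == v for k, v in criteria.items())
        )

    def linked(self, dataset_id):
        """Return ids of other data sets recorded on the same sample."""
        sample = self._records[dataset_id]["metadata"]["sample_id"]
        return sorted(d for d in self._by_sample[sample] if d != dataset_id)


repo = Repository()
repo.add("ds-001", {"sample_id": "s-42", "method": "XRD"}, [1.0, 2.0])
repo.add("ds-002", {"sample_id": "s-42", "method": "TEM"}, [3.0])
repo.add("ds-003", {"sample_id": "s-7", "method": "XRD"}, [4.0])

xrd_sets = repo.search(method="XRD")
same_sample = repo.linked("ds-001")
```

Linking only works here because both measurements use the same sample identifier, which is exactly the kind of shared metadata convention the list above calls for.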

On the one hand, this requires the definition of suitable metadata schemata that are as technology-independent as possible, together with a strategy for obtaining metadata. Today's predominantly manual input requires tools and aids that simplify input and make it error-tolerant. Possibilities for automated metadata generation with regard to real-time analyses must be investigated. On the other hand, existing technologies and software tools must be adapted, or new ones developed, to ensure knowledge-based, congruent storage of the data and the associated metadata, possibly reduced to the core information, so that later use can also be guaranteed. This presupposes consistent performance (bandwidth, latency) of the data platform from the generation of the data through to their storage, which the development must guarantee. The necessary analysis of the value of the data with regard to correlative characterization requires suitable algorithms that include not only the data but also the metadata in the evaluation.
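One way to picture error-tolerant manual input is a small normalization step that cleans up what a user typed, maps free-text method names to canonical ones, and reports problems instead of rejecting the entry outright. The sketch below assumes a hypothetical set of required fields and a hypothetical alias table; a real tool would derive both from the agreed metadata schema.

```python
# Hypothetical required fields and method aliases, for illustration only.
REQUIRED_FIELDS = ("sample_id", "method", "operator")
METHOD_ALIASES = {
    "xrd": "XRD",
    "x-ray diffraction": "XRD",
    "tem": "TEM",
    "transmission electron microscopy": "TEM",
}


def normalize_entry(raw):
    """Return (cleaned_entry, problems) for one manually typed metadata entry."""
    cleaned = {}
    problems = []

    # Tolerate stray whitespace and inconsistent capitalization in keys.
    for key, value in raw.items():
        if isinstance(value, str):
            value = value.strip()
        cleaned[key.strip().lower()] = value

    # Report missing or empty required fields rather than raising.
    for name in REQUIRED_FIELDS:
        if not cleaned.get(name):
            problems.append(f"missing field: {name}")

    # Map free-text method names to a canonical spelling.
    method = cleaned.get("method")
    if method:
        canonical = METHOD_ALIASES.get(str(method).lower())
        if canonical is None:
            problems.append(f"unknown method: {method}")
        else:
            cleaned["method"] = canonical

    return cleaned, problems


entry, issues = normalize_entry(
    {" Method ": " x-ray diffraction ", "Sample_ID": "s-42"}
)
```

In this example the method is normalized to `XRD`, while the missing operator is returned as a problem that an input form could show to the user for correction.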