Scientific data collected with modern sensors or dedicated detectors very often exceed the scope of the initial scientific design. These data are increasingly obtained at the cost of large material and human efforts.

A large class of scientific experiments is in fact unique because of their large scale, with very little chance of being repeated or superseded by new experiments in the same domain: for instance, high-energy physics and astrophysics experiments involve multi-year and even multi-decade developments and are unlikely to be repeated. Other scientific experiments are unique by nature, for example in earth science and medical sciences, since the collected data are “time-stamped” and therefore cannot be reproduced by new experiments or observations.

The new knowledge obtained using these data (“data observatories”) should be preserved in the long term so that access and re-use remain possible and enhance the value of the initial investment. It is therefore of utmost importance to pursue a coherent and vigorous approach to the long-term preservation of scientific data.

Preservation nevertheless remains a challenge due to the complexity of the data structure, the fragility of custom-made software environments, and the lack of rigorous approaches in workflows and algorithms.