Raw Dataset#
BLISS registers datasets with ICAT automatically. If some datasets have not
been confirmed as received by ICAT when switching to a different
investigation, BLISS saves their metadata in a folder called __icat__.
To register those datasets, run:
icat-store-from-file /data/visitor/.../RAW_DATA/__icat__/*.xml
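The same step can be scripted. The following is a minimal sketch, not part of pyicat_plus: the helper names `find_pending_xml`, `build_store_command` and `store_pending` are hypothetical, and it assumes only what the command above shows, namely that `icat-store-from-file` accepts one or more XML file paths. With `dry_run=True` nothing is executed and the command is only returned for inspection.

```python
import glob
import os
import subprocess


def find_pending_xml(raw_data_dir):
    """Return the sorted XML files saved under <raw_data_dir>/__icat__."""
    pattern = os.path.join(raw_data_dir, "__icat__", "*.xml")
    return sorted(glob.glob(pattern))


def build_store_command(xml_files):
    """Build the icat-store-from-file argument list for the given files."""
    return ["icat-store-from-file", *xml_files]


def store_pending(raw_data_dir, dry_run=True):
    """Register pending datasets; with dry_run=True only return the command."""
    xml_files = find_pending_xml(raw_data_dir)
    if not xml_files:
        return None  # no unconfirmed datasets were saved
    command = build_store_command(xml_files)
    if not dry_run:
        # Raises CalledProcessError if the CLI exits with a non-zero status
        subprocess.run(command, check=True)
    return command
```

Calling `store_pending(raw_data_dir)` first with the default `dry_run=True` lets you check which files would be submitted before running the registration for real.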
ICAT Registration#
To register a directory that is neither in ICAT nor in __icat__, run:
icat-store-raw --beamline id00 \
--proposal id002207 \
--path /data/visitor/.../RAW_DATA/collection/collection_dataset1 \
--dataset dataset1 \
--sample mysample \
-p FIELD1=value1 \
-p FIELD2=value2
The equivalent in Python (metadata is optional):
from pyicat_plus.client.main import IcatClient
from pyicat_plus.client import defaults

# Use the default metadata broker URLs from pyicat_plus
# (assumed to be exposed as defaults.METADATA_BROKERS)
client = IcatClient(metadata_urls=defaults.METADATA_BROKERS)

# Optional key-value fields, like the -p options of icat-store-raw
metadata = {"FIELD1": "value1", "FIELD2": "value2"}
client.store_dataset(
    beamline="id00",
    proposal="id002207",
    dataset="dataset1",
    path="/data/visitor/.../RAW_DATA/collection/collection_dataset1",
    metadata=metadata,
)
client.disconnect()
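If registration can fail partway through, it may be worth making sure the client is always disconnected. The sketch below is one possible way to do that. `register_dataset` is a hypothetical helper, not a pyicat_plus function, and it relies only on the `store_dataset` keyword arguments and the `disconnect` method used in the example above.

```python
def register_dataset(client, beamline, proposal, dataset, path, metadata=None):
    """Store one dataset and disconnect the client, even when storing fails."""
    kwargs = dict(beamline=beamline, proposal=proposal, dataset=dataset, path=path)
    if metadata:
        # Metadata is optional, so only pass it when some was provided
        kwargs["metadata"] = metadata
    try:
        client.store_dataset(**kwargs)
    finally:
        client.disconnect()
```

Because the helper disconnects the client, it is meant for one-off registrations. To register several datasets with the same client, call `store_dataset` directly and disconnect once at the end.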
ICAT Synchronization#
Warning
This requires expert knowledge and is normally done by ICAT admins.
The raw data on disk of one or more investigations can be synchronized with ICAT as follows:
icat-sync-raw --beamline id27 --proposal blc14904 --session 20230829 \
--save-dir /tmp/icat/summary --cache-dir /tmp/icat/cache \
--format esrfv3 --register --invalidate-cache
--save-dir: generate CSV files and bash scripts to resolve issues later
--cache-dir: store session information in JSON files
--register: ask for ICAT registration when a session is not properly registered with ICAT, and remove the session from the cache when answering "yes"
--auto-register: same as --register, but without prompting for validation and only for datasets that are safe to register unsupervised
To update the cache periodically, run:
icat-sync-raw --save-dir /tmp/icat/summary --cache-dir /tmp/icat/cache \
--invalidate-cache --no-print
--invalidate-cache: remove sessions from the cache that no longer exist on disk or have changed
--no-print: do not print a summary for each session
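For scheduled runs, the argument list can be assembled in Python before handing it to a job runner. This is a minimal sketch: `sync_raw_command` is a hypothetical helper that uses only the options shown in the two commands above, and it does not check which option combinations `icat-sync-raw` actually accepts.

```python
def sync_raw_command(
    save_dir,
    cache_dir,
    beamline=None,
    proposal=None,
    session=None,
    data_format=None,
    register=False,
    invalidate_cache=False,
    no_print=False,
):
    """Build an icat-sync-raw argument list from the options shown above."""
    command = ["icat-sync-raw"]
    # Optional filters restricting the synchronization scope
    for flag, value in (
        ("--beamline", beamline),
        ("--proposal", proposal),
        ("--session", session),
    ):
        if value is not None:
            command += [flag, str(value)]
    command += ["--save-dir", save_dir, "--cache-dir", cache_dir]
    if data_format is not None:
        command += ["--format", data_format]
    # Boolean switches are appended only when enabled
    for flag, enabled in (
        ("--register", register),
        ("--invalidate-cache", invalidate_cache),
        ("--no-print", no_print),
    ):
        if enabled:
            command.append(flag)
    return command
```

For example, `sync_raw_command("/tmp/icat/summary", "/tmp/icat/cache", invalidate_cache=True, no_print=True)` reproduces the periodic cache update command, and the resulting list can be passed to `subprocess.run(command, check=True)`.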