The Integrated Observatory acquires and ingests data from a physical instrument.
|Actors||Marine Asset Operator|
|References||OOI/ION canonical data format|
|Is Used By|
|Is Extended By|
|In Acceptance Scenarios||AS.R2.01A Operate Marine Observatory, AS.R2.02A Cruise Support, AS.R2.02B Data Support via Cruise, AS.R2.02C Instrument Life Cycle Support|
|Primary Service||Data Acquisition Services|
|UC Status||Mapped + Ready|
This information summarizes the Use Case functionality.
The Integrated Observatory acquires data from the physical instrument, converts the data into the OOI canonical data format, associates appropriate metadata with them, and ingests (persists) the result. Acquisition may be performed repeatedly on a fixed schedule (the interval may be measured and driven by either the instrument or the Integrated Observatory system), or once per request when a user asks for each desired sample. Data may be buffered temporarily on remote platforms before arrival at the Integrated Observatory. In addition to persisting the data converted to the canonical form, unprocessed (raw) data from OOI instruments are always persisted.
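The flow summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CanonicalRecord`, `Store`, `acquire_and_ingest`, and `parse_ctd` are hypothetical names, the comma-separated text format is invented for the example, and the real OOI canonical data format is not modeled here. The one behavior taken directly from the text is that raw data are persisted before, and regardless of, conversion.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass
class CanonicalRecord:
    """Hypothetical stand-in for a record in the canonical data format."""
    values: Dict[str, float]
    metadata: Dict[str, str]
    received_at: datetime


@dataclass
class Store:
    """Hypothetical in-memory persistence for raw and canonical data."""
    raw: List[bytes] = field(default_factory=list)
    canonical: List[CanonicalRecord] = field(default_factory=list)


def acquire_and_ingest(
    raw: bytes,
    parse: Callable[[bytes], Dict[str, float]],
    metadata: Dict[str, str],
    store: Store,
) -> CanonicalRecord:
    """Persist the raw bytes, then convert, annotate, and persist the result."""
    # Raw data are stored first, so they survive even if parsing fails below.
    store.raw.append(raw)
    record = CanonicalRecord(
        values=parse(raw),
        metadata=dict(metadata),
        received_at=datetime.now(timezone.utc),
    )
    store.canonical.append(record)
    return record


def parse_ctd(raw: bytes) -> Dict[str, float]:
    """Assumed toy format: 'temperature,conductivity' as ASCII text."""
    temperature, conductivity = raw.decode("ascii").strip().split(",")
    return {"temperature": float(temperature), "conductivity": float(conductivity)}
```

Passing the parser in as a function keeps the sketch independent of any particular instrument; a registered instrument's format description would presumably select the parser in a real system.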
- Depends on predefined instrument registrations (including metadata describing the instrument and its data formats) and instrument activation.
- Instrument may be capable of streamed or polled data collection, or both. A user request for a single observation will not be allowed to interfere with an existing configuration for streamed observations.
- Whether or not the instrument provides a way to acquire repeated observations, the system will support that capability for each instrument.
- Whether or not there is uninterrupted communication with the instrument, the system supports repeated or single observations. If ION is driving the sampling, out of choice or necessity, the driving software in ION (presumably the instrument driver) must communicate with the instrument at a frequency high enough to take the samples with the desired timing precision.
- In the case of instruments that are intermittently disconnected from ION, data must be buffered. The explicit mechanism for buffering is not addressed here; however, it is assumed that the buffering is managed by the system software collocated with the instrument (whether ION or another entity), and does not require specific steps by the instrument driver/agent to retrieve the data. (That is, buffered data will arrive at the ION system in the same way as data that are not buffered.)
- This use case addresses those situations where new data are explicitly desired from the instrument. If the user wants prior data from an instrument, this is simply a request to view data, not acquire them.
- This use case does not address the production of Level 1 data from Level 0 data.
- To the extent that data from different data sets are packed into a structured collection before acquisition by ION, this scenario and good practice assume that the packaging will be decomposed by a suitable dataset agent (which can translate the packing structure into appropriate metadata).
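The assumption that ION can supply repeated observations itself, with adequate timing precision, can be illustrated with a small polling loop. This is a minimal sketch, not the ION driver design: `poll_repeatedly`, `read_sample`, and the injectable `clock` and `sleep` parameters are hypothetical names introduced here. Each deadline is computed from the start time rather than from the previous sample, which is one common way to keep a slow read from accumulating drift.

```python
import time
from typing import Callable, List, Tuple


def poll_repeatedly(
    read_sample: Callable[[], bytes],
    interval_s: float,
    count: int,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[float, bytes]]:
    """Request `count` samples, one every `interval_s` seconds.

    Returns (clock reading, sample) pairs. With the default monotonic clock
    the readings are relative times, not wall-clock timestamps.
    """
    if interval_s <= 0 or count < 0:
        raise ValueError("interval_s must be positive and count non-negative")
    start = clock()
    samples: List[Tuple[float, bytes]] = []
    for i in range(count):
        # Deadline i is fixed relative to the start, not to the last read.
        delay = (start + i * interval_s) - clock()
        if delay > 0:
            sleep(delay)
        samples.append((clock(), read_sample()))
    return samples
```

Injecting the clock and sleep functions is a design choice for the sketch only; it lets the timing logic be exercised without waiting in real time.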
The instrument resource is activated and registered, but has not been acquired for use.
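The precondition above, together with the exclusive-access requirement in the scenario, suggests a simple state check before any configuration takes place. The sketch below assumes a hypothetical `InstrumentResource` record with illustrative field names; it does not reflect the actual ION resource model.

```python
from dataclasses import dataclass
from typing import Optional


class AcquisitionError(Exception):
    """Raised when an instrument cannot be acquired or released."""


@dataclass
class InstrumentResource:
    """Hypothetical resource record; field names are illustrative only."""
    name: str
    registered: bool = False
    activated: bool = False
    owner: Optional[str] = None  # None means not yet acquired for use

    def acquire(self, requester: str) -> None:
        """Grant exclusive use to `requester` if the preconditions hold."""
        if not (self.registered and self.activated):
            raise AcquisitionError(f"{self.name} is not registered and activated")
        if self.owner is not None:
            raise AcquisitionError(f"{self.name} is already held by {self.owner}")
        self.owner = requester

    def release(self, requester: str) -> None:
        """Give up exclusive use; only the current holder may release."""
        if self.owner != requester:
            raise AcquisitionError(f"{requester} does not hold {self.name}")
        self.owner = None
```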
- The Marine Asset Operator requests that the Integrated Observatory collect the desired data (optionally repeatedly) from the instrument, output in the desired format.
- For some instruments (e.g., GPS), repeated observations are the default or only configuration; others (e.g., SeaBird CTD) require more elaborate configuration and setup to define sampling options and periodicity; and others cannot support repeated sampling at all.
- If the instrument reports its data to a collection server (e.g., on shore) before the Integrated Observatory sees the data, the server's role is addressed here, rather than the instrument's (i.e., communications happen with the server, instead of the instrument). The next two steps here can be bypassed in that situation.
- If an instrument's data is collected on intermediate media, such as USB storage (either in the instrument or its parent platform), the Integrated Observatory will provide software that acts as a collection service; see fourth step below.
- For instruments that can be set to repeatedly sample, the Marine Asset Operator may be given the choice between instrument-driven (push) sampling (the instrument keeps the clock and takes the samples autonomously) and ION-driven (polled) sampling (ION keeps the clock and requests a new sample at each interval).
- If the instrument does not support single sample acquisition (it does not have an option to be polled, or it is in a mode that precludes that), an error is returned to the request initiator.
- Some instruments provide multiple observing capabilities and/or data formatting capabilities; these are appropriately selected by the instrument operator based on input from the OOI science users.
- The Integrated Observatory system configures the instrument as needed to support the requested type of sampling.
- For some instruments, this configuration must be made in advance of enabling data collection.
- This step ensures exclusive access to prevent multiple processes from attempting to control the instrument simultaneously.
- The Integrated Observatory system configures the instrument (if so requested and supported), or its own operations, to perform the requested amount of sampling: once only, or repeatedly.
- If necessary, this includes enabling data taking in the instrument, at which point the instrument begins making and reporting observations. This is a state change of the instrument that should be reflected in ION state information that is visible to the user.
- If necessary, the ION software sets up its own execution of instrument polling, to provide repeated observations that are driven by ION.
- Acquiring data may take on the order of minutes, or even hours, on some devices or for some observation requests. Timeouts should be set appropriately for each device/observation.
- If the instrument's data is available through intermediate media, the Data Operator interacts as needed with the Integrated Observatory to set up the transfer.
- This step supports scenarios where data is offloaded from instruments or platforms via physical media or shipboard processes, and then must be uploaded into the Integrated Observatory.
- In the simplest case, the data just appears in a location already monitored by the Integrated Observatory, and the data is fully self-describing. No Data Operator action is needed.
- In a simple variant of that, the Data Operator only needs to copy appropriately self-describing data to the appropriate location; the Integrated Observatory detects it at that location.
- For data that is not sufficiently self-describing, the Data Operator must specify to the Integrated Observatory the characteristics of the data that can be found at the particular location. The Integrated Observatory may query for information it needs to understand the data being acquired.
- The Integrated Observatory receives (or obtains, if pulled from a server) data from the instrument.
- The driver ensures persistence of raw data (e.g., writes them to platform disk storage or collects them for subsequent publishing as messages).
- The driver is prepared to separate and timestamp any data record output by the instrument.
- Initial parsing is limited to determining when instrument records begin and end. An instrument record is an atomic entity for ION: it has associated timestamps and metadata. Even if there is no pause between instrument records, ION needs to be able to separate them during the ingest process. (Therefore, ION must know what the record separator is for the instrument, or be able to derive record boundaries in some other way.)
- Either for the one record, or repeatedly for each record received, ION receives, timestamps, and parses data from the instrument.
- For data records received, the Integrated Observatory system parses the content of the record and produces a message containing the record in ION canonical data form.
- The driver may collect multiple data records in one message.
- The ION system produces a parsed or structured form of the data in a format specific to the record type, which is automatically pushed into the appropriate ION message queue. The exact form of the resulting parsed/structured data is a matter of system design and the type of each value within the record: individual values may be unaltered from the original though indexed within that structure, or may be copied as is into another structure, or may be transformed into a different representation.
- This may be driven via messaging with the raw data message arrival, or within the driver/agent (TBD by the architecture).
- Each data message is published to the ION exchange with appropriate metadata.
- As appropriate, the records are associated with the appropriate data set to enable a contiguous set of data records.
- This typically involves forward and back-linking the record and its preceding neighbor (by record receipt time in ION, and, if different due to buffering, by data generation time).
- For some data types, it may also involve linking to other types of 'neighbors', for example related observations in a gridded array or swath.
- If the original request was from a process requesting a single sample, the resulting sample is returned to the requestor in the desired format: either the instrument byte format or, via the ION common data model, another representation.
- The original requestor may have been a user, but this use case ends with the data being provided to the corresponding system-internal process.
- The model for implementing this may be identical to receiving streamed data: subscribe to the resulting data set.
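The initial parsing step described above, which only finds where instrument records begin and end and timestamps them, can be sketched as follows. The example assumes a fixed byte separator (`\r\n` by default); that separator, along with the names `split_records`, `timestamp_records`, and `InstrumentRecord`, is hypothetical and would in practice come from the instrument's registered format description. Instruments whose record boundaries must be derived another way are not covered.

```python
from datetime import datetime
from typing import List, NamedTuple, Tuple


class InstrumentRecord(NamedTuple):
    """One atomic instrument record with its receipt time."""
    payload: bytes
    received_at: datetime


def split_records(
    buffer: bytes, chunk: bytes, separator: bytes = b"\r\n"
) -> Tuple[List[bytes], bytes]:
    """Append `chunk` to `buffer`; return (complete records, leftover bytes).

    The leftover holds a partial record (or partial separator) and should be
    passed back in as `buffer` with the next chunk.
    """
    if not separator:
        raise ValueError("separator must not be empty")
    *complete, remainder = (buffer + chunk).split(separator)
    return complete, remainder


def timestamp_records(payloads: List[bytes], now: datetime) -> List[InstrumentRecord]:
    """Attach the given receipt time to each complete record."""
    return [InstrumentRecord(payload, now) for payload in payloads]
```

Carrying the leftover bytes between calls is what lets records be separated even when there is no pause between them, or when a record straddles two reads.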
Data have been acquired from the instrument, converted into the Integrated Observatory's canonical data model, and made available as a predefined L0 data product where possible.
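The forward and back-linking of records described in the scenario can be illustrated with a small helper that computes each record's neighbors under a chosen ordering. This is a sketch under assumptions: `RecordInfo`, `neighbor_links`, and the use of plain floating-point seconds for times are hypothetical. The links are computed separately for receipt time and for generation time, because buffering can make the two orders differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class RecordInfo:
    """Hypothetical summary of one ingested record."""
    record_id: str
    received_at: float   # receipt time in ION, in seconds
    generated_at: float  # time the data were generated, in seconds


def neighbor_links(
    records: List[RecordInfo], key: Callable[[RecordInfo], float]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each record id to (previous id, next id) in the order given by `key`."""
    ids = [r.record_id for r in sorted(records, key=key)]
    links: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for i, record_id in enumerate(ids):
        prev_id = ids[i - 1] if i > 0 else None
        next_id = ids[i + 1] if i + 1 < len(ids) else None
        links[record_id] = (prev_id, next_id)
    return links
```

For example, `neighbor_links(records, key=lambda r: r.received_at)` gives the receipt-time chain, and the same call with `r.generated_at` gives the generation-time chain.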