|This page describes visionary science and education scenarios for the OOI Integrated Observatory Network. For Release-specific, feature-oriented use cases, please refer to the Release Product Descriptions and Acceptance Scenarios.|
- Science and Education Application Scenarios
- Overarching Scenario: Scientific Investigation
- Use Scenario 1: Large-scale ocean observatory with access to external data sources
- Use Scenario 2: Using numerical models to coordinate multi-resource observations
- Use Scenario 3: Interactive Control of a Remote Laboratory
- Use Scenario 4: Autonomous Control of Mobile Instrument Platforms
- Science Workflow Use Cases
- Configure OOI system to accept data products
- System performs publication when software runs
- Publishing data as a stream
- Data Stream Archival
- Publishing as publicly available data
- Publishing notifications
- Release-Specific Use Cases
This section provides a representative set of user application scenarios, based on direct user input received at the requirements and design workshops, and on the system requirements. Some of the scenarios are grounded in the Concept of Operations documents (see References CI-COP1, CI-COP2, CI-COP3).
In particular, the following scenarios are brought forward to illustrate the application design specified in this document:
- Numerical Modeling [CI-RWS2] Scenario 1 (4.5): "Test the shelf productivity hypothesis" (Numerical model analysis)
- Ocean Observing Programs [CI-ROOP] Scenario 2 (4.4.2): Objective driven observations with gliders
- Data Product Generation [CI-RDPG] Scenario 2 (4.6): Instrument Lifecycle
- Data Product Generation [CI-RDPG] Scenario 4 (4.9): Virtual Observatory
- Integrated Observatory Management [CI-RIOM] Scenario 1 (4.4.1): "A day in the life of a Test Pier Operator"
- Education and Public Engagement [CI-REPE] Scenario 1 (4.2.1): "What is the role of the ocean in the CO2 problem?"
The following four scenarios are examples identified as mandatory requirements for the CI. However, they are not meant to be exhaustive or all-inclusive.
The central use case scenario supports the activities of scientific investigation by an environmental scientist or researcher. The OOI Integrated Observatory, specifically its CI component, provides the capabilities and user interfaces to perform this core application. In the most general case, the OOI Integrated Observatory is a federation of organizations.
As a prerequisite to use the CI capabilities, any CI science user needs to have an electronic identity established from their home organization, such as a university account and login. The general public is not required to have any credential, but has restricted access to dedicated educational spaces. Using existing trust relationships between the OOI and user organizations or a specific user registration process, the OOI can verify the user's identity when the user accesses a central OOI user interface (there can be many interfaces tailored to specific audiences). With valid credentials, the user can then access a project-specific workspace to interact with observatory resources from data products to instrumentation. This workspace, previously defined by a project administrator, represents a virtual observatory and provides the users with tools to access and manipulate the OOI resources of interest for a specific project setting.
The following scenarios derive from this basic understanding of the OOI CI mode of operation. There are several assumptions as prerequisites to any of these scenarios:
- Observational data is available in OOI archives with appropriate detail for the region and variables of interest.
- Large-scale model outputs are available as data products on OOI (either archived or automatically recomputable) for region and time frame of interest.
- Ocean model algorithms are available for the environmental processes of interest, taking (historic) observational data as well as larger-scale resolution model output as input. Such algorithms are configurable and tunable according to scientists' needs. Data processing and transformation tools are available through the OOI to the scientist.
- Analysis and presentation tools are available to analyze model output and data series.
All scenarios imply the following basic steps, with variants being discussed to illustrate CI capabilities and highlight different architectural constraints detailed in this document:
- Within a project workspace on OOI, research and import all data, model output, model algorithms and configurations;
- See Virtual Observatory Scenario (RDPG, Scenario 4);
- Make the necessary processing, transformation and configuration steps;
- Run the model and compare with expected observational data;
- Analyze and present results with respect to the hypothesis.
Figure 1. Large-scale ocean observatory scenario
Figure 1 shows a large coastal observatory comprised of long and short range coastal radar (CODAR) nodes and a mix of buoys and glider tracks covering most of offshore southern California. This constitutes a regional framework for coastal science processes and events composed of semi-autonomous resource nexuses (e.g., discrete buoys). At the node level, data gathering and resource allocation (e.g., power or bandwidth) are comparatively simple and can be implemented in local hardware or autonomous software. However, coordinating large numbers of nodes into a coherent scientific whole that is larger than the sum of the individual parts is a significant challenge.
Regarding just data access, we assume that the data provider is not part of the OOI. Thus, we enhance the basic scenario with the variation that observational data is not available within OOI archives, but it is accessible through the Internet (external database, published on a website, FTP, etc.). In this case, this external data source needs integration with the CI infrastructure.
The following are additional steps required for the integration:
- Configure external data source;
- Define or develop data transformation and processing required;
- Use OOI internal and external data within OOI workspace as input for an existing numerical model on the CI.
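The integration steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ExternalSource`, `fahrenheit_to_celsius`, `merge_inputs`, the record fields, and the placeholder URL are all hypothetical and do not describe an actual OOI interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class ExternalSource:
    """Hypothetical description of a data source outside the OOI."""
    name: str
    url: str  # placeholder location, not a real endpoint
    transform: Callable[[Record], Record]  # maps external records to the internal form


def fahrenheit_to_celsius(rec: Record) -> Record:
    """Example transformation: convert an external record to the units used internally."""
    return {"time": rec["time"], "temp_c": (rec["temp_f"] - 32.0) * 5.0 / 9.0}


def merge_inputs(internal: List[Record], source: ExternalSource,
                 raw_external: List[Record]) -> List[Record]:
    """Transform external records and merge them with internal ones, ordered by time."""
    external = [source.transform(r) for r in raw_external]
    return sorted(internal + external, key=lambda r: r["time"])


# Step 1: configure the external source; step 2: attach the transformation.
source = ExternalSource("external-buoy", "ftp://example.invalid/data", fahrenheit_to_celsius)
# Step 3: combine internal and external data as input for a model run.
model_input = merge_inputs([{"time": 2.0, "temp_c": 10.5}], source,
                           [{"time": 1.0, "temp_f": 50.0}])
```

Keeping the transformation as part of the source description means the model only ever sees records in one form, regardless of where they originated.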
We can also extrapolate this scenario to other usage patterns. For example, linking the functionality of CODARs up and down the coast without human intervention is a major science requirement. Management of diverse types of data and their associated metadata is another. CI is needed to provide consistent and automatic control of these and other aspects of the overall observatory. Hence, in a very real way, the concept of a regional framework is important at the operational as well as the scientific level. One of the major operations and maintenance challenges for a distributed ocean observatory is tracking and coordinating the state of observatory resources. Thus, through CI the science use case is also the operations use case.
Traditional data assimilation models operate in open loop form, incorporating retrospective or real-time data into the model run without altering the measurement protocols. Dynamic data-driven application systems (DDDAS; Darema, 2005) close the loop by allowing modification of sampling by the assimilation model. The assimilation model may change sample rates for selected instruments in response to an event. It could also steer instruments on a mobile platform (such as a ship) to locations where property gradients are largest in the simulation. Complexity builds up when we incorporate the addition or removal of fixed or mobile instruments from the domain of interest in response to model output.
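As a minimal sketch of what "closing the loop" could look like, the following picks the grid cell where a simulated property field has its largest gradient (a candidate target for a mobile platform) and raises the sampling rate when an event is detected. The function names, the central-difference gradient, and the rate values are assumptions made for illustration; they are not taken from any OOI or DDDAS specification.

```python
from typing import List, Tuple


def steepest_gradient_cell(field: List[List[float]]) -> Tuple[int, int]:
    """Return (row, col) of the interior cell with the largest
    central-difference gradient magnitude in a simulated 2-D field."""
    best_mag, best_cell = -1.0, (1, 1)
    for i in range(1, len(field) - 1):
        for j in range(1, len(field[0]) - 1):
            gx = (field[i][j + 1] - field[i][j - 1]) / 2.0
            gy = (field[i + 1][j] - field[i - 1][j]) / 2.0
            mag = (gx * gx + gy * gy) ** 0.5
            if mag > best_mag:
                best_mag, best_cell = mag, (i, j)
    return best_cell


def next_sample_rate(event_detected: bool, base_hz: float = 0.1,
                     boosted_hz: float = 1.0) -> float:
    """Hypothetical policy: sample faster while the model reports an event."""
    return boosted_hz if event_detected else base_hz


# A simulated field with a sharp front between the third and fourth columns.
simulated = [
    [0.0, 0.0, 1.0, 10.0, 10.0],
    [0.0, 0.0, 1.0, 10.0, 10.0],
    [0.0, 0.0, 1.0, 10.0, 10.0],
]
target_cell = steepest_gradient_cell(simulated)
```

In an open loop system the model would only consume data; here the model output (`target_cell` and the chosen rate) feeds back into how the next measurements are taken.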
Figure 2. Observatory comprised of ships, aircraft and autonomous vehicles linked to assimilation modeling capabilities on shore
Based on Figure 2, in this scenario, we assume that the ocean model algorithm is not available on the OOI; hence, we need an extension of an existing model or a new model. The following steps are necessary:
- Develop and add new/enhanced model to the CI
- Run new model with existing observational data and nested model output for initial and boundary conditions.
Accomplishing a DDDAS scenario with fixed instruments pushes the complexity further, by requiring a wide range of resource allocation, instrument control, and instrument communication services to coordinate the functionality of the assimilation model, the instrument suite, and the ocean observatory infrastructure. If some of the instruments are mobile or the sensor mix changes with time, then additional services for discovery and localization or tracking may be needed. Crosscutting requirements for time synchronization and security services also exist. Hence, a CI with such capabilities is of paramount importance to support this scenario.
Figure 3. Site on regional cable observatory containing power-intensive interactive instruments
Consider a more elaborate use case, which encompasses many heavily instrumented sites distributed around a regional cabled observatory (e.g., ten or more multidisciplinary moorings extending through the water column). This adds complexity through shared use of instruments and resources by multiple users and the difficulty of remote coordination of resources over large distances.
Figure 3 depicts a single science site where a diverse suite of sensors and actuators is deployed over a small area (for example, on the scale of a hydrothermal vent field) to accomplish multidisciplinary science. The sensor suite may include physical, chemical, and biological types, and the science mission may require frequent changes in their location or mix. Heavy use of stereo HDTV and high resolution acoustic imaging is anticipated, with concomitant demands on bandwidth and power resources.
Acquisition and storage of physical samples for later retrieval and onshore analysis may be needed. Accurate repeat positioning of actuators for sampling may also be required, imposing closed loop control constraints on the hardware and software infrastructure. This use case involves stringent demands on the shared use of instruments and other resources by many users. Quality of service, latency, and jitter requirements implied by real-time stereo HDTV and closed loop control of sampling actuators are strict.
We consider the following variation of the basic scenario: the required sensors and observational infrastructure exist within the OOI, but they need to be reconfigured or interactively controlled to provide the desired resolution and frequency. This variation translates into a need for reconfiguration, tasking and interactive control of existing instrumentation. Thus, we have to perform the following steps:
- Simulation of ocean observing and modeling steps
- Develop a plan for reconfiguring or tasking instruments
- Await instrument reconfiguration approval
- Interactively control instruments during availability window
- Use additional collected observational data to run model and do analysis
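The approval and availability-window steps above can be pictured as a small state machine. The sketch below is hypothetical: the state names, the `ReconfigurationPlan` class, and the numeric time base are assumptions for illustration, not an OOI interface.

```python
from dataclasses import dataclass
from enum import Enum, auto


class PlanState(Enum):
    DRAFT = auto()      # plan developed but not yet submitted
    SUBMITTED = auto()  # awaiting reconfiguration approval
    APPROVED = auto()   # approved; control allowed inside the window


@dataclass
class ReconfigurationPlan:
    instrument_id: str
    window_start: float  # start of the availability window (hypothetical time base)
    window_end: float    # end of the availability window
    state: PlanState = PlanState.DRAFT

    def submit(self) -> None:
        if self.state is not PlanState.DRAFT:
            raise ValueError("only a draft plan can be submitted")
        self.state = PlanState.SUBMITTED

    def approve(self) -> None:
        if self.state is not PlanState.SUBMITTED:
            raise ValueError("only a submitted plan can be approved")
        self.state = PlanState.APPROVED

    def may_control(self, now: float) -> bool:
        """Interactive control requires approval and an open availability window."""
        return (self.state is PlanState.APPROVED
                and self.window_start <= now < self.window_end)


plan = ReconfigurationPlan("instrument-1", window_start=100.0, window_end=200.0)
allowed_before_approval = plan.may_control(150.0)
plan.submit()
plan.approve()
allowed_in_window = plan.may_control(150.0)
allowed_after_window = plan.may_control(250.0)
```

Separating approval from the time window reflects the two distinct conditions in the step list: a plan can be approved well before its window opens.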
From the CI perspective, a diverse set of services for resource allocation, time synchronization, instrument monitoring and control, bi-directional instrument communication, cross-calibration, coordination of sensing regimes (e.g., optical or acoustic), localization, tracking, and security is required. Closed loop control may not be feasible in the presence of high seafloor-to-shore latency without CI assistance, such as that used in remote surgery applications.
Looking a decade into the future, the sensor suite at ocean observatory sites of interest may consist of a mix of large numbers of low capability, low cost fixed sensors (e.g., for the measurement of temperature over an area) and small numbers of high capability, high cost sensors (e.g., in situ spectrometers) in mobile platforms.
Figure 4. A coordinated set of autonomous underwater vehicles
This combination simultaneously accomplishes a continuous areal-scale overview with high resolution and directed, local-scale resolution measurements in an economical fashion. The enabling technology that makes this approach feasible is a network of high bandwidth optical modems that provide a wireless extension of the observatory infrastructure, both making it possible to accommodate large numbers of sensors without physically attaching them to the observatory and allowing real-time access to fixed sensors and mobile platforms.
The mobile platforms (illustrated in Figure 4) may operate continuously to accomplish pre-programmed sampling missions or under human control for exploratory sampling. Arrays of sensors that fuse into coherent sensor networks are a rapidly evolving application in terrestrial monitoring. This can be accomplished by either linking all sensors to an optical modem network or through pervasive, direct peer-to-peer interconnection. Since the characteristics of the terrestrial wireless and seafloor optical environments are similar, it is reasonable to expect both methods to be widely utilized on the seafloor in the future.
This use case aggregates all of the requirements of the previous three scenarios, involving both resource-intensive applications and an ever-changing mix of mobile sensors that are complex in their own right, and whose operation must be coordinated in real-time. Additional services to provide for discovery of topology and location-aware routing in a time-varying network may be necessary. Sensor networks may also require group management and collaborative information processing applications. A cross-cutting requirement is one of simplicity; for example, low cost sensors with wireless links may not have the capability to process complex time services.
Consider the case that a new instrument or sensor is deployed, with the requirement that its observational data should be accessible throughout the integrated observatory.
- Deploy instrument on OOI infrastructure
- Sub-scenario: Develop instrument drivers and data processing
- Sub-scenario: Test new instrument before/after on OOI
- Develop data processing steps
- Run models and analyses
Consider the following extension of Use Case 4:
- The environmental process (e.g., carbon flux) should be presented to an educational audience, for instance in the form of an interactive museum display
- An education application needs to be developed and operationally hardened on OOI hardware and software
- Develop educational application
- Use analysis and visualization widgets
- Idealize and simplify model output
- Provide interactive access to historic observational time series
- Install operational procedures for automatic data computation in regular intervals on OOI
|These use cases have been adopted from the longer OOI Science User Concept of Operations, initially developed as part of the OOI Cyberinfrastructure Conceptual Design.|
Dr. Adrian Chu of the University of Nebraska Oceanography Center has been working on an analysis tool for some time. The tool integrates several sources of oceanographic data into a small model, and produces a prediction when certain conditions are met. Data from multiple OOI observatories are blended together in the model. Dr. Chu will work with several international researchers from Canada (Dr. Nicole Jones), France (Mlle. Jeanne Fleuris) and Russia (Dr. Dmitri Istantov). He has previously used the OOI cyberinfrastructure to set up his collaborative work group and construct a virtual workspace for them to use. The group has then interactively modified and updated Dr. Chu's model and added new features to it. The model has been tested by subscribing to data streams from the OOI observatories. This resulted in further model changes, and Dr. Chu and his team are now ready for an operational run. As he starts running the software, although he doesn't fully appreciate it, the OOI infrastructure is performing a lot of steps to make sure the products show up where they are expected.
Just as Dr. Chu received pointers to subscribe to specific resources, his colleagues Dr. Jones and Dr. Istantov received resource descriptors when configuring the code to publish its observational events and summary data. To obtain this publication resource, the collaborators had to enter information into a publication metadata form that describes the source and nature of their publications. These metadata descriptions help users learn more about data products, help administrators troubleshoot any problems, and allow the CI to create a processing history for each of the data products. They are also critical to supporting search functions for the products created by OOI. Because the forms use dropdown menus with controlled vocabularies to fill out most of the fields, and auto-population of subfields based on user selections, all of the members of the team fill out the metadata form consistently and relatively quickly.
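To illustrate why controlled vocabularies and auto-populated subfields lead to consistent metadata, here is a small sketch. The field names, vocabulary terms, and auto-population rule are invented for the example; the actual OOI metadata form is not described in this document.

```python
from typing import Dict, Set, Tuple

# Hypothetical controlled vocabularies; the real forms define their own terms.
VOCABULARY: Dict[str, Set[str]] = {
    "product_type": {"observational event", "data stream", "data set"},
    "source_kind": {"software process", "instrument"},
}

# Hypothetical auto-population rule: a selection fills in a dependent subfield.
AUTO_POPULATE: Dict[Tuple[str, str], Dict[str, str]] = {
    ("source_kind", "software process"): {"deployment_depth": "not applicable"},
}


def fill_form(selections: Dict[str, str]) -> Dict[str, str]:
    """Validate each selection against its vocabulary and add derived subfields."""
    form: Dict[str, str] = {}
    for field_name, allowed in VOCABULARY.items():
        value = selections.get(field_name)
        if value not in allowed:
            raise ValueError(
                f"{field_name!r} must be one of {sorted(allowed)}, got {value!r}")
        form[field_name] = value
        form.update(AUTO_POPULATE.get((field_name, value), {}))
    return form


form = fill_form({"product_type": "observational event",
                  "source_kind": "software process"})
```

Because free-text values are rejected, two team members describing the same kind of product end up with identical field values, which is what makes later search and troubleshooting practical.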
When the software runs, it uses the publication resources to announce to OOI that it is the source of this particular observational event, data stream, or data set. OOI can then connect the people or systems who have sought out and requested these observational events or data.
As modified by Dr. Jones, the Predictive Ocean Integration Model (POIM) publishes a prediction whenever it detects an observational event. Although she marked this output as an 'observational event', it also has the characteristics of a data stream: it arrives repeatedly over time (not necessarily at a consistent interval), the same type of information is in every record, and it is associated with a single data source, in this case a software process. The additional identification of this record as an observational event serves several purposes: it lets people find the item by searching within a list of publishable observational events, it helps describe the nature of the item (specifically, that arrival of the publication constitutes a message of significance), and it enables general-purpose event-oriented tools (event counters and summarizers, news bulletin generators) to be developed by OOI or other organizations.
Now that the software is executing, observational events will be published on an occasional basis. Each publication is logged by the OOI infrastructure, so that it can be reviewed later in the context of other activities. As described earlier, each publication can be obtained by OOI members in one of several forms: as a subscription, as an email or other notification, upon request ("show me the last observational event of this type"), or in archived form. People who have not registered with OOI can see data products (e.g., the archived logs of observational events), but not the more complicated services.
Just as the observational events are published (and accessed) as a resource, so too can the data summaries from the model. In fact, this same publication technique can publish any OOI data stream, including those generated by OOI instruments. The key characteristics necessary to publish data as a stream are that the data be described in advance, that the data creator (the software or instrument which generates them) use the OOI APIs to submit the data to OOI, and that the resource identifier for the data stream be associated with every data record that is output as part of that data stream. If developers writing software that generates data want to take full advantage of OOI's capabilities to integrate, display, and process data - and most developers on OOI are either strongly urged, or required, to do so - they must describe their data in a consistent format, and output it in a way that the format can describe. If a data source like a GPS (or modeling software) actually generates multiple types of data records (for example, one data record, one summary record, and one error record), then the developer must create a separate description for each record type, get a separate resource identifier for each record type, and publish each record type along with the appropriate resource identification. While this seems like a lot of work up front, it usually is fairly straightforward and saves a lot of time in postprocessing the data streams.
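The three requirements named above (describe the data in advance, submit through a publication interface, tag every record with its stream's resource identifier) can be sketched as follows. `StreamRegistry` is a hypothetical stand-in written for this example; it is not the OOI API, and the identifier format and field names are assumptions.

```python
import itertools
from typing import Any, Dict, List


class StreamRegistry:
    """Hypothetical stand-in for a publication service (not the real OOI API)."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self.descriptions: Dict[str, Dict[str, Any]] = {}
        self.published: List[Dict[str, Any]] = []

    def register(self, description: Dict[str, Any]) -> str:
        """Describe a record type in advance and receive its resource identifier."""
        resource_id = f"stream-{next(self._counter)}"
        self.descriptions[resource_id] = description
        return resource_id

    def publish(self, resource_id: str, record: Dict[str, Any]) -> None:
        """Accept a record only if it matches the description registered for its stream."""
        if resource_id not in self.descriptions:
            raise KeyError(f"unknown stream {resource_id!r}")
        expected = set(self.descriptions[resource_id]["fields"])
        if set(record) != expected:
            raise ValueError(f"record fields {sorted(record)} do not match {sorted(expected)}")
        # Every published record carries the identifier of its stream.
        self.published.append({"resource_id": resource_id, **record})


registry = StreamRegistry()
# A source that emits several record types gets one description and one identifier per type.
data_id = registry.register({"fields": ["time", "lat", "lon"]})
error_id = registry.register({"fields": ["time", "message"]})
registry.publish(data_id, {"time": 0.0, "lat": 32.7, "lon": -117.2})
registry.publish(error_id, {"time": 1.0, "message": "no fix"})
```

Since each record type has its own identifier and description, a consumer can subscribe to the data records without having to filter out summaries or errors during postprocessing.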
In this case, Dr. Chu's colleagues have used these features well, and Mlle. Fleuris in particular quickly understood the process of describing her outputs from the model. She created a metadata description for the model summaries she produced, defining the meaning of each item in the summary and the data source used to present it. Unfortunately, Dr. Chu's model output, which is the data source for her summary, is itself unpublished, since he is keeping it private for now. She plans to suggest to Dr. Chu that the model itself be published as an OOI resource, so that users can trace the sources for these summaries and predictions back through the entire chain of operations in the OOI workflow system. For now, she has referenced the unpublished data by description, as well as pointing back to the observational data streams that Dr. Chu's model uses.
Mlle. Fleuris set up the publication of the model to occur once every hundred times the model runs, as well as every time the model generates an observational event prediction. This allows the team to review the operation of the system over time and contrast its operation in predictive and non-predictive cases. Since the model runs hundreds or even thousands of times a day, this technique should limit publication to only a few outputs each day. This output volume is not very large, and the OOI infrastructure will respond accordingly by archiving the outputs for an extended period. The holder of any reference to an OOI data stream can ask to view the data's historical records, as Dr. Chu did for the other data he wanted to review. If the reference holder has permission to view the data, they can be obtained from the OOI operational data archive. At this point, the events and model summaries can be viewed on-line or from the archive by the collaborators on the team. When the verification period (a period set by OOI policy, during which only proprietary access is allowed, so that the data can be evaluated and tested) expires, the data will be available to the public. At first Dr. Chu found this idea to be disturbing, but he has gotten used to it since he wants to use the full capabilities of the OOI.
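The publication rule and the verification-period access rule described above can each be written as a one-line predicate. The function names, parameters, and numeric time values below are illustrative assumptions, not part of any OOI interface.

```python
def should_publish(run_number: int, event_predicted: bool, every: int = 100) -> bool:
    """Publish on every `every`-th model run, and on any run that predicts an event."""
    return event_predicted or run_number % every == 0


def may_view(is_team_member: bool, now: float, verification_ends: float) -> bool:
    """During the verification period only the team has access; afterwards anyone does."""
    return is_team_member or now >= verification_ends


# With 1000 runs in a day and no predicted events, only every hundredth run is published.
published_runs = [n for n in range(1, 1001) if should_publish(n, event_predicted=False)]
```

For a day with 1000 runs and no events this yields ten publications, which shows how the rule keeps the archived volume small while still sampling the non-predictive behavior of the model.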
In fact, Dr. Chu expects he will make these data products - the events and the summaries, at least - publicly accessible well before the verification period expires. This takes a minor effort on his part, and he knows a lot of colleagues will want to take advantage of the resulting predictions for their own studies. As an enlightened act of self-promotion, he intends to make the results available with a request to acknowledge him on any papers that ensue. While he knows he may only be acknowledged on half of the papers that use the work, his name will still become widely known as the originator of the information. From his previous experience in publishing data from an instrument, Dr. Chu knows there are several steps required to make his results publicly available, including certification and verification. First, he must certify that the data source meets the standards described in the OOI service agreements. For software, this is little more than has already been specified in the metadata, along with running the software on an OOI test bed system. Obviously, standards for instruments to be deployed at ocean depths are somewhat more demanding. The observatory on which the data source is deployed will confirm that the interface specifications have been met. This is done automatically for software, and with some manual confirmation for hardware interfaces. A further step required before releasing software is the verification step. This consists of evaluating the results from the data source to confirm that it is operating as expected. As Dr. Chu has already accomplished this step to his own satisfaction, reviewing the OOI products from his system should be simple. He is prepared to go to the trouble of releasing his data to a wider audience promptly, and of establishing its verified status on OOI.
For core instruments on OOI observatories, more detailed criteria must be met, including verification that the metadata describing the data source are correct and QA and QC procedures are in place.
There are several advantages to publishing the data - in this case, events, notification of an event, and model summaries - to a wider audience. First, it makes the data immediately available to the public through the data products that OOI produces. It also makes the services easier to reference and use within OOI - while this could also be achieved by changing the access permissions on his data sets, making the data public automatically changes those access permissions. Further, it makes it clear that Dr. Chu has reviewed the data sources and believes they are functional. Finally, making his data public advertises his products to a wider audience, since the OOI data product registries will only replicate complete metadata descriptions for a data product if that product is in fact publicly accessible. Once the OOI data product registries announce their availability, Dr. Chu's results will become visible in four other data publication registries (three of which are internationally well known), and he will get extra credit and attention for his work.
Dr. Chu has been following some interesting developments related to the publication of metadata in external registries. Some scientists have been quoting as part of their "publication rate" the number of entries they have in data product registries, and some search tools have begun indexing the registries as a way to provide more contextual information about data sources, data owners, and data systems. As a result, the "free registration" that OOI provides will likely have benefits for Dr. Chu's work.
Dr. Istantov wants to email a notification to each member of the team whenever the software detects an event, and he has used a mechanism almost identical to the one the others used. Some of the metadata for his "data stream" are different, but much of the cyberinfrastructure used for publishing the notifications is the same as for events and other data streams. In fact, although he didn't realize it, Dr. Istantov's metadata form was made easier to fill out because Dr. Jones and Mlle. Fleuris had filled out almost identical ones earlier, which were used to pre-populate some of the fields on Dr. Istantov's form. While he was testing his code, he sent the notifications to himself, but after completing his code changes, he updated the distribution list. Because the notification message is published via email, Dr. Istantov can select the destinations from a number of email address lists, including a list of aliases, of actual users, and of virtual laboratories of which he is a member. He configures the email destination for this published message to be Dr. Chu's newly created virtual laboratory, and awaits further word.
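Switching the notification from a test destination to a virtual laboratory amounts to changing which address list the published message resolves to. The sketch below assumes a simple mapping from list names to addresses; the list name, the addresses, and `resolve_recipients` are hypothetical.

```python
from typing import Dict, List


def resolve_recipients(destinations: List[str],
                       address_lists: Dict[str, List[str]]) -> List[str]:
    """Expand named lists (aliases, virtual laboratories) into addresses,
    keeping plain addresses as they are and dropping duplicates in order."""
    recipients: List[str] = []
    for destination in destinations:
        for address in address_lists.get(destination, [destination]):
            if address not in recipients:
                recipients.append(address)
    return recipients


# Hypothetical address list for a virtual laboratory.
address_lists = {
    "chu-virtual-lab": ["chu@example.org", "jones@example.org", "fleuris@example.org"],
}

# While testing, notifications go to the developer only.
testing = resolve_recipients(["istantov@example.org"], address_lists)
# After the code changes are complete, the destination is the virtual laboratory.
operational = resolve_recipients(["chu-virtual-lab"], address_lists)
```

Resolving the list at send time, rather than copying addresses into the notification configuration, means later changes to laboratory membership take effect without editing the publication.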
Each Release of the OOI Integrated Observatory Network has a different focus area and select focus user group, as described in Transition to Operations.
The Release specific use cases are covered in the Product Description for each Release: