The unstructured grid services prototype will deliver unstructured grid numerical model output in a standard form through a DAP service that supports subsetting and time averaging of the data.
Unstructured grids are increasingly popular for modeling complex domains (e.g., coastlines), but they complicate processing and storage. We are developing simpler methods informed by research in computer science.
Why Unstructured Grids are Complicated
Why are unstructured grids so difficult to work with? Consider Figure 1. At left, the highlighted cells of the structured grid represent a region of interest to a user. These cells may be addressed by a simple range query over world coordinates (latitude and longitude), which is trivially translated to a range query over computational coordinates in the corresponding representation in memory or on disk. We say that the two coordinate systems are spatially coherent: cells that are near each other in world coordinates also tend to be near each other in computational coordinates. At right, a user-selected region from an unstructured grid consists of four triangular cells. These four cells may appear anywhere in the overall representation. In general, unstructured grid representations are not spatially coherent: knowing where a cell is in world coordinates gives no hint as to where to find it in computational coordinates.
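The contrast can be made concrete with a minimal sketch. It assumes a uniform structured grid described by an origin and cell spacing, and an unstructured grid stored as a node coordinate list plus a list of triangles given as node-index triples. The function names and the storage layout are illustrative assumptions, not part of any UGRID or GridFields interface.

```python
import math

def structured_select(x0, y0, dx, dy, nx, ny, xmin, xmax, ymin, ymax):
    """Translate a world-coordinate box into an inclusive index range.

    Assumes cell (i, j) covers [x0 + i*dx, x0 + (i+1)*dx) in x and the
    analogous interval in y. No search is needed, only arithmetic.
    """
    i_lo = max(0, math.floor((xmin - x0) / dx))
    i_hi = min(nx - 1, math.floor((xmax - x0) / dx))
    j_lo = max(0, math.floor((ymin - y0) / dy))
    j_hi = min(ny - 1, math.floor((ymax - y0) / dy))
    return i_lo, i_hi, j_lo, j_hi

def unstructured_select(nodes, cells, xmin, xmax, ymin, ymax):
    """Return ids of triangles whose centroid lies inside the box.

    Cell order carries no spatial meaning here, so every cell must be
    inspected unless an index has been built beforehand.
    """
    selected = []
    for cell_id, (a, b, c) in enumerate(cells):
        cx = (nodes[a][0] + nodes[b][0] + nodes[c][0]) / 3.0
        cy = (nodes[a][1] + nodes[b][1] + nodes[c][1]) / 3.0
        if xmin <= cx <= xmax and ymin <= cy <= ymax:
            selected.append(cell_id)
    return selected

# Hypothetical example data for illustration only.
example_nodes = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11)]
example_cells = [(4, 5, 6), (0, 1, 2), (1, 3, 2)]
```

In the structured case the cost is independent of grid size, while the unstructured scan grows with the number of cells, and the selected cell ids need not be contiguous.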
This simple fact has profound consequences for interoperability and performance. The algorithms developed to operate on unstructured grids use a variety of tricks and conventions; we say that they are tightly coupled to representation details such as cell order and implicit conventions for expressing neighborhood relationships. For example, a particle tracking algorithm must access a local neighborhood of velocity values to determine where a moving particle will go next, so it must gather the velocities near the particle's current position. If these velocities are nearby in the representation, then lookup is easy and cache performance is good. However, if the velocities could appear anywhere in the representation, then the algorithm must search for them, build some kind of index beforehand, or do some form of guess-and-check to compute where the particle goes next. In practice, we find all of these solutions and many more, none of which is compatible with the others' representations. None of these solutions has anything to do with the underlying science of the problem. Rather, they are consequences of physical data dependence, an artificial coupling between algorithm and representation. The UGRID software, together with the underlying formalisms of the GridFields model, achieves interoperability by separating what needs to be done from how to do it.
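As an illustration of the "build some kind of index beforehand" option, the following sketch bins node positions into uniform square buckets so that a particle's neighborhood can be gathered without scanning every node. This is one common technique chosen for illustration, not the method used by UGRID or GridFields; the function names, the bin size, and the sample data are all assumptions.

```python
import math
from collections import defaultdict

def build_bin_index(nodes, bin_size):
    """Map each (i, j) bucket to the ids of the nodes that fall inside it."""
    index = defaultdict(list)
    for node_id, (x, y) in enumerate(nodes):
        key = (math.floor(x / bin_size), math.floor(y / bin_size))
        index[key].append(node_id)
    return index

def nearby_nodes(index, nodes, bin_size, px, py, radius):
    """Return sorted ids of nodes within `radius` of the point (px, py).

    Only the buckets overlapping the query circle's bounding box are
    visited, so the cost depends on local node density, not grid size.
    """
    lo_i = math.floor((px - radius) / bin_size)
    hi_i = math.floor((px + radius) / bin_size)
    lo_j = math.floor((py - radius) / bin_size)
    hi_j = math.floor((py + radius) / bin_size)
    found = []
    for i in range(lo_i, hi_i + 1):
        for j in range(lo_j, hi_j + 1):
            for node_id in index.get((i, j), ()):
                x, y = nodes[node_id]
                if (x - px) ** 2 + (y - py) ** 2 <= radius ** 2:
                    found.append(node_id)
    return sorted(found)

# Hypothetical node positions; velocities would be stored per node id.
sample_nodes = [(0.1, 0.1), (5.0, 5.0), (0.4, 0.2), (9.9, 0.3), (0.9, 1.1)]
sample_index = build_bin_index(sample_nodes, 1.0)
```

An index like this is exactly the kind of representation detail that a particle tracker tends to bake in, which is why two codes that each solve the lookup problem this way still cannot share data without conversion.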
Figure 1: (left) Structured grid representations are spatially coherent, allowing straightforward implementation of basic manipulation tasks such as selecting a region of interest. (right) In contrast, unstructured grid representations are not spatially coherent, which complicates processing. This simple distinction has profound consequences for interoperability and performance.
UW, OPeNDAP, and UCSD are building a pilot demonstration of interoperability between GridFields and Hyrax: a subsetting operation over an unstructured grid data source that returns a UGRID-compliant dataset. All code and supporting documentation will be deposited on the OOI Confluence website.
The demo will service an OPeNDAP request over unstructured grids. The request will perform simple subsetting. The returned stream will be compliant with the UGRID model being developed by David Stuebe, Rich Signell, and Bill Howe. The request URL may include a custom dispatch handler call of the form
where expr is a conditional expression involving the attributes of the source grid. For example, a bounding box expression "x between 29000 and 31000 and y between 28000 and 31000" can be encoded as follows:
The Hyrax server will translate this expression into an equivalent GridFields query, evaluate it, format the result, and return the result using the UGRID format.
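To make the subsetting step concrete, here is a minimal sketch of what evaluating the bounding box expression over a triangular grid could look like. It is plain Python and does not use the GridFields API or the UGRID encoding. It assumes that "between" is inclusive at both ends, that a cell is kept only when all of its nodes satisfy the condition, and that kept nodes are renumbered. The function names and the sample coordinates are hypothetical.

```python
def in_box(px, py):
    """The example condition: x between 29000 and 31000 and
    y between 28000 and 31000 (bounds assumed inclusive)."""
    return 29000 <= px <= 31000 and 28000 <= py <= 31000

def subset_grid(x, y, cells, predicate):
    """Restrict a grid to the nodes satisfying `predicate`.

    Returns the kept x and y coordinates and the surviving cells,
    with node ids renumbered to index into the kept coordinates.
    """
    keep = [i for i in range(len(x)) if predicate(x[i], y[i])]
    new_id = {old: new for new, old in enumerate(keep)}
    sub_cells = [
        tuple(new_id[n] for n in cell)
        for cell in cells
        if all(n in new_id for n in cell)  # drop cells that lost a node
    ]
    return [x[i] for i in keep], [y[i] for i in keep], sub_cells

# Hypothetical source grid: five nodes, three triangles.
grid_x = [29500, 30500, 30000, 32000, 29000]
grid_y = [28500, 28500, 30000, 30000, 27000]
grid_cells = [(0, 1, 2), (1, 3, 2), (0, 4, 1)]

sub_x, sub_y, sub_cells = subset_grid(grid_x, grid_y, grid_cells, in_box)
```

Other cell-retention rules are possible, such as keeping any cell with at least one node inside the box; which rule the server applies is a design decision not settled by the text above.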
1) GridFields is a library for manipulating unstructured grid datasets that are difficult to handle with conventional tools such as NetCDF (see appendix). In order to be useful within the OOI framework, we need to show that the GridFields library can be integrated with the Hyrax Server using the Custom Dispatch Handler.
2) Hyrax Data Server is a framework to support multiple transport-level communication and interprocess communication protocols. The OLFS component is responsible for implementation of DAP over HTTP. This component receives DAP requests over HTTP, parses each request, and makes a series of requests to the BES component that result in a serialized DAP object. The OLFS then packages that serialized DAP object as specified by the DAP over HTTP specification and returns the result. Important features of this architecture are that the OLFS and BES need not run on the same computer, can employ secure (SSL) communications, and need not be written in the same programming language (in fact, in Hyrax 4.x the OLFS is written in Java while the BES is written in C++). Finally, the protocol used to communicate between the OLFS and the BES has been designed so that one OLFS can 'talk' to several BES instances.
For the OOI, this architecture will be reorganized to further separate the transport method from the protocol that is transported. This abstraction will allow the CI to use a common messaging system to communicate with and control all services. For the prototype we will demonstrate the system using wrappers while presenting a design for an integrated solution.
For more details, please see the Hyrax Wiki.