See the [Release 3 Milestone Summary|R3 Deliverables and Milestones] page.

{metadata-list}
|| WBS | 1.2.3.21.40.01 ||
|| ID | M112 ||
|| Description | Capability extensions to index and query datasets by geospatial metadata beyond geospatial points supported in R2 ||
|| Deliverable ID | D053 ||
|| Deliverable | Data Access Service ||
|| Development Owner | M. Manning ||
|| Development Team | Data Processing ||
|| Developers | ||
|| Release | R3 ||
|| Status | Activated ||
|| Start Date | 1/6/2014 ||
|| End Date | TBD ||
|| Comments | ||
{metadata-list}

h2. Milestone Scoping and Requirements

h3. Requirements and Capabilities Tracing

{html-include:url=http://architecture.oceanobservatories.org/ooinet/milestonereq/M112.html}

h3. Use Cases

h5. Find resources using OGC Reference Model Geospatial Queries

!Screen Shot 2014-01-08 at 16.05.29 .png|border=1!
* Overlaps
** A user may select "overlaps" as the query operator. The result set is all resources whose geospatial region intersects the query bounding box.
* Within
** A user may select "within" as the query operator. The result set is all resources whose entire geospatial region is wholly contained within the query bounding box.
* Contains
** A user may select "contains" as the query operator. The result set is all resources whose geospatial region wholly contains the query bounding box.
* Disjoint
** A user may select "disjoint" as the query operator. The result set is all resources whose geospatial region does not intersect the query bounding box.
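
The four operators can be illustrated on axis-aligned bounding boxes. This is a minimal sketch, not the discovery service implementation: the {{BBox}} type and the function names are hypothetical, boxes that only touch along an edge are assumed to count as intersecting, and boxes crossing the antimeridian are not handled.

```python
from collections import namedtuple

# Hypothetical bounding box: west/east are longitudes, south/north are latitudes.
BBox = namedtuple("BBox", ["west", "south", "east", "north"])


def overlaps(resource, query):
    """True if the resource region intersects the query box (shared edges count)."""
    return (resource.west <= query.east and query.west <= resource.east and
            resource.south <= query.north and query.south <= resource.north)


def within(resource, query):
    """True if the resource region lies wholly inside the query box."""
    return (query.west <= resource.west and resource.east <= query.east and
            query.south <= resource.south and resource.north <= query.north)


def contains(resource, query):
    """True if the query box lies wholly inside the resource region."""
    return within(query, resource)


def disjoint(resource, query):
    """True if the resource region and the query box do not intersect."""
    return not overlaps(resource, query)


query_box = BBox(-130.0, 40.0, -120.0, 50.0)
small_site = BBox(-125.5, 44.5, -125.3, 44.7)   # inside the query box
far_site = BBox(-118.0, 32.0, -117.0, 33.0)     # south-east of the query box
```

With these sample boxes, {{small_site}} satisfies both "overlaps" and "within", while {{far_site}} satisfies only "disjoint".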

h5. Find data within bounding area that meets the additional condition of a variable (e.g., temperature) within a given range
{note}
This use case is under consideration.
{note}
* Resources can be identified using geospatial search with data conditions.
** A user can specify a range for a data variable and the results will include data products that intersect the range.
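
Since this use case is still under consideration, the following is only a sketch of what "intersect the range" could mean: two closed intervals intersect when each one starts before the other ends. The catalog contents and names are hypothetical.

```python
def ranges_intersect(product_min, product_max, query_min, query_max):
    """True if the closed interval [product_min, product_max] overlaps the query range."""
    return product_min <= query_max and query_min <= product_max


# Hypothetical catalog: data product name -> (min, max) observed temperature in deg C.
temperature_ranges = {
    "CTD parsed": (4.0, 12.5),
    "Glider L1": (14.0, 21.0),
}

# Data products whose temperature range intersects the query range 10.0 to 13.0.
matches = sorted(name for name, (lo, hi) in temperature_ranges.items()
                 if ranges_intersect(lo, hi, 10.0, 13.0))
```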

h5. Advanced: Find out how many entities meet the search criteria.

* Provide the UI with a count of the result-set from a query in near real-time.


h3. Related Search issues in Jira

* [OOIION-1136 Inconsistent Search results using Advanced Search and Simple Text Search options |https://jira.oceanobservatories.org/tasks/browse/OOIION-1136]
Any inconsistencies should be eliminated now that Postgres handles all queries. The logic for a ‘simple search’ differed from that used in the ‘advanced search’, but both are now passed to Postgres, which should give consistent results (including limits and ordering).

* [OOIION-1137 Advanced Search results do not allow scrolling through list |https://jira.oceanobservatories.org/tasks/browse/OOIION-1137]
UI related, not service related.
* [OOIION-1407 Maximum list size is 100 for Advanced Search results|https://jira.oceanobservatories.org/tasks/browse/OOIION-1407]
* [OOIION-1409 Advanced Search results page is not constructed properly |https://jira.oceanobservatories.org/tasks/browse/OOIION-1409]
UI related, not service related.
* [OOIION-1439 Advanced Search needs more documentation, prompts, and tool tips |https://jira.oceanobservatories.org/tasks/browse/OOIION-1439]
* [OOIION-1611 Advanced Search Temporal Search is not tailored by resource, and therefore gives unexpected results to the end user |https://jira.oceanobservatories.org/tasks/browse/OOIION-1611]
This isn’t really a bug. The users all searched for data with timestamps before the ion-alpha system was preloaded, so they shouldn’t have found anything with the dates in the comments. However, the question of “what field is searched using the temporal bounds inputs on UI” was still unclear. The additional Postgres field geom_temp now has time bounds, which can be searched with Overlaps, Within, etc., in the same way as the geospatial operators above, to find matching sets.



h4. Steps

h2. Milestone Tasks

|| Task || Description ||
| Identify Geospatial search requirements | Analyze existing requirements and propose revisions for milestone work. Communicate with Marine IOs and other stakeholders to determine specific detailed features and analyze available documentation as needed. Work with the System Engineer on a requirements revision proposal. |
| Design Geospatial search behavior model | Develop detailed designs of this milestone's capabilities as needed for subsequent implementation and integration with the production system and other components. Identify core interfaces and dependencies to the system and to other components. Describe core interfaces provided. Get review from system architect and make design artifacts available in the CI architecture documentation. |
| Define enhanced geospatial indexes in database | Implement database indexes as designed and scoped using the integrated database technology. Develop tests to demonstrate the correct operation of the indexes. |
| Enhance discovery to use enhanced indexes | Implement the capability as designed and scoped. Develop unit and integration tests to demonstrate the correct operation of the code. |
| Enhance resource attributes for geospatial resources | Implement the capability as designed and scoped. Develop unit and integration tests to demonstrate the correct operation of the code. |
| Enhance business logic for geospatial resources | Implement the capability as designed and scoped. Develop unit and integration tests to demonstrate the correct operation of the code. |
| Integrate and test with production environment | Take all developed software capabilities of this milestone and integrate them with the remainder of the system. Demonstrate the correct function of the additions through successful automatic tests running against a fully launched system and by interactive demonstration on the test/alpha system. |
| Add Spatial Operator (view) | (ion-ux) Add button group to Advanced Search "GEOSPATIAL BOUNDS" form: ('spatial_operator' options - overlap/intersects,within,contains,disjoint) \[UI task\] \\ |
| Add Spatial Operator (controller) | (ion-ux) Create spatial_operator key in service API to pass to discovery service |
| Add Spatial Operator (service) \\ | (coi-services) Add spatial_operator parameter to discovery service \[_qmatcher_geo_loc\] |
| User Defined Limits (view) | (ion-ux) Add form dropdown for number of desired results to return from search (eg. 100,200,500) \[UI task\] \\ |
| User Defined Limits (controller) \\ | (ion-ux) Handle 'limit' field in service API (limit currently set in code not user option) \\ |
| Return number of total results (view) | (ion-ux) Display total number of search results available in DB. eg. showing 0-10 of 100 (14,567 available) \[UI task\] \\ |
| Return number of total results (service) \\ | (coi-services) Return total results available in DB from discovery service (beyond specified limit) \\ |
| Search Offset (view) | (ion-ux) Add a "next _n_" button/link below the search result navigation to get the next set of results past the limit, e.g. showing 91-100 of 100 (click to retrieve next 101-200) \[UI task\] |
| Search Offset (controller) | (ion-ux) Create 'offset' key with value in service API to pass to discovery service |
| Search Offset (service) | (coi-services) Process an offset parameter in discovery service to pass to Postgres OFFSET value (allows search to skip _n_ records) \\ |
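
The limit, offset and total-count tasks fit together as ordinary pagination. The sketch below works on an in-memory list purely for illustration; in the real service the slicing would be done by Postgres {{LIMIT}}/{{OFFSET}} and the total would come from the discovery service. All function names are hypothetical.

```python
def paginate(results, limit=100, offset=0):
    """Return one page of results plus the total count, mimicking LIMIT/OFFSET."""
    total = len(results)
    page = results[offset:offset + limit]
    return page, total


def page_label(offset, page_size, total):
    """Build a 1-based label such as 'showing 101-200 of 250'."""
    if page_size == 0:
        return "showing 0 of %d" % total
    return "showing %d-%d of %d" % (offset + 1, offset + page_size, total)


# Hypothetical result set of 250 resource ids.
all_ids = ["res_%03d" % i for i in range(250)]
page, total = paginate(all_ids, limit=100, offset=100)
label = page_label(100, len(page), total)
```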


h2. Milestone Design

h5. Identify Geospatial search requirements

Analyze existing requirements and propose revisions for milestone work. Communicate with Marine IOs and other stakeholders to determine specific detailed features and analyze available documentation as needed. Work with the System Engineer on a requirements revision proposal.


h5. Design Geospatial search behavior model

Develop detailed designs of this milestone's capabilities as needed for subsequent implementation and integration with the production system and other components. Identify core interfaces and dependencies to the system and to other components. Describe core interfaces provided. Get review from system architect and make design artifacts available in the CI architecture documentation.

!https://docs.google.com/drawings/d/1rgejW4nYb7H98fYqM3v9ew-XD7LHh2sUO7_FIaCHVSY/pub?w=960&h=720!

h5. Define geospatial capabilities in database

Implement database indexes as designed and scoped using the integrated database technology. Develop tests to demonstrate the correct operation of the indexes.

Consider [OpenGEO Indexing Tutorial|http://workshops.opengeo.org/postgis-intro/indexing.html]. Tables in PostGIS that need geospatial support also need spatial indexes, so how and where those indexes are created must be designed and implemented. This logic is probably best placed wherever the {{CREATE TABLE}} logic is implemented: a simple scan of the resource fields to identify any geometry fields should suffice, followed by adding an index for each one to the database.
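
A sketch of the "scan the fields, then add an index" idea follows. It only generates SQL text using the standard PostGIS {{CREATE INDEX ... USING GIST}} form; the table name, the schema dictionary and the function name are hypothetical, and the identifiers are assumed to come from trusted code rather than user input.

```python
def gist_index_statements(table, columns):
    """Return CREATE INDEX statements for every geometry-typed column.

    `columns` maps column name -> declared type name (hypothetical schema description).
    Identifiers are interpolated directly, so they must come from trusted code.
    """
    statements = []
    for name, type_name in sorted(columns.items()):
        if type_name.lower().startswith("geometry"):
            statements.append(
                "CREATE INDEX %s_%s_idx ON %s USING GIST (%s);"
                % (table, name, table, name))
    return statements


# Hypothetical table layout with two geometry columns.
resource_columns = {
    "id": "varchar(300)",
    "name": "varchar(300)",
    "geom": "geometry(Point, 4326)",
    "geom_loc": "geometry(Polygon, 4326)",
}
ddl = gist_index_statements("resources", resource_columns)
```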


h6. The new resource registry postgres implementation supports and fills 4 geometry/temporal columns:

* geom: the geospatial center point
* geom_loc: the area bounding box for the resource
* vertical_range: the vertical range for the resource - postgres numrange type
* temporal_range: the temporal range for the resource \- postgres numrange type

So besides the point queries, we now also support intersect, overlap and containment queries against a resource's bounding box.

The geom column is filled from the geospatial_point_center attribute; the geom_loc and vertical_range columns are filled from the constraint_list and the north/south/east/west and depth min/max coordinate values.
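
To make the column values concrete, here is a small sketch of how a center point, a bounding box and a depth range could be rendered as WKT and as a range literal. It is not the pyon implementation: the function names and sample values are hypothetical, and the actual attribute layout of {{geospatial_point_center}} and {{constraint_list}} is not assumed here.

```python
def point_wkt(lon, lat):
    """WKT for the center point; WKT order is x (longitude) then y (latitude)."""
    return "POINT(%s %s)" % (lon, lat)


def bbox_wkt(west, south, east, north):
    """WKT polygon for a bounding box; the ring is closed by repeating the first corner."""
    corners = [(west, south), (west, north), (east, north), (east, south), (west, south)]
    return "POLYGON((%s))" % ", ".join("%s %s" % c for c in corners)


def depth_range(depth_min, depth_max):
    """Text form of a closed PostgreSQL numrange literal."""
    return "[%s,%s]" % (depth_min, depth_max)


# Hypothetical attribute values for one resource.
center = point_wkt(-125.399828, 44.600045)
area = bbox_wkt(-126.0, 44.0, -125.0, 45.0)
vertical = depth_range(0.0, 200.0)
```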

See more details here: [https://confluence.oceanobservatories.org/display/CIDev/Postgres+Datastore]

h5. Enhance discovery to use geospatial information in data store

Implement the capability as designed and scoped. Develop unit and integration tests to demonstrate the correct operation of the code.

Example queries are documented here. Please feel free to add to this list:
[https://confluence.oceanobservatories.org/display/CIDev/Postgres+SQL+Snippets]

PostgreSQL uses an index automatically when one is available; without an index, it falls back to a brute-force (exhaustive) search.

h6. Geospatial queries go through the discovery service via this code:
[https://github.com/ooici/coi-services/blob/master/ion/services/dm/presentation/discovery_service.py#L1128-L1130]
[https://github.com/ooici/pyon/blob/master/pyon/datastore/datastore_query.py#L137-L150]
[https://github.com/ooici/pyon/blob/master/pyon/datastore/postgresql/pg_query.py#L64-L75]


h5. Enhance business logic for geospatial resources

Implement the capability as designed and scoped. Develop unit and integration tests to demonstrate the correct operation of the code.

We will define a resource type hierarchy that supports geometries that are intended to be geospatially indexed.

* Geometry
* Point
* Circle
* Square
* Polygon

A Resource that intends to have a field or subset of fields that are geospatially indexed will include a field that is of a geometric type:

h6. See here for the code that fills the geom\* columns in the resource registry:
[https://github.com/ooici/pyon/blob/master/pyon/datastore/postgresql/base_store.py#L469]
[https://github.com/ooici/pyon/blob/master/pyon/datastore/postgresql/base_store.py#L369-L437]
_Please discuss any modifications here with MMEisinger._


{code}
Geometry: !Extends_Basic
Point: !Extends_Geometry
  value: []
Circle: !Extends_Geometry
  center: []
  radius: 0.0
Polygon: !Extends_Geometry
  points: []

DataProduct:
  location: !Polygon
{code}

The resource registry will need to be modified so that, when the tables are created and a field of type {{Geometry}} is encountered, an appropriate PostGIS data type is selected and an index is created to geospatially index the resource.

We will refactor the existing discovery code to use PostGIS capabilities for search and navigation as well as geospatial search. We will expose GIS searching capabilities through discovery service.
- Contains [http://postgis.refractions.net/documentation/manual-1.4/ST_Contains.html]
- Disjoint [http://postgis.refractions.net/documentation/manual-1.4/ST_Disjoint.html]
- Intersects [http://postgis.refractions.net/documentation/manual-1.4/ST_Intersects.html]
- -Overlaps- _we're using Intersects which is what the user expects_
- Within [http://postgis.refractions.net/documentation/manual-1.4/ST_Within.html]
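
One way the {{spatial_operator}} value could be translated into these PostGIS functions is sketched below. This is an illustration, not the code in {{discovery_service.py}}: the function name, the table name and the use of DB-API {{%s}} placeholders are assumptions, while {{geom_loc}} and the mapping of "overlaps" to {{ST_Intersects}} follow this page.

```python
# Mapping from the UI's spatial_operator value to a PostGIS function name.
# "overlaps" is mapped to ST_Intersects, as noted in the list above.
SPATIAL_FUNCTIONS = {
    "overlaps": "ST_Intersects",
    "within": "ST_Within",
    "contains": "ST_Contains",
    "disjoint": "ST_Disjoint",
}


def build_geo_query(table, operator, west, south, east, north, limit=100, offset=0):
    """Return (sql, params) for a bounding-box search using DB-API '%s' placeholders.

    The table name is hypothetical and must come from trusted code, not user input.
    """
    try:
        func = SPATIAL_FUNCTIONS[operator]
    except KeyError:
        raise ValueError("unsupported spatial operator: %r" % operator)
    # ST_MakeEnvelope takes xmin, ymin, xmax, ymax, srid.
    sql = ("SELECT id, name FROM %s "
           "WHERE %s(geom_loc, ST_MakeEnvelope(%%s, %%s, %%s, %%s, 4326)) "
           "ORDER BY name LIMIT %%s OFFSET %%s" % (table, func))
    return sql, (west, south, east, north, limit, offset)


sql, params = build_geo_query("resources", "overlaps", -130, 40, -120, 50)
```

Keeping the coordinates, limit and offset as bound parameters rather than formatting them into the string avoids SQL injection from user-supplied search values.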

h5. Integrate and test with production environment

Take all developed software capabilities of this milestone and integrate them with the remainder of the system. Demonstrate the correct function of the additions through successful automatic tests running against a fully launched system and by interactive demonstration on the test/alpha system.

h3. Design References and Context

* [cidev:Define key oceanographic geospatial search scenarios]
* [cidev:Dataset Metadata Indexing]
* [syseng:CIAD DM OV Discovery Service]
* [syseng:CIAD DM OV Index Management Service]
* [cidev:R2Cx Searches and Catalogs]
* [cidev:Evaluate GeoPortal for Externalization of Catalog and Discovery repository]
* [cidev:M166 PostgreSQL data store]
* [PostGIS|http://postgis.net/]
* [PostGIS Features|http://postgis.net/features]
* [Introduction to PostGIS|http://workshops.opengeo.org/postgis-intro/geometries.html]


h3. Design Notes

h4. R3 ElasticSearch Design etherpad

_includes postgis notes_

[http://etherpad.oceanobservatories.org/r3elasticsearch]

h4. PostGIS and Location Aware Resources

After our migration efforts for milestone [cidev:M166 PostgreSQL data store], we should be able to leverage the PostGIS feature set to provide OOIN and clients with geospatial awareness for all system resources that have a geospatial identity. Once PostGIS is installed and the PostgreSQL database has the GIS extension enabled, extending resources to include GIS-aware objects is simple.

*GIS Objects*
The GIS objects supported by PostGIS are a superset of the "Simple Features" defined by the OpenGIS Consortium (OGC). As of version 0.9, PostGIS supports all the objects and functions specified in the OGC "Simple Features for SQL" specification.

PostGIS extends the standard with support for 3DZ,3DM and 4D coordinates.

The OpenGIS specification defines two standard ways of expressing spatial objects: the Well-Known Text (WKT) form and the Well-Known Binary (WKB) form. Both WKT and WKB include information about the type of the object and the coordinates which form the object.

Examples of the text representations (WKT) of the spatial objects of the features are as follows:

- POINT(0 0)
- LINESTRING(0 0,1 1,1 2)
- POLYGON((0 0,4 0,4 4,0 4,0 0),(1 1, 2 1, 2 2, 1 2,1 1))
- MULTIPOINT(0 0,1 2)
- MULTILINESTRING((0 0,1 1,1 2),(2 3,3 2,5 4))
- MULTIPOLYGON(((0 0,4 0,4 4,0 4,0 0),(1 1,2 1,2 2,1 2,1 1)), ((-1 \-1,-1 \-2,-2 \-2,-2 \-1,-1 \-1)))
- GEOMETRYCOLLECTION(POINT(2 3),LINESTRING(2 3,3 4))

The database can query against [*spatial relationships*|http://workshops.opengeo.org/postgis-intro/spatial_relationships.html] using standard geometrical relationships such as contains, within and touches.

Here is a quick SQL example of the geospatial capabilities:

{code}
CREATE TABLE gis_example(id INTEGER PRIMARY KEY, geom GEOMETRY);
INSERT INTO gis_example VALUES (0, 'POINT(0 0)');
INSERT INTO gis_example VALUES (1, 'POINT(2 2)');
INSERT INTO gis_example VALUES (2, 'POINT(1 1)');
INSERT INTO gis_example VALUES (3, 'POINT(0.99 1)');
INSERT INTO gis_example VALUES (4, 'POINT(0.99 0.99)');
SELECT id,
       ST_AsText(geom) AS geom_text,
       ST_Contains(ST_GeomFromText('POLYGON((-1 -1, -1 1, 1 1, 1 -1, -1 -1))'), geom) AS contains
FROM gis_example
ORDER BY contains DESC;
 id |    geom_text     | contains
----+------------------+----------
  0 | POINT(0 0)       | t
  4 | POINT(0.99 0.99) | t
  1 | POINT(2 2)       | f
  2 | POINT(1 1)       | f
  3 | POINT(0.99 1)    | f
(5 rows)
{code}
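
The {{f}} results for {{POINT(1 1)}} and {{POINT(0.99 1)}} follow from the definition of {{ST_Contains}}: a point that lies only on the polygon's boundary is not contained. The sketch below mimics that rule for this one square in plain Python; the function name is hypothetical and it is not a general point-in-polygon test.

```python
def square_contains_point(x, y, half_width=1.0):
    """Mimic ST_Contains for the square (-1 -1, 1 1): boundary points are not contained."""
    return -half_width < x < half_width and -half_width < y < half_width


# The same five points as in the SQL example, keyed by id.
points = {0: (0, 0), 1: (2, 2), 2: (1, 1), 3: (0.99, 1), 4: (0.99, 0.99)}
contained = sorted(pid for pid, (x, y) in points.items() if square_contains_point(x, y))
```

Only ids 0 and 4 are strictly inside the square, which matches the {{t}} rows in the query output.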

PostGIS also supports parsers for standard industry formats, including KMZ and ESRI shapefiles. This may play a role if we let users input system resources and define shape boundaries for them.



h4. Application Resources that are geo-aware

|| Resource || Location type || Notes ||
| Observatory | point or polygon | |
| PlatformSite | point | |
| InstrumentSite | point | |
| DataProduct | point or polygon | polygon for glider |
| Deployment | point or polygon | |
| Site | point or polygon | |

h4. Application Resources that are geo-searchable
|| Resource || Path || Notes ||
| Instrument Device | via Deployed Site | |
| Platform Device | via Deployed Site | |
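
Devices have no location of their own, so a geospatial search for devices has to go through the site they are deployed to. The sketch below shows that indirection with in-memory dictionaries standing in for resources and associations; every name in it is hypothetical, and the two coordinates are taken from the sample query output elsewhere on this page.

```python
# Hypothetical in-memory stand-ins for site resources and deployment associations.
site_locations = {
    "site_alpha": (-117.23214, 32.88237),    # (longitude, latitude)
    "site_gamma": (-122.249952, 47.650011),
}
device_deployed_at = {
    "ctd_01": "site_alpha",
    "ctd_02": "site_gamma",
    "spare_ctd": None,                        # not deployed, so no location
}


def devices_in_box(west, south, east, north):
    """Return device ids whose deployed site lies inside the bounding box."""
    found = []
    for device, site in sorted(device_deployed_at.items()):
        if site is None:
            continue  # undeployed devices can never match a geospatial search
        lon, lat = site_locations[site]
        if west <= lon <= east and south <= lat <= north:
            found.append(device)
    return found


nearby = devices_in_box(-120, 0, 0, 90)
```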


h2. Implementation Notes


h3. Discussion Notes

h4. Discussion Tuesday October 29th

These are the steps I took in order to install postgis on a near-fresh machine:

The prerequisite is that Python 2.7 is installed via brew.
{code}
brew update
brew install postgresql
deactivate # Deactivate any current virtualenv in python
which python # Should say /usr/local/bin/python if it doesn't then you need to fix your $PATH
pip install numpy
brew install postgis
# -- At this point there was an error with lzlib being installed and the SHA didn't match --
# Copy the SHA that it says it should be
vim /usr/local/Library/Formula/lzlib.rb
# Change the sha1 value in the formula to match the SHA copied above
brew install lzlib
brew install postgis

{code}

To verify that it was installed correctly:
{code}
createdb gis_example
psql gis_example
CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
<ctrl-d>
{code}

h4. Discussion Tuesday 31 October

Prototype using a single column in the resource table to contain the geodata (each row represents a single resource)
* define the types of geometries required to represent various resource types: point, rectangle
* if a resource does not have a geo-location then simply leave it as null
* queries should be a standard PostGIS select and efficient:
** find all instrument devices of model CTDSMP37 in this rectangle
* OGC externalization plans are next phase
** see how much of the standard we can support with the above simple model.


h3. Initial Prototyping


h4. 1 Nov

*MMeisinger*

All, you can now try out the Postgres resource registry branch. It is ready to use for initial investigations and for call tracing. It works with the full demo of R2 alpha preload, UI and streaming except for discovery service/ES integration. No changes to coi-services are required, other than changing the pyon and ion-definitions submodules, installing postgresql and its driver, and adding a bit of pyon.local.yml:
[https://confluence.oceanobservatories.org/display/CIDev/Postgres+Datastore] (see at bottom)
It's very easy to use and you can switch back and forth between the coi-services master and coi-services postgres_merge branches without issues. You don't even have to change pyon.local.yml.

---

I've added this and other information to the "central" Postgres page on Confluence:
[https://confluence.oceanobservatories.org/display/CIDev/Postgres+Datastore]

---

I just enhanced the Postgres datastore to set a geometry column (currently based on the geospatial_point_center value). This works nicely for the BETA preload:
{code}
ion_sterling_ion=# select id,name,ST_AsEWKT(geom) from ion_sterling_resources where geom is not null;
id | name | st_asewkt
----------------------------------+------------------------------------------+----------------------------------------
632ced50e66443cabe6c2db5f593f782 | Beta Demo Site Alpha | SRID=4326;POINT(-117.23214 32.88237)
f2370c2be1d743379b4e98e1c97d3ef0 | Beta Demo Site Beta | SRID=4326;POINT(-125.399828 44.600045)
990648bf8416463392f83f925cff6301 | Beta Demo Site Gamma | SRID=4326;POINT(-122.249952 47.650011)
bdbcf25cd38a4ce59bfd123e32f656e9 | Beta Demonstration Station One | SRID=4326;POINT(-117.23214 32.88237)
6f164466ecf34e4ebbfc577be6d9dcaf | Profiler 200m Platform 104 Site | SRID=4326;POINT(-117.23214 32.88237)
c1d8a819caf44f46a5215b2640ba5b95 | Glider 001 - Mobile Assets Station | SRID=4326;POINT(-125.399828 44.600045)
680ef26c2f41405aa538d8cded2f97d9 | Instrument site 2 Demonstration | SRID=4326;POINT(-117.23214 32.88237)
f3db2a6f82be4ec1974930426651772d | Instrument site 7 Demo | SRID=4326;POINT(-117.23214 32.88237)
013768adcd394f32932e97b6778b3eaa | SBE37SMP CONDWAT L1 | SRID=4326;POINT(-122.249952 47.650011)
9c76441033ed42cc9c1f9054c5ad9454 | Platform Engineering Data | SRID=4326;POINT(-117.23214 32.88237)
e61c5cf24b324a3996c59c65a1550893 | Platform Engineering Data 200m Platform | SRID=4326;POINT(-117.23214 32.88237)
4801748bf896448c98153a093f81cca4 | SBE37SMP Raw | SRID=4326;POINT(-122.249952 47.650011)
44d8c1176fda4bfc9d418f8e8e2c4fbe | SBE37SMP Parsed | SRID=4326;POINT(-122.249952 47.650011)
4ab484fa8e044139a9f893cfee939a34 | SBE37SMP TEMPWAT L1 | SRID=4326;POINT(-122.249952 47.650011)
830cac9cd3c64ff4b87c79b5fb865f97 | SBE37SMP PRESWAT L1 | SRID=4326;POINT(-122.249952 47.650011)
443042055ec244a68d5108751d7fefe9 | SBE37SMP PRACSAL L2 | SRID=4326;POINT(-122.249952 47.650011)
a9798b8e6c63400084e1d710326321c4 | SBE37SMP DENSITY L2 | SRID=4326;POINT(-122.249952 47.650011)
a30829da02354b7c9f9c7e06398a08f9 | SBE Simulator Parsed Data Product | SRID=4326;POINT(-125.399828 44.600045)
(18 rows)
{code}
Then I tried a bounding box query:
{code}
ion_sterling_ion=# select id,name,ST_AsEWKT(geom) from ion_sterling_resources where geom && ST_MakeEnvelope(-120, 0, 0, 90, 4326);
id | name | st_asewkt
----------------------------------+------------------------------------------+--------------------------------------
632ced50e66443cabe6c2db5f593f782 | Beta Demo Site Alpha | SRID=4326;POINT(-117.23214 32.88237)
bdbcf25cd38a4ce59bfd123e32f656e9 | Beta Demonstration Station One | SRID=4326;POINT(-117.23214 32.88237)
6f164466ecf34e4ebbfc577be6d9dcaf | Profiler 200m Platform 104 Site | SRID=4326;POINT(-117.23214 32.88237)
680ef26c2f41405aa538d8cded2f97d9 | Instrument site 2 Demonstration | SRID=4326;POINT(-117.23214 32.88237)
f3db2a6f82be4ec1974930426651772d | Instrument site 7 Demo | SRID=4326;POINT(-117.23214 32.88237)
9c76441033ed42cc9c1f9054c5ad9454 | Platform Engineering Data | SRID=4326;POINT(-117.23214 32.88237)
e61c5cf24b324a3996c59c65a1550893 | Platform Engineering Data 200m Platform | SRID=4326;POINT(-117.23214 32.88237)
(7 rows)
{code}
It seems to work. Any number of extensions are conceivable.

*LCampbell*
If you want to add PostgreSQL to your supervisor config scripts so that it's managed as a daemon by supervisor:

{code}
[program:psql]
command=/usr/local/bin/postgres -D /usr/local/var/postgres
autostart=false
autorestart=false
stopsignal=INT
{code}



h3. Status Discussion 31 Jan with Luke, Brian, Michael, Tim and Maurice
* Issues working through the several layers of discovery search. Brian feels he has a handle on it now.
* Need to focus on a full suite of integration tests that demonstrate various search types (overlap, bounding box, etc.) for multiple search types
** There are 3 weeks allocated to integration with the UI, this is the time that will test UI-created searches
* Must define which application-level resources are needed for search
** which need location attributes
** which would be found by searching for an associated resource and how would that work
*** A device does not have location attributes but it is deployed to a site which does have location. Search for all devices in a bounding box.
** Temporal and geo search scenarios?