This page explains how to install, configure and use the R2 Capability Container in the context of developing and running ION services and the ION system. Please also see the R2 Service Implementation Guide and the R2 Resource Development Guide.

Introduction

Pyon is the name of the package and repository providing the basic ION Capability Container functions. If you are not part of the COI team, you will most likely not be developing code for the container, but instead services hosted by the container. This means you don't have to know the gory details of the pyon repository. However, cloning the pyon Git repository is required for now, as explained in the installation steps below.

To develop ION services, agents and processes, work in a service repository instead, such as coi-services, which initially hosts all service skeleton implementations for all R2 subsystem services. Later, we will move these services into their own repositories, so that teams can work independently.

Environment Setup

Terms and Dependencies

At first sight the setup looks complicated, but it does not have to be. Let's define a few terms:

Term: pyon repository
Description: The Git repository named pyon with the source code of the R2 ION Capability Container. Unless you are interested in its inner workings, there is no need for you to look at this repository and its contents. Its use is the subject of this tutorial. The COI team works on the pyon repository and releases a versioned egg for new developer releases (towards the ION release).
Example:
  git clone git@github.com:ooici/pyon.git

Term: pyon package
Description: The Python egg that can be installed from ooici.net/releases. This will have the tested pyon code in it. This is what you need when you want to start pycc. Typically this egg and all its dependencies are installed automatically in your virtualenv by running buildout.
Example:
  easy_install pyon
  ipython -c "import pyon"

Term: bin/pycc
Description: The program that starts the ION R2 capability container. It requires the pyon package installed as a dependency via buildout. Just run it from a service repository (a startup root), see below.
Example:
  bin/pycc --rel res/deploy/hello.yml

Term: bin/nosetests
Description: The program that starts the unit and integration tests. It requires the pyon package installed as a dependency via buildout. Just run it from a service repository (a startup root), see below.

Term: coi-services
Description: The Git repository that you develop service source code in. Define this as a project in the IDE of your choice, such as PyCharm. The initial service repository is "coi-services", where initially all R2 services will be located. Later, we will move these services into their own subsystem service repositories, managed by the subsystem leads.
Example:
  git clone git@github.com:ooici/coi-services.git

Term: ion-definitions repository
Description: The Git repository named ion-definitions that contains service and object definition files (in YAML). This repository is added as a git submodule under extern/ion-definitions to pyon and to coi-services, so that you can edit definitions in place. Symlinks exist from the root of a service repository to obj/ and res/. Note: It is very difficult to work with git submodules correctly. Please see the Git Submodules section, below.
Example:
  git clone git@github.com:ooici/coi-services.git
  git submodule update --init
  ls res/
  ls obj/
  cd extern/ion-definitions
  git status

Term: startup root
Description: The directory from which you start one or multiple pycc capability containers, with a defined set of processes running in the container. Every service repository has the buildout environment to act as startup root. Use coi-services initially.
Example:
  cd coi-services
  bin/pycc --rel res/deploy/hello.yml

Term: ion-integration repository
Description: The Git repository that includes all service repositories and pyon as dependencies, so that you can start a container with any ION service from this directory. (Not yet existing for R2)

Directory and file layout

All necessary configuration files for the R2 container can be found in the 'ion-definitions' repository under the 'res/' directory structure.

Every service repository, e.g. 'coi-services', will include 'ion-definitions' as a git submodule. There is a symlink to res/ in the repository root. The mount point is actually 'extern/ion-definitions'.

res/
  |
  |--- config/
  |        |
  |        |--- logging.yml : default logging handler and level configuration
  |        |--- pyon.yml : central configuration file for ion services and modules
  |
  |--- deploy/ : home of all rel files, used to start processes in a container
  |        |
  |        |--- r2deploy.yml : standard R2 services deploy file
  |
  |--- apps/ : home of all app files, used to define apps that can be started in any container

Installation

Please follow the detailed installation instructions in the coi-services README: https://github.com/ooici/coi-services/blob/master/README

Note: Some steps from the New Developer Tutorial are assumed, such as the installation of Git and Xcode and the directory structure.

Container Configuration

Configuration order

Configuration is applied in one of two variants: local configuration or remote configuration. The variant is chosen based on the pycc option "--config_from_directory".

Configuration is loaded in the following order (higher-numbered entries override lower-numbered ones):

Variant 1: Local configuration (default)

  1. res/config/pyon.yml – Basic config properties. Keep this unchanged; it is version controlled
  2. res/config/pyon.local.yml – For local modifications such as server addresses
  3. res/profile/<profile>.yml – Config overrides from a container profile
  4. res/apps/* app file config block – Modified config within one app
  5. res/deploy/* rel file config block per process/app – Modified config within one app/process instance
  6. process spawn config override – Modified config provided by spawning process
  7. command line config startup argument (dict literal or list of files to load) – Set via pycc
  8. command line key/value override – Set via pycc

Variant 2: Remote configuration

Stage 1: Store configuration in directory (via bin/store_interfaces)

  1. res/config/pyon.yml – Keep this unchanged; it is version controlled
  2. res/config/pyon.local.yml – For local modifications
  3. command line config startup argument (dict literal or list of files to load) – Set via pycc
  4. command line key/value override – Set via pycc
  5. Store config in directory

Stage 2: Apply configuration to container

Bootstrap configuration (used to connect to the directory, but thrown away afterwards):

  1. res/config/pyon_min_boot.yml
  2. res/config/pyon.local.yml – For local modifications (to minimal boot config)
  3. command line config startup argument (dict literal or list of files to load) – Set via pycc
  4. command line key/value override – Set via pycc

Container configuration:

  1. Load configuration from directory
  2. res/config/pyon.local.yml – For local modifications (to minimal boot config)
  3. res/apps/* app file config block – Modified config within one app
  4. res/deploy/* rel file config block per process/app – Modified config within one app/process instance
  5. process spawn config override – Modified config provided by spawning process
  6. command line config startup argument (dict literal or list of files to load) – Set via pycc
  7. command line key/value override – Set via pycc

Note: When system.auto_bootstrap=True, Stage 1 of Variant 2 is executed automatically.

All configuration is merged into one hierarchical namespace, see below. Not all config entries are available to all processes.
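
The override order and the merge into one hierarchical namespace can be illustrated with a small Python sketch. This is not pyon's loader: the function names, the layer contents and the dotted form of the key/value override are assumptions made for the illustration. Only the rule "later sources override earlier ones" is taken from the lists above.

```python
from functools import reduce


def deep_merge(base, override):
    """Return a new dict with `override` merged recursively on top of `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_dotted(config, dotted_key, value):
    """Apply one key/value override such as system.auto_bootstrap=True (assumed syntax)."""
    override = value
    for part in reversed(dotted_key.split(".")):
        override = {part: override}
    return deep_merge(config, override)


# Hypothetical layer contents, listed from lowest to highest precedence.
layers = [
    {"system": {"name": "ion", "auto_bootstrap": False}},  # stands in for pyon.yml
    {"server": {"amqp": {"host": "localhost"}}},           # stands in for pyon.local.yml
    {"system": {"name": "ion_dev"}},                       # stands in for a rel file config block
]
config = reduce(deep_merge, layers, {})
config = apply_dotted(config, "system.auto_bootstrap", True)
print(config["system"])
```

Each layer only needs to name the keys it changes; everything else is carried over from the layers below it.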

Configuration files

All configuration file content is defined in YAML notation. Each file type has its own purpose and format as described below.

Container config file: res/config/pyon.yml

The central configuration file for the ION system and capability containers resides in the 'res/config/' directory and is named 'pyon.yml'. Its contents are YAML. It is typically not necessary to modify this file when adding services; use a service-specific app file instead (see below).

Overriding entries from the standard 'pyon.yml' file is possible. Create an empty file 'res/config/pyon.local.yml' and insert the entries that you want to add or modify in YAML syntax. The local file entries are merged into the configuration provided in 'pyon.yml'. Note: the 'pyon.local.yml' will be ignored by GIT.

Container config local override file: res/config/pyon.local.yml

The pyon.yml file is version controlled in GitHub and contains the basic configuration values for the system and the container.

In case you want to define your own local overrides, e.g. to set the database and message broker URLs, or to enable or disable certain features, create a file res/config/pyon.local.yml and add only the keys you want to redefine. These properties will be merged in on top of the pyon.yml.

Do not define empty keys for entire branches of the configuration tree.

The following is ok:

The following is WRONG:

It will delete the entries for couchdb and system rather than leave them alone (because no child keys are specified). This is intended behavior. Use it with care.
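
The effect of an empty key can be sketched in Python. The sketch assumes that an empty YAML key is loaded as None and that the local file is merged recursively on top of the defaults; the key names and values are hypothetical stand-ins, not the real contents of pyon.yml, and the merge function is an illustration rather than pyon's own code.

```python
def deep_merge(base, override):
    """Merge `override` onto `base`; any non-dict value (including None) replaces."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical defaults standing in for pyon.yml.
defaults = {
    "system": {"name": "ion", "debug": False},
    "server": {"couchdb": {"host": "localhost", "port": 5984}},
}

# OK: only the leaf being changed is listed, so its siblings survive.
ok_local = {"server": {"couchdb": {"host": "db.example.org"}}}

# WRONG: an empty key (None after YAML parsing) replaces the whole branch.
wrong_local = {"system": None, "server": {"couchdb": None}}

ok_result = deep_merge(defaults, ok_local)
wrong_result = deep_merge(defaults, wrong_local)
print(ok_result["server"]["couchdb"])
print(wrong_result)
```

In the first result the port from the defaults is still present; in the second, both branches have been wiped.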

Deploy files (rel files)

A container deploy file (or rel file in OTP/R1 analogy) describes a list of apps to start in one container.

For simplicity, assume one app starts one service process. The more general concept is that an app is a piece of code that can start and stop one or many processes to provide a specific function. We will not use this more general concept for now, to keep things simple. Below is sample deploy file content.

Keep the boilerplate at the top of the file and change the description and deploy name as appropriate. You, as a service developer, will be interested in adding one or more entries under the apps tag. The order of the apps entries matters: it is the order in which the apps are started in the container.

Looking at the datastore app definition, we see that the information about how to instantiate an instance of the app is described in the processapp element. This is a list consisting of <process name>, <module path>, <class name>. Note: For now, we place service modules either in examples/ or in ion/services/<subsystem>/.

In order for a deploy file to be startable by pycc, two conditions must be satisfied:
  1. The source code of the service process must be on the Python path (PYTHONPATH). If the source code is not there, the container will not start the deploy file.
  2. The service interfaces and base classes must have been generated from the service definition YML files, using the "bin/generate-interfaces" tool.

This means you cannot start all deploy files from all service repositories. It also means if someone changes the YML service definition, you need to generate the interfaces again.
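
The first condition can be checked by hand before starting a container. The sketch below takes a processapp entry in the <process name>, <module path>, <class name> form described above and tries to import the class. The function name and error messages are made up for this illustration, and the container's own loading code may work differently.

```python
import importlib


def resolve_processapp(processapp):
    """Import the module named in a processapp entry and return (name, class)."""
    process_name, module_path, class_name = processapp
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        raise RuntimeError(
            f"{process_name}: module {module_path!r} is not on the Python path"
        ) from exc
    try:
        return process_name, getattr(module, class_name)
    except AttributeError as exc:
        raise RuntimeError(
            f"{process_name}: {module_path!r} has no class {class_name!r}"
        ) from exc


# Stand-in entry using a standard-library class so the sketch runs anywhere.
name, cls = resolve_processapp(["demo", "collections", "OrderedDict"])
print(name, cls.__name__)
```

Running the same check against the module path from a real deploy file shows quickly whether the service source is reachable from the startup root.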

Setting app configuration in the deploy file:

Providing app-specific configuration parameters in the deploy file is entirely optional. In the example above, the identity_registry app will just use the common configuration values. However, if you do wish to provide configuration values, they go under the 'config' element within the app block. There will be guidelines for how to structure your app-specific configuration content. However, it is important to understand how this configuration information is applied in the running system. At server startup, the contents of the app config block in the deploy file are applied on top of the general configuration information from the 'pyon.yml' file at the root level. This allows service developers to optionally override common configuration values as well as provide service-specific configuration values. An example of this in action is shown below.

Imagine if the pyon.yml file contained a block like the following and you wanted to override the 'persistence_type' value for the datastore app:

In the rel file, you would define the following config block:

Because the 'storage_backend' element with sub-element 'persistence_type' exactly matches the hierarchy of elements in the root of the common configuration, this value will override the common configuration value. However, because the datastore config block does not define a 'num_versions_to_save' element, this value is derived from the common configuration. The net configuration exposed to the datastore service is:
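
The same rule can be expressed as a lookup with fallback, sketched here in Python. The key names come from the text above; the values and the `lookup` helper are placeholders invented for the illustration, and the container itself may well merge the blocks rather than look keys up this way.

```python
def lookup(path, *scopes):
    """Return the value at dotted `path` from the first scope that defines it."""
    for scope in scopes:
        node = scope
        for part in path.split("."):
            if not isinstance(node, dict) or part not in node:
                break  # this scope does not define the path; try the next one
            node = node[part]
        else:
            return node
    raise KeyError(path)


# Placeholder values; the real pyon.yml values are not reproduced here.
common = {"storage_backend": {"persistence_type": "couchdb", "num_versions_to_save": 5}}
datastore_app = {"storage_backend": {"persistence_type": "in_memory"}}

# The app config block is consulted first, then the common configuration.
persistence_type = lookup("storage_backend.persistence_type", datastore_app, common)
num_versions = lookup("storage_backend.num_versions_to_save", datastore_app, common)
print(persistence_type, num_versions)
```

The first value comes from the app block and the second from the common configuration, which matches the behavior described above.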

Extended container deploy file example with app files

The hello.yml deploy file shows three ways of defining apps; the last two go beyond the example above:

  1. (hello0) Define a processapp directly in the deploy file (similar to above). This starts one process as named and provides optional configuration
  2. (hello1) Include an app in the deploy file, processapp style. The app is defined in a separate file (here 'res/apps/hello1.yml'). The app file contains a processapp to define one process to be started. Both app and deploy files can contain config entries that are provided to the starting process. The deploy config entries override the app file config entries, which override the common configuration.
  3. (hello2) Include an app in the deploy file, full definition style. The app is defined in a separate file (here 'res/apps/hello2.yml'). The app file defines a python module that starts code. Both app and deploy files can contain config entries that are provided to the starting process. The deploy config entries override the app file config entries, which override the common configuration.

Logging Configuration

The logging configuration for the pycc container is defined by two files, and optionally a third provided on the command line.

The base file is res/config/logging.yml, and specific overrides can be made in an optional file res/config/logging.local.yml or using a third file provided to the pycc command-line option --logcfg.

The project repository includes an initial logging.yml file suitable for development use, but there are other templates which may be used instead.  Just copy one of these templates to logging.yml instead:

  • production-logging.yml -- suitable for use in deployed production containers
  • gumstix-logging.yml -- suitable for use in offshore containers that have limited network connectivity
  • developer-logging.yml -- to restore the developer configuration

These configuration files direct log messages to one or more handlers that display or save the log messages to files or a centralized server. Which log messages should be handled and which should be ignored is controlled by assigning levels in the hierarchy of loggers.

A logger is defined using dot notation and can define a level for messages to that logger or child loggers.  For example, if you define:

pyon:
  level: INFO
pyon.datastore:
  level: DEBUG

Then messages logged by the pyon/__init__.py module or pyon/util/process.py will be handled if they are INFO level or above, and ignored if they are DEBUG level or below. Messages logged by pyon/datastore/__init__.py or pyon/datastore/couchdb/couch_util.py will be handled if they are DEBUG level or above, and ignored if they are TRACE level.

The logger names correspond to the project modules and packages; and the levels are: TRACE, DEBUG, INFO, WARNING, ERROR.
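
The inheritance of levels along the dotted logger names is standard Python logging behavior and can be tried out directly. The sketch below sets the two levels from the example in code instead of through logging.yml. TRACE is not a standard-library level, so it is left out here.

```python
import logging

# Same levels as in the example above, set programmatically.
logging.getLogger("pyon").setLevel(logging.INFO)
logging.getLogger("pyon.datastore").setLevel(logging.DEBUG)

# Loggers without their own level inherit from the nearest configured parent.
process_log = logging.getLogger("pyon.util.process")
couch_log = logging.getLogger("pyon.datastore.couchdb.couch_util")

print(process_log.getEffectiveLevel() == logging.INFO)  # inherits from "pyon"
print(couch_log.isEnabledFor(logging.DEBUG))            # inherits from "pyon.datastore"
print(process_log.isEnabledFor(logging.DEBUG))          # DEBUG is below INFO
```

`getEffectiveLevel()` is a convenient way to find out which configured parent a given module's logger ends up inheriting from.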

IMPORTANT: The logging.local.yml or alternate configuration file provided on the command line should generally only change levels.  Do not add handlers to loggers or define new formats in these files.  What seems to work in one situation will not merge well in another situation because the same local overrides are applied to multiple different base configuration files as COI, CEI and MI components all interact.  (For example, when pycc spawns a new MI process.)

More information about logging large deployments can be found here.

Logging information for developers is here.

Container Startup

Start the container by invoking bin/pycc.

Command line arguments:

Command line configuration:

The configuration values provided through the command line override the values in both the common configuration file ('pyon.yml') and the local configuration file ('pyon.local.yml').

Container Interactive Shell

If you do not daemonize the container process, you can enter the following ION-specific commands at the ipython shell:

Container Capability Profiles

A capability profile determines which capabilities a container starts and makes available to hosted container processes. Capabilities include things such as:

  • connection to AMQP broker
  • datastore manager, managing all kinds of database access
  • connection to resource registry
  • governance controller and policy interceptors
  • stats collection
  • container agent interface
  • writing a PID file on start
  • having a UNIX signal handler to emit greenlet information

The container's profile is configured using the container.profile configuration entry, e.g. made in pyon.yml, pyon.local.yml or through the command line. The profile must be defined in a file located in res/profile/<profile_name>.yml. Alternatively a path to a YML file can be provided, such as res/profile/gumstix.yml.

The list of available "core" capabilities is defined in res/config/container_capabilities.yml.

In a profile, capabilities can be enabled or an alternative implementation can be given (e.g. for embedded simplified container variants). This enables control on two levels:

  • Using a capability or not (e.g. specific containers don't need a broker connection and all dependencies that come with it)
  • Changing a capability (e.g. the gumstix container does not have a Couch database but provides a datastore using a file read/write backend)

Container processes and other capabilities can check for the presence of a capability in the container using this code snippet:

Deployment of Services and Processes

Types of processes

  • A service is an interface behavior
    • A service interface is the definition of this behavior (a set of operations with in/out params in a YAML file)
    • A service instance is the availability of this behavior in a network, typically provided by a set of service worker processes
  • A process is an active interacting software program, running in one capability container
  • A process definition is the packaged source code for a type of process
  • An agent is a process that provides a specific agent interface behavior
  • A stream process is a process that receives a sequential flow of data messages and acts on them

The Pyon capability container distinguishes the following types of processes:

Attributes per process type: process queue, listen queue, publish queues, service interface, description. Only the attributes with an entry are listed.

  • service (default)
    • Listen queue: service name or as configured
    • Service interface: determined by service name
    • Description: Dispatches incoming messages to service methods
  • stream_process
    • Listen queue: subscription queue
    • Publish queues: as configured (if configured)
    • Service interface: implicit (has an operation "process")
    • Description: Processes a sequential stream of self-contained data messages
  • agent
    • Service interface: resource_agent.yml
    • Description: Represents a resource or a user with a defined interface; can negotiate
  • standalone
    • Service interface: determined by service name
    • Description: A process that has a callable message API, but is not advertised as a service and has no shared listen queue
  • simple
    • Description: Performs something actively without being addressable via the Exchange
  • immediate
    • Description: Starts, performs a certain action (including sending messages) and stops

The process' base class (e.g. BaseService, StreamProcess, SimpleProcess, StandaloneProcess, ResourceAgent) determines the process type. Simply extending the proper base class sets the process type.

Process deployment configuration

The type of the process should be set by extending the appropriate base class (see above).

In the deploy (rel) or app file for the service, provide configuration (note: the default is "service" and can be omitted):

Process life cycle

The following operations are available in the life cycle of every process:

  • __init__(): Instance initializer. This is for framework use only
    • Pre-conditions: DO NOT USE
    • Post-conditions: DO NOT USE
  • on_init(): Performs basic initializations prior to starting
    • Pre-conditions: Basic process instance attributes are set, configuration is available
    • Post-conditions: Process is initialized and can be started
  • on_start(): Prepares the process to start
    • Pre-conditions: on_init() called before
    • Post-conditions: Process can be used
  • on_stop(): Prepares the process to stop
    • Pre-conditions: on_start() called before; process queues disabled
    • Post-conditions: Process stopped, but can be restarted again
  • on_quit(): Performs
    • Pre-conditions: Process queues disabled; on_quit() never called before
    • Post-conditions: Process terminated immediately
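
The pre- and post-conditions can be read as a small state machine. The following Python sketch makes that explicit with a stand-in class. It is not a pyon base class: the state names, the `LifecycleError` exception and the checks are assumptions made for the illustration, and in a real container the framework calls these hooks for you.

```python
class LifecycleError(RuntimeError):
    """Raised when a life cycle operation is called in the wrong state."""


class DemoProcess:
    """Stand-in for a container process; NOT a pyon base class."""

    state = "new"          # hypothetical state names, for illustration only
    quit_called = False

    def _move(self, allowed, target):
        if self.state not in allowed:
            raise LifecycleError(f"cannot go from {self.state!r} to {target!r}")
        self.state = target

    def on_init(self):
        self._move({"new"}, "initialized")

    def on_start(self):
        # Allowed after on_init(), and again after on_stop() (restart).
        self._move({"initialized", "stopped"}, "started")

    def on_stop(self):
        self._move({"started"}, "stopped")

    def on_quit(self):
        if self.quit_called:
            raise LifecycleError("on_quit() already called")
        self.quit_called = True
        self.state = "terminated"


proc = DemoProcess()
proc.on_init()
proc.on_start()
proc.on_stop()
proc.on_start()  # a stopped process can be restarted
print(proc.state)
```

A service class would put its own setup and teardown work into these hooks and leave the instance initializer to the framework.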

Working with the Datastore (CouchDB)

CouchDB comes with a sweet web admin interface, so you can see what the data store is doing when you run pyon on your laptop: http://localhost:5984/_utils

Here are some nifty screen shots explaining a little bit about the web admin interface and how pyon uses CouchDB.

CouchDB Web Admin Secrets Explained

Working with the broker (RabbitMQ)

RabbitMQ has a great web admin interface plugin, so you can see what the queues, bindings and exchanges are doing in RabbitMQ when you run pyon: http://localhost:55672

For details about how to install the plugin, check out the RabbitMQ management plugin page, or for the quick and dirty way...

Now restart your RabbitMQ server, and you should be able to open the web admin interface.

RabbitMQ Web Admin Secrets Explained
Don't forget - to clear the broker after running pyon you must restart rabbit or the queues, bindings and exchanges will stick around. Auto-delete is off by default!

Working with ElasticSearch

This is mostly for those of us who are going to merge in the index management service. If you are not, you don't need ElasticSearch just yet.

To launch an instance in the foreground (useful for debugging):

Additional nodes (instances of elastic search) are automatically distributed and load balanced on a single machine.

Using Supervisord to manage couchdb, rabbit and other services

Supervisord is a great tool for managing services running in your OS. It allows you to start, stop, restart and monitor the logs for user and root processes such as couch and rabbit.
This is a great way to clear pyon queues and bindings in rabbit between test runs.
Restarting couch does not clear couch; it is persistent.

Supervisord Web admin page

Installation on Mac is really easy.
Supervisord is a Python daemon that should be installed in your default system Python.
Make sure you are not in a virtualenv and run:

Configuration is done in your /etc/supervisord.conf.
You can copy this example that runs postgres and rabbit!
Be sure to change the user name to run couchdb under your user name.

To start or stop supervisor:

You can also use supervisorctl to start, stop and restart the programs defined in your config:
