
Unit Testing

MI makes use of the Python nose test framework (nosetests) to perform unit testing.  On top of that, we have written our own base test classes that provide common testing helpers.

Nosetests considers each function or method whose name starts with 'test' to be a test it should run.  Prior to running each test, it will run the setUp function.
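As a minimal sketch of that discovery rule (a hypothetical test class; nose also collects unittest.TestCase subclasses like this one):

```python
import unittest

class TestParserSketch(unittest.TestCase):
    """Hypothetical example: nose collects the methods below because
    their names start with 'test'."""

    def setUp(self):
        # Runs before every test_* method below.
        self.records = [1, 2, 3]

    def test_record_count(self):
        self.assertEqual(len(self.records), 3)

    def test_first_record(self):
        self.assertEqual(self.records[0], 1)
```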

To run any nose test for MI, you must be in the top-level mi-dataset directory so that module import paths resolve correctly.

When the test or tests are complete, nosetests will print either 'OK' or 'FAILED' along with a count of how many tests passed and how many failed.

Parser Unit Testing

One can run all unit tests on a single dataset parser using the command:

nosetests <path to test file starting at mi>.py


To run a single test within a parser unit test class, one can use the command:

nosetests <path to test file starting at mi>.py:<Class name>.<test name>


Unit testing with nosetests is also supported within PyCharm.  To execute a unit test, open the desired unit test file, right-click within the class or function, and select "Run 'Unit Tests in <name>'" or "Debug 'Unit Tests in <name>'".  To avoid import errors, select "Run > Edit Configurations" and set the "Working Directory" to the path to mi-dataset.

Code Coverage Testing

In order to perform code coverage from the command line environment, install the Python coverage module via pip:

pip install coverage


In order to use the Python coverage module, first run nosetests with the "--with-coverage" option to generate coverage data that the coverage module can process:

nosetests --with-coverage <path to test file starting at mi>.py


Running nosetests with the --with-coverage option creates or updates a .coverage data file in the current directory.  After the run, generate a browsable HTML report by running the coverage command with the html argument:

coverage html


Then you can run firefox providing the path to the htmlcov index.html:

firefox htmlcov/index.html
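The whole flow can be sketched end-to-end.  Here a throwaway script stands in for a nosetests run (the paths are stand-ins, and the coverage package is assumed to be installed):

```shell
# Create a trivial script that stands in for a test run.
printf 'print("covered")\n' > /tmp/cov_demo.py
# Run it under coverage; this writes a .coverage data file in the current directory.
python -m coverage run /tmp/cov_demo.py
# Render the HTML report (equivalent to 'coverage html' after nosetests --with-coverage).
python -m coverage html -d /tmp/htmlcov
# View with: firefox /tmp/htmlcov/index.html
```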


Code coverage is also supported within PyCharm.  To enable code coverage during unit testing, open the desired unit test file, right-click within the class or function, and select "Run 'Unit Tests in <name>' with Coverage" or "Debug 'Unit Tests in <name>' with Coverage".

Initially, you will see an error.  Just select the "enable" link to proceed.

Unit Test Base Class Information

** This describes dataset driver unit tests from the old marine-integrations repository; they are not present in the new mi-dataset repository **

There is one file, mi/idk/dataset/, which contains the base classes for integration (DataSetIntegrationTestCase) and qualification (DataSetQualificationTestCase) tests, plus a common DataSetTestCase class that holds pieces shared by both.

Common Tests

Some commonly used functions from the DataSetTestCase are:


setUp is called at the start of each test.  The base test class logs (at the debug level) the name of the test it is starting, initializes the test configuration, and clears sample data (which essentially just empties the harvester test directory).

The test configuration is specified at the start of your test file inside DataSetTestCase.initialize().
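The setUp behavior described above can be sketched roughly as follows (a hypothetical stand-in, not the real base class; the directory name is an assumption):

```python
import logging
import os
import shutil
import tempfile
import unittest

log = logging.getLogger(__name__)

class DataSetTestCaseSketch(unittest.TestCase):
    """Hypothetical sketch of the described setUp behavior."""

    # Stand-in for the configured harvester test directory.
    TEST_DIR = os.path.join(tempfile.gettempdir(), 'dsatest_sketch')

    def setUp(self):
        # Log (at the debug level) the name of the test that is starting.
        log.debug('Starting test: %s', self._testMethodName)
        # "Clear sample data": empty out the harvester test directory.
        if os.path.isdir(self.TEST_DIR):
            shutil.rmtree(self.TEST_DIR)
        os.makedirs(self.TEST_DIR)

    def test_directory_starts_empty(self):
        self.assertEqual(os.listdir(self.TEST_DIR), [])
```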

create_sample_data(file1, file2(optional), mode(optional), create(optional))

This function creates a test file by copying the data in the first file from your resource directory into a new file in your test directory with the name of the second file.  The first data file must be located in the resource directory associated with your driver (e.g. mi/dataset/driver/wfp_eng/wfp/resource for the wfp_eng_wfp driver).  The newly created file will be located in the test directory, as configured by the harvester directory in startup_config:

startup_config = {
    DataSourceConfigKey.RESOURCE_ID: 'wfp_eng__stc_imodem',
    DataSourceConfigKey.HARVESTER: {
        # This is your test directory, where the harvester will look for files
        DataSetDriverConfigKeys.DIRECTORY: '/tmp/dsatest',
        DataSetDriverConfigKeys.PATTERN: 'E*.DAT',
        DataSetDriverConfigKeys.FREQUENCY: 1,
    },
    DataSourceConfigKey.PARSER: {}
}

If no second file name is provided, the original file name will be used for the newly created file.  The mode argument gives the Linux file mode (defaults to 644), and the create flag provides an option to create an empty file even if the original file (file1) is not found.
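The behavior described above can be sketched like this (a hypothetical stand-alone version; the real helper lives in the base test class and takes its directories from the test configuration):

```python
import os
import shutil
import tempfile

def create_sample_data_sketch(source_path, dest_dir, dest_name=None,
                              mode=0o644, create=False):
    """Hypothetical sketch: copy a resource file into the harvester
    test directory, mirroring the documented defaults."""
    dest_name = dest_name or os.path.basename(source_path)
    dest_path = os.path.join(dest_dir, dest_name)
    if os.path.exists(source_path):
        # Copy the resource file's data into the new test file.
        shutil.copyfile(source_path, dest_path)
    elif create:
        # create=True: make an empty file even though the source is missing.
        open(dest_path, 'w').close()
    else:
        raise IOError('source file not found: %s' % source_path)
    os.chmod(dest_path, mode)
    return dest_path
```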

get_file_state(path, ingested(optional), position(optional))

This function creates a dictionary with the format of the harvester state, filling in the file size, modification time, and checksum.  If the ingested flag is passed in, it will be set in the dictionary (it defaults to false).  If a position is passed in, a parser state will be created with a position state key and the given position; otherwise the parser state will be an empty dictionary.
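A hypothetical sketch of that helper (the exact key names and checksum algorithm are assumptions, not taken from the real base class):

```python
import hashlib
import os

def get_file_state_sketch(path, ingested=False, position=None):
    """Hypothetical sketch: build a dict in the shape of the
    harvester state for one file."""
    with open(path, 'rb') as f:
        checksum = hashlib.md5(f.read()).hexdigest()
    # A position yields a parser state with a position key; otherwise empty.
    parser_state = {'position': position} if position is not None else {}
    return {
        'size': os.path.getsize(path),
        'mod_time': os.path.getmtime(path),
        'checksum': checksum,
        'ingested': ingested,
        'parser_state': parser_state,
    }
```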

Python Code Profiling

Install cprofilev if not already installed:

pip install cprofilev

Run cProfile to generate cProfile output:

python -m cProfile -o <output file> <script to profile>

Run cprofilev to allow viewing of the profiling output:

cprofilev -f <output file>

Examine the profiling output using firefox:

firefox http://localhost:4000
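The profiling steps can be sketched end-to-end with a throwaway script (file names are stand-ins; only the stdlib cProfile/pstats pieces are shown, with the cprofilev viewer left out):

```shell
# Create a trivial script that stands in for a parser test.
printf 'print(sum(range(10000)))\n' > /tmp/profile_me.py
# Run cProfile, writing binary stats that cprofilev can later serve.
python -m cProfile -o /tmp/profile_me.out /tmp/profile_me.py
# Quick text view of the top entries without cprofilev:
python -c "import pstats; pstats.Stats('/tmp/profile_me.out').sort_stats('cumulative').print_stats(3)"
```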

Driver Integration Testing

Access the following link for information on executing driver tests with the Validate Data Tool:

Test Data

There are generally one or more test files included with the IDD.  However, there is much more test data available on the data server, which can be ssh-ed into using the same login information that is used to get to the VM.  The goal is to test the driver with a set of representative data from the platforms it is expected to be compatible with prior to integration.  The compatible platforms and data paths should be found in the file acquisition section of the IDD.

The starting directory to look in on this server is /home/whoi/OMC.  This directory contains sub-directories for each currently existing platform.  This is actual test data received from the instruments and is read-only; you should not have permissions to edit or delete any files here.  There may be platforms mentioned in the IDD which are not present here yet, which means that platform is not operational yet.  Generally inside the platform directories you will find sub-directories of the form X00001, D00001, and R00001, where X stands for pre-deployment, D stands for deployment, and R stands for recovered.  The OOI Deployment Status Dashboard ( ) may be helpful in determining which platforms have been deployed and should have data.

In each of the relevant platforms, copy a few files of data from the data server to your VM.  You can use a linux wildcard (for instance '*1.dat') to limit the number of files when copying.  To copy files directly from the server to your VM, you can use scp on the VM:

scp <username>@<server>:<path to file or files> <path to locate files on VM>

To make this more efficient, you can create a tar file prior to moving files over.  The tar file needs to be created in /tmp since you don't have write permissions in the data area.  This can be done with:

tar cvf /tmp/<tar_name>.tar <files to include>
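A self-contained sketch of the bundling step (the directory and file names are stand-ins):

```shell
# Stage a couple of stand-in data files.
mkdir -p /tmp/testdata
printf 'rec1\n' > /tmp/testdata/E0001.DAT
printf 'rec2\n' > /tmp/testdata/E0002.DAT
# Bundle them into a tar file under /tmp (the writable area on the server).
tar cvf /tmp/sample.tar -C /tmp testdata
# After scp-ing it to the VM, unpack with: tar xvf /tmp/sample.tar
```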

Keep track of which test files you are copying and their original full paths so you know which platforms have been tested.  These files should be run through an ingestion test, which is described below.  The test files used in the ingestion test will be documented on the driver page: Drivers.
