Pipeline (skypy.pipeline)

The pipeline package contains the functionality to run a SkyPy simulation from end to end. This is implemented in the Pipeline class and can be called using the skypy command line script.

Running skypy from the command line

skypy is a command line script that runs a pipeline of functions defined in a config file to generate tables of objects and write them to file. For example, you can use skypy to run one of the Examples and write the outputs to a FITS file:

$ skypy examples/galaxies/sdss_photometry.yml sdss_photometry.fits

To view the progress of the pipeline as it runs you can enable logging using the --verbose flag.

Config files are written in YAML format and read using the load_skypy_yaml function. Each entry in the config specifies an arbitrary variable, but there are also some particular fields that SkyPy uses:

  • parameters : Variables that can be modified at execution

  • cosmology : The cosmology to be used by functions within the pipeline

  • tables : A dictionary of table names, each resolving to a dictionary of column names for that table

Every variable can be assigned a fixed value as parsed by pyyaml. However, variables and columns can also be evaluated as functions. Functions are defined as tuples where the first entry is the fully qualified function name tagged with an exclamation mark ! and the second entry is either a list of positional arguments or a dictionary of keyword arguments. Variables and columns in the pipeline can also be referenced by their full name tagged with a dollar sign $. For example:

examples/config.yml
parameters:
  hubble_constant: 70
  omega_matter: 0.3
cosmology: !astropy.cosmology.FlatLambdaCDM
  H0: $hubble_constant
  Om0: $omega_matter
n_galaxies: 1000
tables:
  galaxies:
    redshift: !skypy.galaxies.redshift.smail
      z_median: 1.0
      alpha: 1.5
      beta: 2.0
      size: $n_galaxies
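
To make the function and reference syntax concrete, the following is a minimal sketch of how a function entry with $ references might be evaluated. It is not SkyPy's implementation: the helpers import_function, resolve and call_entry are hypothetical, the entries are written as plain Python tuples rather than YAML, and standard-library callables (math.hypot, statistics.NormalDist) stand in for SkyPy models so that the sketch runs without extra packages.

```python
from importlib import import_module


def import_function(qualified_name):
    """Import a callable from its fully qualified name, e.g. 'math.hypot'."""
    module_name, _, function_name = qualified_name.rpartition('.')
    return getattr(import_module(module_name), function_name)


def resolve(value, variables):
    """Replace a '$name' string with the value of the variable 'name'."""
    if isinstance(value, str) and value.startswith('$'):
        return variables[value[1:]]
    return value


def call_entry(entry, variables):
    """Evaluate a (function name, arguments) tuple.

    A dict is passed as keyword arguments, a list as positional arguments.
    """
    qualified_name, args = entry
    function = import_function(qualified_name)
    if isinstance(args, dict):
        return function(**{k: resolve(v, variables) for k, v in args.items()})
    return function(*[resolve(v, variables) for v in args])


# Hypothetical variables; the values are for illustration only.
variables = {'side_a': 3.0, 'side_b': 4.0, 'mean': 1.0}

# Positional arguments given as a list.
hypotenuse = call_entry(('math.hypot', ['$side_a', '$side_b']), variables)

# Keyword arguments given as a dictionary.
dist = call_entry(('statistics.NormalDist', {'mu': '$mean', 'sigma': 0.5}),
                  variables)
```

In this sketch a reference is only recognised when the whole argument is a $-tagged string; nested references and table columns would need additional handling.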

When executing a pipeline, all dependencies are tracked and resolved in order using a Directed Acyclic Graph implemented in networkx.
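
The idea of resolving dependencies in order can be illustrated with a short sketch. SkyPy uses networkx for this; the sketch below instead uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later) so that it runs without extra packages. The find_references helper and the flattened config dictionary are hypothetical simplifications of the example configuration above, and no SkyPy or astropy function is actually called.

```python
from graphlib import TopologicalSorter


def find_references(value):
    """Collect the names referenced with '$' inside a (possibly nested) value."""
    if isinstance(value, str) and value.startswith('$'):
        return {value[1:]}
    if isinstance(value, dict):
        return set().union(*(find_references(v) for v in value.values()))
    if isinstance(value, (list, tuple)):
        return set().union(*(find_references(v) for v in value))
    return set()


# Flattened stand-in for examples/config.yml: function entries are written
# as (qualified name, keyword arguments) tuples and are never imported here.
config = {
    'hubble_constant': 70,
    'omega_matter': 0.3,
    'cosmology': ('astropy.cosmology.FlatLambdaCDM',
                  {'H0': '$hubble_constant', 'Om0': '$omega_matter'}),
    'n_galaxies': 1000,
    'galaxies.redshift': ('skypy.galaxies.redshift.smail',
                          {'z_median': 1.0, 'alpha': 1.5, 'beta': 2.0,
                           'size': '$n_galaxies'}),
}

# Map every entry to the set of entries it depends on.
graph = {name: find_references(value) for name, value in config.items()}

# A topological order evaluates each dependency before its dependents.
order = list(TopologicalSorter(graph).static_order())
```

Any order produced this way places hubble_constant and omega_matter before cosmology, and n_galaxies before galaxies.redshift. A circular reference would make TopologicalSorter raise graphlib.CycleError, which is why the dependency graph has to be acyclic.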

Using a pipeline from other code

SkyPy pipelines can be executed programmatically from other code. Consider the following example configuration:

examples/pipeline.yml
parameters:
  median-redshift: 1.0
galaxy-redshifts: !skypy.galaxies.redshift.smail
  z_median: $median-redshift
  alpha: 1.5
  beta: 2.0
  size: 1000

The Pipeline class can be used to load the configuration file and run the resulting pipeline. If the configuration defines a parameters section, the definition can be accessed and individual parameter values can be changed for individual executions of the pipeline:

import matplotlib.pyplot as plt
from skypy.pipeline import Pipeline

# read the example pipeline
pipeline = Pipeline.read('examples/pipeline.yml')

# run the pipeline as given
pipeline.execute()

# plot the results for the given parameters
plt.hist(pipeline['galaxy-redshifts'], histtype='step', density=True,
         label='{:.2f}'.format(pipeline.parameters['median-redshift']))

# change the median redshift parameter in a loop
for z in [1.2, 1.4, 1.6, 1.8, 2.0]:

    # median redshift parameter
    parameters = {'median-redshift': z}

    # run pipeline with updated parameters
    pipeline.execute(parameters)

    # plot the new results
    plt.hist(pipeline['galaxy-redshifts'], histtype='step', density=True,
             label='{:.2f}'.format(parameters['median-redshift']))

# show plot labels
plt.legend()
plt.xlabel('redshift')

Reference/API

skypy.pipeline Package

This module provides methods to pipeline together multiple models with dependencies and handle their outputs.

Functions

load_skypy_yaml(filename)

Read a SkyPy pipeline configuration from a YAML file.

Classes

Pipeline(configuration)

Class for running pipelines.

Class Inheritance Diagram

Inheritance diagram of skypy.pipeline._pipeline.Pipeline