Classical scripting#
After initialising a new project directory, a number of
processing steps can be applied, each implemented in a separate subcommand of
the yaw_cli script:
$ yaw_cli [subcommand]
Each subcommand prints an overview of its command line arguments when
invoked with yaw_cli [subcommand] -h or yaw_cli [subcommand] --help. A
summary of these is provided in the sections below.
Execution order#
Many subcommands depend on outputs from previous steps, so they should be called in a specific order:
1. Project setup: init (always required)
2. Counting pairs with cross and/or auto (additionally ztrue on simulations)
3. Removing cached data with cache --drop, estimating redshifts from pair counts with zcc
4. Creating check plots: plot.
The order of commands within each of the groups above does not matter and,
except for init, none of the steps are required.
Note
If a subcommand finds no input data at all, a warning is issued and the process exits normally.
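The ordering above can be sketched with a small Python helper that only assembles the command lines and never runs them. Several things here are assumptions for illustration: the project path, the input file and the column names are hypothetical, the project is assumed to have been initialised with init beforehand, and the directory is placed directly after the subcommand on the assumption that the parser accepts positional arguments before options.

```python
import shlex

# Subcommand groups in the order listed above. "init" is assumed to have been
# run already; its arguments are not covered in this section.
STAGES = [
    ["cross", "auto", "ztrue"],  # pair counting, any order within the group
    ["cache", "zcc"],            # cache handling and redshift estimates
    ["plot"],                    # check plots
]


def ordered_commands(project, requested, extra_args=None):
    """Return command-line strings for the requested subcommands, stage by stage."""
    extra_args = extra_args or {}
    known = {name for stage in STAGES for name in stage}
    unsupported = sorted(set(requested) - known)
    if unsupported:
        raise ValueError(f"unsupported subcommands: {unsupported}")
    commands = []
    for stage in STAGES:
        for name in stage:
            if name in requested:
                # The directory is placed first so that multi-value options
                # such as --unk-path cannot swallow it.
                argv = ["yaw_cli", name, project, *extra_args.get(name, [])]
                commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    # Hypothetical input file and column names.
    cross_args = ["--unk-path", "unknown.fits", "--unk-ra", "ra", "--unk-dec", "dec"]
    for line in ordered_commands("./my_project", ["plot", "zcc", "cross"],
                                 {"cross": cross_args}):
        print(line)
```

The helper sorts whatever subset is requested into the documented stage order, so the caller may list the steps in any order.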
yaw_cli cross#
Description |
Responsible for computing crosscorrelations by counting pairs between
the reference and unknown samples in bins of redshift and storing the
counts. Since the main parameters are already configured with
The unknown sample is specified by providing one or more input
paths (e.g. to process tomographic bins) with
Similarly, the random sample(s), one for each input catalogue in
Note
If weights are provided, the total sum of weights in each subset is
stored in the special file |
Inputs |
Reference data (and random) sample, unknown data (and random) sample(s). |
Outputs |
Pair counts between reference and unknown sample(s). Stored per patch
and redshift bin as HDF5 files, one file for each unknown sample subset
and scale, at |
Depends on |
— |
Dependants |
|
Note
It is possible to provide redshift point estimates (--unk-z /
--rand-z), e.g. when using simulated data; however, these are only
relevant for the auto and ztrue subcommands.
$ yaw_cli cross --help
usage: yaw_cli cross [-h] [-v] [--threads <int>] [--progress] [--rr]
--unk-path <file> [<file> ...] --unk-ra <str> --unk-dec
<str> [--unk-z <str>] [--unk-w <str>] [--unk-patch <str>]
[--unk-idx <int> [<int> ...]] [--unk-cache]
[--rand-path <file> [<file> ...]] [--rand-ra <str>]
[--rand-dec <str>] [--rand-z <str>] [--rand-w <str>]
[--rand-patch <str>] [--rand-idx <int> [<int> ...]]
[--rand-cache]
<directory>
Specify the unknown data sample(s) and optionally randoms. Measure the angular
cross-correlation function amplitude with the reference sample in bins of
redshift.
positional arguments:
<directory> project directory, must exist
options:
-h, --help show this help message and exit
-v, --verbose show additional information in terminal, repeat to
show debug messages
--threads <int> number of threads to use (default: from configuration)
--progress show a progress bar if the backend supports it
--rr compute random-random pair counts if both randoms are
available
unknown (data):
specify the unknown (data) input file
--unk-path <file> [<file> ...]
(list of) input file paths (e.g. if the data sample is
binned tomographically)
--unk-ra <str> column name of right ascension
--unk-dec <str> column name of declination
--unk-z <str> column name of redshift
--unk-w <str> column name of object weight
--unk-patch <str> column name of patch assignment index
--unk-idx <int> [<int> ...]
integer index to identify the input files (or bins)
provided with [--unk-path] (default: 1, 2, ...)
--unk-cache cache the data in the project's cache directory
unknown (random):
specify the unknown (random) input file (optional)
--rand-path <file> [<file> ...]
(list of) input file paths (e.g. if the data sample is
binned tomographically)
--rand-ra <str> column name of right ascension
--rand-dec <str> column name of declination
--rand-z <str> column name of redshift
--rand-w <str> column name of object weight
--rand-patch <str> column name of patch assignment index
--rand-idx <int> [<int> ...]
integer index to identify the input files (or bins)
provided with [--rand-path] (default: 1, 2, ...)
--rand-cache cache the data in the project's cache directory
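As an illustration of how the options listed in the help text combine for tomographic bins, the following sketch assembles a cross command line in Python without executing it. The file names and column names are hypothetical. The checks that the number of indices and random catalogues matches the number of input files are assumptions derived from the descriptions above, not documented behaviour, and the randoms are assumed to use the same column names as the data.

```python
import shlex


def build_cross_command(project, unk_paths, ra, dec, *, z=None, indices=None,
                        rand_paths=None, cache=False):
    """Assemble a `yaw_cli cross` argument list (nothing is executed)."""
    if not unk_paths:
        raise ValueError("at least one unknown input file is required")
    # Assumption: one index and one random catalogue per unknown input file.
    if indices is not None and len(indices) != len(unk_paths):
        raise ValueError("need exactly one index per unknown input file")
    if rand_paths is not None and len(rand_paths) != len(unk_paths):
        raise ValueError("need one random catalogue per unknown input file")

    # Directory first, so multi-value options cannot swallow it.
    argv = ["yaw_cli", "cross", project]
    argv += ["--unk-path", *unk_paths, "--unk-ra", ra, "--unk-dec", dec]
    if z is not None:
        argv += ["--unk-z", z]  # only relevant for the auto and ztrue steps
    if indices is not None:
        argv += ["--unk-idx", *map(str, indices)]
    if cache:
        argv.append("--unk-cache")
    if rand_paths is not None:
        # Assumption: randoms share the column names of the data catalogue.
        argv += ["--rand-path", *rand_paths, "--rand-ra", ra, "--rand-dec", dec]
        if z is not None:
            argv += ["--rand-z", z]
        if cache:
            argv.append("--rand-cache")
    return argv


if __name__ == "__main__":
    # Hypothetical tomographic bin files and column names.
    argv = build_cross_command("./my_project", ["bin1.fits", "bin2.fits"],
                               "ra", "dec", indices=[1, 2], cache=True)
    print(shlex.join(argv))
```

Returning a list rather than a string keeps the arguments unambiguous if the result is later passed to a process launcher.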
yaw_cli auto#
Description |
Responsible for computing autocorrelations in bins of redshift by
counting pairs in the reference or unknown sample(s) and storing the
counts. This subcommand accepts just a few arguments, most importantly
|
Inputs |
Either reference data and random sample, or unknown data and random sample(s). |
Outputs |
Autocorrelation pair counts for the reference (and possibly unknown)
sample(s). Stored per patch and redshift bin as HDF5 files and for each
scale. When computing the reference sample autocorrelation, data is
stored at |
Depends on |
|
Dependants |
|
Note
When computing the unknown sample autocorrelation, --unk-z and
--rand-z must have been provided when specifying the unknown sample with the
cross subcommand.
$ yaw_cli auto --help
usage: yaw_cli auto [-h] [-v] [--threads <int>] [--progress]
[--which {ref,unk}] [--no-rr]
<directory>
Measure the angular autocorrelation function amplitude of the reference
sample. Can be applied to the unknown sample if redshift point-estimates are
available.
positional arguments:
<directory> project directory, must exist
options:
-h, --help show this help message and exit
-v, --verbose show additional information in terminal, repeat to show
debug messages
--threads <int> number of threads to use (default: from configuration)
--progress show a progress bar if the backend supports it
--which {ref,unk} for which sample the autocorrelation should be computed
(default: ref, requires redshifts [--*-z] for data and
random sample)
--no-rr do not compute random-random pair counts
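A minimal sketch of assembling the auto command for both samples, using only the flags shown in the help text. The project path is hypothetical and nothing is executed. The unknown-sample variant assumes that redshift columns were supplied earlier in the cross step.

```python
import shlex


def build_auto_command(project, which="ref", *, rr=True, threads=None):
    """Assemble a `yaw_cli auto` argument list (nothing is executed)."""
    if which not in ("ref", "unk"):
        raise ValueError("which must be 'ref' or 'unk'")
    argv = ["yaw_cli", "auto", project, "--which", which]
    if not rr:
        argv.append("--no-rr")  # skip random-random pair counts
    if threads is not None:
        argv += ["--threads", str(threads)]
    return argv


if __name__ == "__main__":
    # Hypothetical project directory; one call per sample.
    for sample in ("ref", "unk"):
        print(shlex.join(build_auto_command("./my_project", sample, threads=4)))
```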
yaw_cli ztrue#
Description |
Computes histograms of the true redshift distribution of the unknown
sample(s) if a redshift column ( |
Inputs |
Unknown data sample(s). |
Outputs |
Histogram counts, samples and a covariance, stored as ASCII files with
file extensions |
Depends on |
|
Dependants |
|
$ yaw_cli ztrue --help
usage: yaw_cli ztrue [-h] [-v] [--threads <int>] [--progress] <directory>
Compute the redshift distributions of the unknown data sample(s), which
requires providing point-estimate redshifts for the catalog.
positional arguments:
<directory> project directory, must exist
options:
-h, --help show this help message and exit
-v, --verbose show additional information in terminal, repeat to show
debug messages
--threads <int> number of threads to use (default: from configuration)
--progress show a progress bar if the backend supports it
yaw_cli cache#
Prints a summary of the data catalogues stored in the cache directory. With
the --drop flag, the cached data catalogues are deleted.
Warning
After running yaw_cli cache --drop, cross, auto, and ztrue are no longer
available if they require catalogues that were loaded using the
--*-cache flags.
$ yaw_cli cache --help
usage: yaw_cli cache [-h] [-v] [--drop] <directory>
Get a summary of the project's cache directory (location, size, etc.) or
remove entries with --drop.
positional arguments:
<directory> project directory, must exist
options:
-h, --help show this help message and exit
-v, --verbose show additional information in terminal, repeat to show debug
messages
--drop drop all cache entries
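Because dropping the cache makes cached catalogues unavailable to the pair-counting steps, one way to avoid surprises is to schedule the cleanup only after those steps. The sketch below, with a hypothetical project path and a hypothetical helper name, appends cache --drop to a list of already assembled counting commands and rejects anything else. It only builds argument lists and runs nothing.

```python
import shlex

COUNTING_STEPS = {"cross", "auto", "ztrue"}


def with_cache_cleanup(project, counting_commands):
    """Return the counting commands followed by a final `cache --drop`."""
    for argv in counting_commands:
        if len(argv) < 2 or argv[0] != "yaw_cli" or argv[1] not in COUNTING_STEPS:
            raise ValueError(f"not a pair-counting command: {argv}")
    # The cleanup goes last, after every step that may need cached catalogues.
    return [*counting_commands, ["yaw_cli", "cache", project, "--drop"]]


if __name__ == "__main__":
    # Hypothetical project directory.
    steps = [["yaw_cli", "auto", "./my_project"],
             ["yaw_cli", "ztrue", "./my_project"]]
    for argv in with_cache_cleanup("./my_project", steps):
        print(shlex.join(argv))
```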
yaw_cli zcc#
Description |
Converts pair counts to correlation function estimates for each measurement scale. Produces clustering redshift estimates and stores them as ASCII files. The outputs depend on the available inputs:
The command’s arguments specify the correlation estimator used to
convert pair counts to correlation functions. Other arguments specify
the spatial resampling method used for uncertainty and covariance estimates.
By default, all autocorrelation function data is used for bias
mitigation. To omit correcting for the reference or unknown sample's
bias, the flags
Note
The script can be run multiple times with different arguments. Each
run can be tagged using the |
Inputs |
Pair count files produced by |
Outputs |
Clustering redshift estimates, samples and a covariance, stored as ASCII
files with file extensions |
Depends on |
|
Dependants |
|
$ yaw_cli zcc --help
usage: yaw_cli zcc [-h] [-v] [--tag TAG] [--no-bias-ref] [--no-bias-unk]
[--est-cross {PH,DP,HM,LS}] [--est-auto {PH,DP,HM,LS}]
[--method {jackknife,bootstrap}] [--no-crosspatch]
[--n-boot <int>] [--global-norm] [--seed <int>]
<directory>
Compute clustering redshift estimates for the unknown data sample(s),
optionally mitigating galaxy bias estimated from any measured autocorrelation
function.
positional arguments:
<directory> project directory, must exist
options:
-h, --help show this help message and exit
-v, --verbose show additional information in terminal, repeat to
show debug messages
--tag TAG unique identifier for different configurations
(default: fid)
--no-bias-ref whether to mitigate the reference sample bias using
its autocorrelation function (if available)
--no-bias-unk whether to mitigate the unknown sample bias using its
autocorrelation functions (if available)
correlation estimators:
configure estimators for the different types of correlation functions
--est-cross {PH,DP,HM,LS}
correlation estimator for crosscorrelations (default:
LS or DP)
--est-auto {PH,DP,HM,LS}
correlation estimator for autocorrelations (default:
LS or DP)
resampling:
configure the resampling used for covariance estimates
--method {jackknife,bootstrap}
resampling method for covariance estimates (default:
jackknife)
--no-crosspatch whether to include cross-patch pair counts when
resampling
--n-boot <int> number of bootstrap samples (default: 500)
--global-norm normalise pair counts globally instead of patch-wise
--seed <int> random seed for bootstrap sample generation (default:
12345)
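Since the subcommand can be run several times with different settings, a tagged set of runs can be sketched as follows. The flag names and choices are taken from the help text above; the project path, the tag names other than the default fid, and the chosen values are hypothetical. Only argument lists are assembled and nothing is executed.

```python
import shlex

ESTIMATORS = ("PH", "DP", "HM", "LS")
METHODS = ("jackknife", "bootstrap")


def build_zcc_command(project, tag="fid", *, method="jackknife", n_boot=None,
                      seed=None, est_cross=None, est_auto=None,
                      bias_ref=True, bias_unk=True):
    """Assemble a `yaw_cli zcc` argument list (nothing is executed)."""
    if method not in METHODS:
        raise ValueError(f"method must be one of {METHODS}")
    for est in (est_cross, est_auto):
        if est is not None and est not in ESTIMATORS:
            raise ValueError(f"estimator must be one of {ESTIMATORS}")
    argv = ["yaw_cli", "zcc", project, "--tag", tag, "--method", method]
    if n_boot is not None:
        argv += ["--n-boot", str(n_boot)]
    if seed is not None:
        argv += ["--seed", str(seed)]
    if est_cross is not None:
        argv += ["--est-cross", est_cross]
    if est_auto is not None:
        argv += ["--est-auto", est_auto]
    if not bias_ref:
        argv.append("--no-bias-ref")  # skip reference bias mitigation
    if not bias_unk:
        argv.append("--no-bias-unk")  # skip unknown bias mitigation
    return argv


if __name__ == "__main__":
    # Hypothetical project directory and tags, one run per configuration.
    runs = [
        build_zcc_command("./my_project"),
        build_zcc_command("./my_project", "boot", method="bootstrap", n_boot=1000),
        build_zcc_command("./my_project", "nobias", bias_ref=False, bias_unk=False),
    ]
    for argv in runs:
        print(shlex.join(argv))
```

Giving each configuration its own tag keeps the runs distinguishable from one another.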
yaw_cli plot#
Description |
Generates automatic check plots of the clustering redshift estimates and
sample autocorrelations as a function of redshift. If available, adds the
measured true redshift distributions from |
Inputs |
Correlation function and clustering redshift estimates produced by
|
Outputs |
Check plots in the |
Depends on |
|
Dependants |
— |
$ yaw_cli plot --help
usage: yaw_cli plot [-h] [-v] <directory>
Plot the autocorrelations and redshift estimates into the 'estimate'
directory.
positional arguments:
<directory> project directory, must exist
options:
-h, --help show this help message and exit
-v, --verbose show additional information in terminal, repeat to show debug
messages