yaw.correlation.CorrFunc

class yaw.correlation.CorrFunc(dd: NormalisedCounts, dr: NormalisedCounts | None = None, rd: NormalisedCounts | None = None, rr: NormalisedCounts | None = None)

Bases: PatchedQuantity, BinnedQuantity, HDFSerializable

Container object for measured correlation pair counts.

Container returned by correlate(), which computes the correlations between data catalogs. The correlation function can be computed from four kinds of pair counts: data-data (DD), data-random (DR), random-data (RD), and random-random (RR).

Note

DD is always required, but DR, RD, and RR are optional as long as at least one is provided.

Provides methods to read and write data to disk and compute the actual correlation function values (see CorrData) using spatial resampling (see ResamplingConfig).
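As a rough illustration of what spatial resampling means here, the sketch below runs a leave-one-patch-out jackknife on per-patch pair count totals for a single redshift bin. It is a simplification and not yaw's implementation: the helper name `jackknife_davis_peebles` and the flat per-patch lists are assumptions made for this example, and the Davis-Peebles form DD/DR - 1 is used as the estimator.

```python
import math


def jackknife_davis_peebles(dd_patches, dr_patches):
    """Leave-one-patch-out jackknife of the Davis-Peebles estimator DD/DR - 1.

    Hypothetical helper: takes per-patch normalised pair count totals for one
    redshift bin and returns (estimate, standard_error).
    """
    n = len(dd_patches)
    if n != len(dr_patches) or n < 2:
        raise ValueError("need at least two patches with matching counts")
    dd_total, dr_total = sum(dd_patches), sum(dr_patches)
    estimate = dd_total / dr_total - 1.0
    # one sample per patch: recompute the estimator with that patch removed
    samples = [
        (dd_total - dd) / (dr_total - dr) - 1.0
        for dd, dr in zip(dd_patches, dr_patches)
    ]
    mean = sum(samples) / n
    # the jackknife variance carries the factor (n - 1) / n
    variance = (n - 1) / n * sum((s - mean) ** 2 for s in samples)
    return estimate, math.sqrt(variance)


estimate, error = jackknife_davis_peebles([12.0, 10.0, 11.0], [10.0, 10.0, 10.0])
```

With the three toy patches above, the estimate is 0.1 and the jackknife error is about 0.058.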

The container supports comparison with == and != at the pair count level. The supported arithmetic operations between two correlation functions, addition and subtraction, are applied to all internally stored pair counts. The same applies to rescaling the counts by a scalar.
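The behaviour described above can be pictured with a toy container. `ToyCorrFunc` is a hypothetical stand-in that stores one tuple of counts per pair count type; it only illustrates that each operation is forwarded to every stored count and does not reflect how yaw stores its data.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyCorrFunc:
    """Hypothetical stand-in: one tuple of counts per pair count type."""

    dd: Tuple[float, ...]
    dr: Tuple[float, ...]

    def __add__(self, other):
        # addition is applied to every stored pair count type
        return ToyCorrFunc(
            dd=tuple(a + b for a, b in zip(self.dd, other.dd)),
            dr=tuple(a + b for a, b in zip(self.dr, other.dr)),
        )

    def __sub__(self, other):
        # subtraction is applied to every stored pair count type
        return ToyCorrFunc(
            dd=tuple(a - b for a, b in zip(self.dd, other.dd)),
            dr=tuple(a - b for a, b in zip(self.dr, other.dr)),
        )

    def __mul__(self, scalar):
        # rescaling by a scalar also touches every stored pair count type
        return ToyCorrFunc(
            dd=tuple(a * scalar for a in self.dd),
            dr=tuple(a * scalar for a in self.dr),
        )


a = ToyCorrFunc(dd=(4.0, 6.0), dr=(2.0, 3.0))
b = ToyCorrFunc(dd=(1.0, 1.0), dr=(1.0, 1.0))
total = a + b
doubled = a * 2
```

The dataclass-generated `__eq__` compares the stored counts field by field, which mirrors comparison at the pair count level.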

Examples

Create a new instance from existing pair counts:

>>> import yaw
>>> from yaw.examples import w_sp
>>> dd, dr = w_sp.dd, w_sp.dr  # get example data-data and data-rand counts
>>> corr = yaw.CorrFunc(dd=dd, dr=dr)
>>> corr
CorrFunc(n_bins=30, z='0.070...1.420', dd=True, dr=True, rd=False, rr=False, n_patches=64)

Access the pair counts:

>>> corr.dd
NormalisedCounts(n_bins=30, z='0.070...1.420', n_patches=64)

Check if it is an autocorrelation function measurement:

>>> corr.auto
False

Check which pair counts are available to compute the correlation function:

>>> corr.estimators
{'DP': yaw.correlation.estimators.DavisPeebles}

Sample the correlation function:

>>> corr.sample()  # uses the default ResamplingConfig
CorrData(n_bins=30, z='0.070...1.420', n_samples=64, method='jackknife')

Note how the indicated shape changes when a patch subset is selected:

>>> corr.patches[:10]
CorrFunc(n_bins=30, z='0.070...1.420', dd=True, dr=True, rd=False, rr=False, n_patches=10)

Note how the indicated redshift range and shape change when a bin subset is selected:

>>> corr.bins[:3]
CorrFunc(n_bins=3, z='0.070...0.205', dd=True, dr=True, rd=False, rr=False, n_patches=64)

Parameters:

dd (NormalisedCounts) – Pair counts from a data-data count measurement.
dr (NormalisedCounts, optional) – Pair counts from a data-random count measurement.
rd (NormalisedCounts, optional) – Pair counts from a random-data count measurement.
rr (NormalisedCounts, optional) – Pair counts from a random-random count measurement.

Methods

`__init__`(dd[, dr, rd, rr])
`concatenate_bins`(*cfs)	Concatenate pair count data containers with equal patches.
`concatenate_patches`(*cfs)	Concatenate pair count data containers with equal redshift binning.
`from_file`(path)	Create a class instance by deserialising data from a HDF5 file.
`from_hdf`(source)	Create a class instance by deserialising data from a HDF5 group.
`get`(*args, **kwargs)	Deprecated since version 2.3.1: Renamed to sample().
`get_binning`()	Get the underlying, exact redshift bin intervals.
`is_compatible`(other[, require])	Check whether this instance is compatible with another instance.
`sample`([config, estimator, info])	Compute the correlation function from the stored pair counts, including an error estimate from spatial resampling of patches.
`to_file`(path)	Serialise the class instance to a new HDF5 file.
`to_hdf`(dest)	Serialise the class instance into an existing HDF5 group.

Attributes

`auto`	Whether the stored data are from an autocorrelation measurement.
`bins`	An `Indexer` attribute that supports iteration over the bins or selecting a subset of the bins.
`closed`	Specifies on which side the redshift bin intervals are closed, can be: `left`, `right`, `both`, `neither`.
`dr`	Pair counts from a data-random count measurement.
`dz`	Get the width of the redshift bins as array.
`edges`	Get the edges of the redshift bins as flat array.
`estimators`	Get a listing of correlation estimators implemented, depending on which pair counts are available.
`mids`	Get the centers of the redshift bins as array.
`n_bins`	Get the number of redshift bins.
`n_patches`	Get the number of spatial patches.
`patches`	An `Indexer` attribute that supports iteration over the spatial patches or selecting a subset of the patches.
`rd`	Pair counts from a random-data count measurement.
`rr`	Pair counts from a random-random count measurement.
`dd`	Pair counts for a data-data correlation measurement.

property auto: bool

Whether the stored data are from an autocorrelation measurement.

property bins: Indexer[int | slice | Sequence, CorrFunc]

An Indexer attribute that supports iteration over the bins or selecting a subset of the bins.

The indexer always returns new container instances with the indexed data subset or the current item when iterating.

Warning

Indexing rules for a one-dimensional numpy array apply, however if the resulting binning is not contiguous or contains repeated bins, some operations on the returned container may fail.

Returns:: yaw.core.containers.Indexer
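To see why the warning above matters, the following sketch mimics one-dimensional array indexing on a list of bin intervals and checks whether the selection is still contiguous. `select_bins` and `is_contiguous` are hypothetical helpers written for this illustration; they are not part of yaw.

```python
def select_bins(edges, index):
    """Return the (left, right) intervals picked by an int, slice, or list index.

    Hypothetical helper mimicking one-dimensional array indexing on bins.
    """
    intervals = list(zip(edges[:-1], edges[1:]))
    if isinstance(index, slice):
        return intervals[index]
    if isinstance(index, int):
        return [intervals[index]]
    return [intervals[i] for i in index]


def is_contiguous(intervals):
    """True if every interval starts exactly where the previous one ends."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(intervals, intervals[1:]))


edges = [0.0, 0.5, 1.0, 1.5, 2.0]
first_two = select_bins(edges, slice(0, 2))  # neighbouring bins
gapped = select_bins(edges, [0, 2])          # skips the bin (0.5, 1.0)
repeated = select_bins(edges, [1, 1])        # the same bin twice
```

Only `first_two` passes `is_contiguous`; the gapped and repeated selections are the cases in which later operations on the returned container may fail.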

property closed: str

Specifies on which side the redshift bin intervals are closed, can be: left, right, both, neither.

concatenate_bins(*cfs: CorrFunc) → CorrFunc

Concatenate pair count data containers with equal patches.

The data is merged by appending the data along the redshift binning axis.

Note

Necessary condition for merging is that the patch numbers are identical and that the merged binning is contiguous and non-overlapping. Cannot merge cross- with autocorrelation containers.

Parameters:: *cfs – Containers of same type that are appended along the redshift binning axis of this container.
Returns:: New instance of this container with combined data.
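The contiguity condition in the note above can be sketched on plain bin edges. `concatenate_edges` is a hypothetical helper for this illustration only; it joins edge sequences and rejects any input that does not start exactly where the previous one ends.

```python
def concatenate_edges(*edge_lists):
    """Join bin edge sequences, requiring each to start where the last ended.

    Hypothetical helper; a gap or an overlap between inputs raises ValueError.
    """
    merged = list(edge_lists[0])
    for edges in edge_lists[1:]:
        if edges[0] != merged[-1]:
            raise ValueError("merged binning is not contiguous or overlaps")
        # skip the shared edge so that it is not duplicated
        merged.extend(edges[1:])
    return merged


low = [0.0, 0.5, 1.0]
high = [1.0, 1.5]
merged = concatenate_edges(low, high)
```

Here `merged` is `[0.0, 0.5, 1.0, 1.5]`, while passing `[1.5, 2.0]` as the second input would raise because the bin between 1.0 and 1.5 is missing.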

concatenate_patches(*cfs: CorrFunc) → CorrFunc

Concatenate pair count data containers with equal redshift binning.

The data is merged by extending the dimension of the patch axes. The resulting data array will be a block matrix of the input data arrays, i.e. all elements that would hold correlations between different inputs are set to zero.

Note

Necessary condition for merging is that the redshift binning of all inputs is identical. Cannot merge cross- with autocorrelation containers.

Parameters:: *cfs – Containers of same type that are appended to the patch dimension of this container.
Returns:: New instance of this container with combined data.
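The block matrix layout described above can be sketched with nested lists for a single redshift bin. `block_diagonal` is a hypothetical helper for this illustration and says nothing about how yaw stores its patch data internally.

```python
def block_diagonal(*blocks):
    """Place square matrices (lists of rows) along the diagonal, zeros elsewhere."""
    size = sum(len(block) for block in blocks)
    result = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                result[offset + i][offset + j] = value
        offset += len(block)
    return result


a = [[1.0, 2.0], [3.0, 4.0]]  # counts between the 2 patches of the first input
b = [[5.0]]                   # counts within the single patch of the second input
merged = block_diagonal(a, b)
```

The result has three patches; every element that would pair a patch from `a` with the patch from `b` stays at zero.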

dd: NormalisedCounts

Pair counts for a data-data correlation measurement.

dr: NormalisedCounts | None = None

Pair counts from a data-random count measurement.

property dz: ndarray[Any, dtype[float64]]

Get the width of the redshift bins as array.

property edges: ndarray[Any, dtype[float64]]

Get the edges of the redshift bins as flat array.

property estimators: dict[str, CorrelationEstimator]

Get a listing of correlation estimators implemented, depending on which pair counts are available.

Returns:: Mapping from correlation estimator name abbreviation to correlation function class.
Return type:: dict

classmethod from_file(path: Path | str) → CorrFunc

Create a class instance by deserialising data from a HDF5 file.

Parameters:: path (pathlib.Path, str) – Path to the HDF5 file that contains the serialised data.
Returns:: HDFSerializable

classmethod from_hdf(source: File | Group) → CorrFunc

Create a class instance by deserialising data from a HDF5 group.

Parameters:: source (h5py.Group) – Group in an opened HDF5 file that contains the serialised data.
Returns:: HDFSerializable

get(*args, **kwargs)

Deprecated since version 2.3.1: Renamed to sample().

get_binning() → IntervalIndex

Get the underlying, exact redshift bin intervals.

Returns:: pandas.IntervalIndex

is_compatible(other: CorrFunc, require: bool = False) → bool

Check whether this instance is compatible with another instance.

Ensures that the redshift binning and the number of patches are identical.

Parameters:

other (BinnedQuantity) – Object instance to compare to.
require (bool) – Raise a ValueError if any of the checks fail.

Returns:

bool
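The two checks and the role of `require` can be sketched as a stand-alone function. `check_compatible` and its flat arguments are hypothetical; the real method compares container instances, and its error messages may differ.

```python
def check_compatible(edges_a, n_patches_a, edges_b, n_patches_b, require=False):
    """Compare redshift binning and patch count of two measurements.

    Hypothetical stand-alone version: returns a bool, or raises ValueError on
    the first failed check when `require` is True.
    """
    if list(edges_a) != list(edges_b):
        if require:
            raise ValueError("redshift binning does not match")
        return False
    if n_patches_a != n_patches_b:
        if require:
            raise ValueError("number of patches does not match")
        return False
    return True


same = check_compatible([0.0, 0.5, 1.0], 64, [0.0, 0.5, 1.0], 64)
fewer_patches = check_compatible([0.0, 0.5, 1.0], 64, [0.0, 0.5, 1.0], 10)
```

With `require=False` a mismatch is reported through the return value, so `fewer_patches` is `False`; with `require=True` the same call would raise instead.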

property mids: ndarray[Any, dtype[float64]]

Get the centers of the redshift bins as array.
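The relation between bin edges, widths, and centers can be sketched with plain lists. The helpers `bin_widths` and `bin_centers` are hypothetical, and the binning is assumed to be linear with 30 bins between z = 0.07 and z = 1.42, which is consistent with the first three bins ending at z = 0.205 in the class examples.

```python
def bin_widths(edges):
    """Width of each bin: right edge minus left edge."""
    return [right - left for left, right in zip(edges[:-1], edges[1:])]


def bin_centers(edges):
    """Center of each bin: mean of its left and right edge."""
    return [(left + right) / 2 for left, right in zip(edges[:-1], edges[1:])]


# assumed linear binning: 30 bins between z = 0.07 and z = 1.42
n_bins = 30
z_min, z_max = 0.07, 1.42
edges = [z_min + (z_max - z_min) * i / n_bins for i in range(n_bins + 1)]
dz = bin_widths(edges)
mids = bin_centers(edges)
```

There is always one more edge than there are bins, so `edges` has 31 entries while `dz` and `mids` have 30 each.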

property n_bins: int

Get the number of redshift bins.

property n_patches: int

Get the number of spatial patches.

property patches: Indexer[int | slice | Sequence, CorrFunc]

An Indexer attribute that supports iteration over the spatial patches or selecting a subset of the patches.

The indexer always returns new container instances with the indexed data subset or the current item when iterating.

Note

Indexing rules for a one-dimensional numpy array apply.

Returns:: yaw.core.containers.Indexer

rd: NormalisedCounts | None = None

Pair counts from a random-data count measurement.

rr: NormalisedCounts | None = None

Pair counts from a random-random count measurement.

sample(config: ResamplingConfig | None = None, *, estimator: str | None = None, info: str | None = None) → CorrData

Compute the correlation function from the stored pair counts, including an error estimate from spatial resampling of patches.

Parameters:

config (ResamplingConfig) – Specify the resampling method and its configuration.

Keyword Arguments:

estimator (str, optional) – The name abbreviation for the correlation estimator to use. Defaults to Landy-Szalay if RR is available, otherwise to Davis-Peebles.
info (str, optional) – Descriptive text passed on to the output CorrData object.

Returns:

Correlation function data, including redshift binning, function values and samples.

Return type:

CorrData
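The default estimator choice described above can be sketched for a single bin of normalised counts. The formulas are the standard Davis-Peebles (DD/DR - 1) and Landy-Szalay ((DD - DR - RD + RR)/RR) forms. The function names, the abbreviation "LS", and the fallback of reusing DR when RD is missing are assumptions of this sketch, not confirmed details of yaw.

```python
def davis_peebles(dd, dr):
    """Davis-Peebles estimator: DD / DR - 1."""
    return dd / dr - 1.0


def landy_szalay(dd, dr, rr, rd=None):
    """Landy-Szalay estimator: (DD - DR - RD + RR) / RR."""
    if rd is None:
        rd = dr  # assumption of this sketch: reuse DR when RD was not measured
    return (dd - dr - rd + rr) / rr


def sample_value(dd, dr=None, rd=None, rr=None, estimator=None):
    """Pick Landy-Szalay if RR is available, otherwise Davis-Peebles."""
    if estimator is None:
        estimator = "LS" if rr is not None else "DP"
    if dr is None:
        raise ValueError("this sketch needs DR counts for both estimators")
    if estimator == "DP":
        return davis_peebles(dd, dr)
    if estimator == "LS":
        if rr is None:
            raise ValueError("Landy-Szalay requires RR counts")
        return landy_szalay(dd, dr, rr, rd)
    raise ValueError(f"unknown estimator: {estimator!r}")


w_dp = sample_value(dd=1.5, dr=1.0)                    # falls back to Davis-Peebles
w_ls = sample_value(dd=1.5, dr=1.0, rd=1.25, rr=1.0)   # RR present: Landy-Szalay
```

For these toy counts `w_dp` is 0.5 and `w_ls` is 0.25; passing `estimator="DP"` explicitly would force the Davis-Peebles form even when RR is present.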

to_file(path: Path | str) → None

Serialise the class instance to a new HDF5 file.

Parameters:: path (pathlib.Path, str) – Path at which the HDF5 file is created.

to_hdf(dest: File | Group) → None

Serialise the class instance into an existing HDF5 group.

Parameters:: dest (h5py.Group) – Group in which the serialised data structures are created.