yaw.correlation.CorrData
- class yaw.correlation.CorrData(binning: IntervalIndex, data: NDArray, samples: NDArray, method: str, info: str | None = None)
Bases: SampledData
Container class for sampled correlation function data.
Contains the redshift binning, correlation function amplitudes, and resampled amplitudes (e.g. jackknife or bootstrap). The resampled values are used to compute error estimates and covariance/correlation matrices. Provides some plotting methods for convenience.
The comparison, addition, subtraction, and indexing rules are inherited from SampledData; see the examples below.
Examples
Create a new instance by sampling a correlation function:
>>> from yaw.examples import w_sp
>>> data = w_sp.sample()  # uses the default ResamplingConfig
>>> data
CorrData(n_bins=30, z='0.070...1.420', n_samples=64, method='jackknife')
View the data for a subset of the redshift bins:
>>> data.bins[5:9].data
array([0.10158809, 0.08079947, 0.03876175, 0.02715336])
View the same subset as a series:
>>> data.bins[5:9].get_data()
(0.295, 0.34]     0.101588
(0.34, 0.385]     0.080799
(0.385, 0.43]     0.038762
(0.43, 0.475]     0.027153
dtype: float64
Get the redshift bin centers for these bins:
>>> data.bins[5:9].mids
array([0.3175, 0.3625, 0.4075, 0.4525])
- Parameters:
binning (pandas.IntervalIndex) – The redshift bin edges used for this correlation function.
data (NDArray) – The correlation function values.
samples (NDArray) – The resampled correlation function values.
method (str) – The resampling method used, see ResamplingConfig for available options.
info (str, optional) – Descriptive text included in the headers of output files produced by CorrData.to_files().
Methods
__init__(binning, data, samples, method[, info])
concatenate_bins(*data) – Concatenate pair count data containers with equal patches.
from_files(path_prefix) – Create a new instance by loading the data from ASCII files.
get_binning() – Get the underlying, exact redshift bin intervals.
get_correlation() – Get value correlation matrix as data frame with its corresponding redshift bin intervals as index and column labels.
get_covariance() – Get value covariance matrix as data frame with its corresponding redshift bin intervals as index and column labels.
get_data() – Get the data as pandas.Series with the binning as index.
get_error() – Get value error estimate (diagonal of covariance matrix) as series with its corresponding redshift bin intervals as index.
get_samples() – Get the data as pandas.DataFrame with the binning as index.
is_compatible(other[, require]) – Check whether this instance is compatible with another instance.
plot(*[, color, label, error_bars, ax, ...]) – Create a plot of the correlation data as a function of redshift.
plot_corr(*[, redshift, cmap, ax]) – Plot the correlation matrix of the data.
to_files(path_prefix) – Store the data in a set of ASCII files on disk.
Attributes
bins – An Indexer attribute that supports iteration over the bins or selecting a subset of the bins.
closed – Specifies on which side the redshift bin intervals are closed, can be: left, right, both, neither.
dz – Get the width of the redshift bins as array.
edges – Get the edges of the redshift bins as flat array.
error – The uncertainty (standard error) of the data.
info – Optional descriptive text for the contained data.
mids – Get the centers of the redshift bins as array.
n_bins – Get the number of redshift bins.
n_samples – Number of samples used for error estimate.
binning – The redshift bin intervals.
data – The data values, one for each redshift bin.
samples – Samples of the data values, shape (# samples, # bins).
method – The resampling method used.
covariance – Covariance matrix automatically computed from the resampled values.
- binning: IntervalIndex
The redshift bin intervals.
- property bins: Indexer[int | slice | Sequence, _Tdata]
An Indexer attribute that supports iteration over the bins or selecting a subset of the bins.
The indexer always returns new container instances with the indexed data subset or the current item when iterating.
Warning
Indexing rules for a one-dimensional numpy array apply. However, if the resulting binning is not contiguous or contains repeated bins, some operations on the returned container may fail.
- Returns:
yaw.core.containers.Indexer
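The warning about non-contiguous or repeated bins can be illustrated without yaw. The following is a minimal plain-Python sketch; `select` and `is_contiguous` are hypothetical helpers that only mimic one-dimensional indexing on a list of `(left, right)` edge pairs and are not part of the library.

```python
def is_contiguous(intervals):
    """Return True if each interval starts exactly where the previous one ends."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(intervals, intervals[1:]))


def select(bins, index):
    """Mimic one-dimensional array indexing with a slice or a list of ints."""
    if isinstance(index, slice):
        return bins[index]
    return [bins[i] for i in index]


# Hypothetical binning with four bins, written as (left, right) edge pairs.
bins = [(0.295, 0.34), (0.34, 0.385), (0.385, 0.43), (0.43, 0.475)]

contiguous = select(bins, slice(1, 3))  # two adjacent bins
gapped = select(bins, [0, 2])           # skips the bin in between
repeated = select(bins, [1, 1])         # the same bin twice

for name, subset in [("slice", contiguous), ("gap", gapped), ("repeat", repeated)]:
    print(name, is_contiguous(subset))
```

Only the slice yields a binning in which every bin starts where the previous one ends; the other two selections are the cases the warning refers to.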
- property closed: str
Specifies on which side the redshift bin intervals are closed, can be: left, right, both, neither.
- concatenate_bins(*data: _Tdata) -> _Tdata
Concatenate pair count data containers with equal patches.
The data is merged by appending the data along the redshift binning axis.
Note
Merging requires that the patch numbers are identical and that the merged binning is contiguous and non-overlapping. Cross-correlation and autocorrelation containers cannot be merged with each other.
- Parameters:
*data – Containers of the same type that are appended along the redshift binning axis of this container.
- Returns:
New instance of this container with combined data.
- covariance: NDArray
Covariance matrix automatically computed from the resampled values.
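This page does not state which estimator turns the resampled values into the covariance matrix. As an illustration only, the sketch below implements the textbook jackknife estimator, cov_ij = (N-1)/N * sum_k (x_ki - mean_i)(x_kj - mean_j), in plain Python. `jackknife_covariance` is a hypothetical helper, and the assumption that yaw uses exactly this normalisation is not confirmed by the source.

```python
def jackknife_covariance(samples):
    """Textbook jackknife covariance for samples of shape (n_samples, n_bins)."""
    n_samples = len(samples)
    n_bins = len(samples[0])
    # Mean of each bin over all jackknife samples.
    means = [sum(row[j] for row in samples) / n_samples for j in range(n_bins)]
    # Accumulate the outer products of the deviations from the mean.
    cov = [[0.0] * n_bins for _ in range(n_bins)]
    for row in samples:
        for i in range(n_bins):
            for j in range(n_bins):
                cov[i][j] += (row[i] - means[i]) * (row[j] - means[j])
    # Jackknife normalisation (assumed here, not taken from yaw).
    factor = (n_samples - 1) / n_samples
    return [[factor * value for value in line] for line in cov]


# Made-up samples: 3 jackknife realisations of 2 redshift bins.
samples = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
]
cov = jackknife_covariance(samples)
print(cov)
```

For bootstrap samples the usual normalisation is 1/(N-1) instead, which is one reason the resampling method is stored alongside the samples.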
- data: NDArray
The data values, one for each redshift bin.
- property dz: ndarray[Any, dtype[float64]]
Get the width of the redshift bins as array.
- property edges: ndarray[Any, dtype[float64]]
Get the edges of the redshift bins as flat array.
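The bin properties `edges`, `dz`, `mids`, and `n_bins` are all derived from the same binning. A minimal plain-Python sketch of how they relate, using the edges of the four bins shown in the class examples; the variable names mirror the property names, but nothing here calls yaw.

```python
# Flat array of bin edges for the four bins (0.295, 0.34] ... (0.43, 0.475].
edges = [0.295, 0.34, 0.385, 0.43, 0.475]

# Each bin is bounded by two consecutive edges.
left, right = edges[:-1], edges[1:]

n_bins = len(edges) - 1                                 # one bin fewer than edges
dz = [hi - lo for lo, hi in zip(left, right)]           # bin widths
mids = [(lo + hi) / 2 for lo, hi in zip(left, right)]   # bin centers

print(n_bins)
print([round(width, 4) for width in dz])
print([round(mid, 4) for mid in mids])
```

The rounded centers agree with the `mids` values listed in the class examples for `data.bins[5:9]`.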
- property error: NDArray
The uncertainty (standard error) of the data.
- Returns:
NDArray
- classmethod from_files(path_prefix: Path | str) -> _Tdata
Create a new instance by loading the data from ASCII files.
The data is restored from a set of three input files produced by to_files().
Note
These files have the same name but different file extensions; therefore, provide only the base name without any extension to specify the input files.
- Parameters:
path_prefix (str) – The base name of the input files without any file extension.
- Returns:
- get_binning() -> IntervalIndex
Get the underlying, exact redshift bin intervals.
- Returns:
pandas.IntervalIndex
- get_correlation() -> DataFrame
Get value correlation matrix as data frame with its corresponding redshift bin intervals as index and column labels.
- Returns:
pandas.DataFrame
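The correlation matrix and the error estimate both follow from the covariance matrix. The sketch below uses the standard definitions, error_i = sqrt(cov_ii) and corr_ij = cov_ij / (error_i * error_j), in plain Python. `standard_error` and `correlation_matrix` are hypothetical helpers; that yaw normalises in exactly this way is an assumption.

```python
import math


def standard_error(cov):
    """Square root of the covariance diagonal, one value per bin."""
    return [math.sqrt(cov[i][i]) for i in range(len(cov))]


def correlation_matrix(cov):
    """Normalise the covariance by the outer product of the errors."""
    err = standard_error(cov)
    n = len(cov)
    return [[cov[i][j] / (err[i] * err[j]) for j in range(n)] for i in range(n)]


# Made-up 2x2 covariance matrix for two redshift bins.
cov = [[4.0, 1.0], [1.0, 1.0]]
err = standard_error(cov)
corr = correlation_matrix(cov)
print(err)
print(corr)
```

The diagonal of the result is 1 by construction, and the off-diagonal entries lie between -1 and 1 for any valid covariance matrix.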
- get_covariance() -> DataFrame
Get value covariance matrix as data frame with its corresponding redshift bin intervals as index and column labels.
- Returns:
pandas.DataFrame
- get_data() -> Series
Get the data as pandas.Series with the binning as index.
- get_error() -> Series
Get value error estimate (diagonal of covariance matrix) as series with its corresponding redshift bin intervals as index.
- Returns:
pandas.Series
- get_samples() -> DataFrame
Get the data as pandas.DataFrame with the binning as index. The columns are labelled numerically and each represents one of the samples.
- info: str | None = None
Optional descriptive text for the contained data.
- is_compatible(other: SampledData, require: bool = False) -> bool
Check whether this instance is compatible with another instance.
Ensures that both objects are instances of the same class, that the redshift binning is identical, that the numbers of samples agree, and that the resampling method is identical.
- Parameters:
other (BinnedQuantity) – Object instance to compare to.
require (bool, optional) – Raise a ValueError if any of the checks fail.
- Returns:
bool
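The sequence of checks described for is_compatible() can be sketched as follows. `Sampled` is a hypothetical stand-in that carries only the fields being compared; it is not the yaw implementation, and the order of the checks and the error messages are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Sampled:
    """Hypothetical stand-in for a sampled data container."""

    edges: tuple      # flat redshift bin edges
    n_samples: int    # number of resampled realisations
    method: str       # resampling method, e.g. "jackknife" or "bootstrap"

    def is_compatible(self, other, require=False):
        # Run the checks in order and remember the first one that fails.
        if type(other) is not type(self):
            problem = "object types differ"
        elif other.edges != self.edges:
            problem = "redshift binning differs"
        elif other.n_samples != self.n_samples:
            problem = "number of samples differs"
        elif other.method != self.method:
            problem = "resampling method differs"
        else:
            return True
        if require:
            raise ValueError(problem)
        return False


a = Sampled((0.0, 0.5, 1.0), 64, "jackknife")
b = Sampled((0.0, 0.5, 1.0), 64, "jackknife")
c = Sampled((0.0, 0.5, 1.0), 64, "bootstrap")

print(a.is_compatible(b))
print(a.is_compatible(c))
```

With `require=True` the same mismatch raises a ValueError instead of returning False, which is convenient when incompatible inputs should stop a pipeline early.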
- method: str
The resampling method used.
- property mids: ndarray[Any, dtype[float64]]
Get the centers of the redshift bins as array.
- property n_bins: int
Get the number of redshift bins.
- property n_samples: int
Number of samples used for error estimate.
- plot(*, color: str | NDArray | None = None, label: str | None = None, error_bars: bool = True, ax: Axis | None = None, xoffset: float = 0.0, plot_kwargs: dict[str, Any] | None = None, zero_line: bool = False, scale_by_dz: bool = False) -> Axis
Create a plot of the correlation data as a function of redshift.
Create a new axis or plot to an existing one, add x-axis offsets when plotting multiple instances, or specify whether the values should be represented as points with error bars (default) or as a line plot with a shaded area to represent uncertainties.
- Parameters:
color – Valid matplotlib color used for the error bars or the line and the shaded uncertainty area.
label (str, optional) – Plot label for the legend.
error_bars (bool, optional) – Whether to plot error bars (the default) or a line plot with shaded area.
ax (plot axis, optional) – Optional matplotlib axis to plot into.
xoffset (float, optional) – Shift to apply to the x-axis (redshift) values.
plot_kwargs (dict, optional) – Parameters passed to the errorbar() or plot() plotting functions.
zero_line (bool, optional) – Whether to draw a thin black line that indicates y=0.
scale_by_dz (bool, optional) – Whether to multiply the y-values by the redshift bin width dz.
- plot_corr(*, redshift: bool = False, cmap: str = 'RdBu_r', ax: Axis | None = None) -> Axis
Plot the correlation matrix of the data.
Create a new axis or plot to an existing one.
- Parameters:
redshift (bool, optional) – Whether to map the matrix onto redshifts or as regular matrix plot (the default).
cmap (str, optional) – Name of a matplotlib colormap to use.
ax (plot axis, optional) – Optional matplotlib axis to plot into.
- samples: NDArray
Samples of the data values, shape (# samples, # bins).
- to_files(path_prefix: Path | str) -> None
Store the data in a set of ASCII files on disk.
These files can be loaded with the from_files() method. There are three files with the same name but different file extensions.
Files
[path_prefix].dat: Contains the redshift bin edges, the data values and their standard error. Additionally there is information about the error estimate and the info attribute.
[path_prefix].smp: Contains one row for each redshift bin. The first two columns list the lower and upper edge of the redshift bin, the remaining columns list the values of the samples, i.e. there are N+2 columns. Additionally contains the info attribute.
[path_prefix].cov: Contains the covariance matrix and additionally the info attribute.
- Parameters:
path_prefix (str) – The base name of the output files without any file extension.
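The row layout described for the .smp file (two edge columns followed by one column per sample) can be sketched in plain Python. `smp_rows` is a hypothetical helper and the number formatting is an arbitrary choice; the actual header and formatting written by to_files() are not reproduced here.

```python
def smp_rows(edges, samples):
    """Build one row per bin: lower edge, upper edge, then all sample values.

    edges: flat list of n_bins + 1 bin edges.
    samples: list of rows with shape (n_samples, n_bins).
    """
    per_bin = list(zip(*samples))  # transpose to shape (n_bins, n_samples)
    return [
        [lo, hi, *values]
        for lo, hi, values in zip(edges[:-1], edges[1:], per_bin)
    ]


# Made-up input: 2 redshift bins and N = 3 samples.
edges = [0.0, 0.5, 1.0]
samples = [[0.10, 0.20], [0.11, 0.19], [0.09, 0.21]]

rows = smp_rows(edges, samples)
for row in rows:
    print(" ".join(f"{value:.3f}" for value in row))
```

Each row has N+2 entries, so the samples array, stored as (# samples, # bins), is transposed before it is laid out one bin per row.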