yaw.core.SampledData#
- class yaw.core.SampledData(binning: IntervalIndex, data: NDArray, samples: NDArray, method: str)[source]#
Bases: BinnedQuantity
Container for data and resampled data with redshift binning.
Contains the redshift binning, data vector, and resampled data vector (e.g. jackknife or bootstrap samples). The resampled values are used to compute error estimates and covariance/correlation matrices.
- Parameters:
binning (pandas.IntervalIndex) – The redshift binning applied to the data.
data (NDArray) – The data values, one for each redshift bin.
samples (NDArray) – The resampled data values (e.g. jackknife or bootstrap samples).
method (str) – The resampling method used, see ResamplingConfig for available options.
The container supports addition and subtraction, which return a new instance of the container holding the modified data. This requires that both operands are compatible (same binning and same sampling). The operations are applied to the data and samples attributes.
Furthermore, the container supports indexing and iteration over the redshift bins using the SampledData.bins attribute. This attribute yields instances of SampledData containing a single bin when iterating. Slicing and indexing follow the same rules as the underlying data NDArray. Refer to CorrData for some indexing and iteration examples.
Examples
Create a redshift binning:
>>> import pandas as pd
>>> bins = pd.IntervalIndex.from_breaks([0.1, 0.2, 0.3])
>>> bins
IntervalIndex([(0.1, 0.2], (0.2, 0.3]], dtype='interval[float64, right]')
Create some sample data for the bins with value 1 and five assumed jackknife samples that are normally distributed around 1:
>>> import numpy as np
>>> n_bins, n_samples = len(bins), 5
>>> data = np.ones(n_bins)
>>> samples = np.random.normal(1.0, size=(n_samples, n_bins))
Create the container:
>>> import yaw
>>> values = yaw.core.SampledData(bins, data, samples, method="jackknife")
>>> values
SampledData(n_bins=2, z='0.100...0.300', n_samples=5, method='jackknife')
Add the container to itself and verify that the values are doubled:
>>> summed = values + values
>>> summed.data
array([2., 2.])
The same applies to the samples:
>>> summed.samples / values.samples
array([[2., 2.],
       [2., 2.],
       [2., 2.],
       [2., 2.],
       [2., 2.]])
Methods
__init__(binning, data, samples, method)
concatenate_bins(*data) – Concatenate pair count data containers with equal patches.
get_binning() – Get the underlying, exact redshift bin intervals.
get_correlation() – Get value correlation matrix as data frame with its corresponding redshift bin intervals as index and column labels.
get_covariance() – Get value covariance matrix as data frame with its corresponding redshift bin intervals as index and column labels.
get_data() – Get the data as pandas.Series with the binning as index.
get_error() – Get value error estimate (diagonal of covariance matrix) as series with its corresponding redshift bin intervals as index.
get_samples() – Get the data as pandas.DataFrame with the binning as index.
is_compatible(other[, require]) – Check whether this instance is compatible with another instance.
Attributes
bins – An Indexer attribute that supports iteration over the bins or selecting a subset of the bins.
closed – Specifies on which side the redshift bin intervals are closed, can be: left, right, both, neither.
dz – Get the width of the redshift bins as array.
edges – Get the edges of the redshift bins as flat array.
error – The uncertainty (standard error) of the data.
mids – Get the centers of the redshift bins as array.
n_bins – Get the number of redshift bins.
n_samples – Number of samples used for error estimate.
binning – The redshift bin intervals.
data – The data values, one for each redshift bin.
samples – Samples of the data values, shape (# samples, # bins).
method – The resampling method used.
covariance – Covariance matrix automatically computed from the resampled values.
- binning: IntervalIndex#
The redshift bin intervals.
- property bins: Indexer[int | slice | Sequence, _Tdata]#
An Indexer attribute that supports iteration over the bins or selecting a subset of the bins.
The indexer always returns new container instances with the indexed data subset or the current item when iterating.
Warning
The indexing rules for a one-dimensional numpy array apply. However, if the resulting binning is not contiguous or contains repeated bins, some operations on the returned container may fail.
- Returns:
yaw.core.containers.Indexer
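The warning above can be illustrated without the container itself. The following is a minimal sketch that applies one-dimensional numpy indexing to a plain binning and data array and checks whether the selected binning is still contiguous. The helper `is_contiguous` is hypothetical and not part of yaw.

```python
import numpy as np
import pandas as pd

bins = pd.IntervalIndex.from_breaks([0.1, 0.2, 0.3, 0.4])
data = np.array([1.0, 2.0, 3.0])


def is_contiguous(binning):
    """Hypothetical helper: True if every bin starts where the previous one ends."""
    left = binning.left.to_numpy()
    right = binning.right.to_numpy()
    return bool(np.all(left[1:] == right[:-1]))


# a slice keeps neighbouring bins, so the binning stays contiguous
sliced_bins, sliced_data = bins[0:2], data[0:2]

# a list of indices may skip bins ...
picked_bins, picked_data = bins[[0, 2]], data[[0, 2]]

# ... or repeat them; both cases produce a non-contiguous binning
repeated_bins, repeated_data = bins[[1, 1]], data[[1, 1]]

for label, subset in [
    ("slice", sliced_bins),
    ("skip", picked_bins),
    ("repeat", repeated_bins),
]:
    print(label, is_contiguous(subset))
```

Only the slice passes the check, which is the kind of subset the warning describes as safe.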
- property closed: str#
Specifies on which side the redshift bin intervals are closed, can be: left, right, both, neither.
- concatenate_bins(*data: _Tdata) → _Tdata[source]#
Concatenate pair count data containers with equal patches.
The data is merged by appending the data along the redshift binning axis.
Note
Necessary condition for merging is that the patch numbers are identical and that the merged binning is contiguous and non-overlapping. Cannot merge cross- with autocorrelation containers.
- Parameters:
*data – Containers of same type that are appended along the redshift binning axis of this container.
- Returns:
New instance of this container with combined data.
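As a rough illustration of what merging along the redshift binning axis involves, the sketch below concatenates plain arrays instead of containers. It assumes that the samples have shape (# samples, # bins), as stated for the samples attribute, and that the two binnings must touch. All variable names are made up for the example, and the checks are an assumption about what such a merge needs rather than the library's implementation.

```python
import numpy as np
import pandas as pd

# two adjacent pieces of a redshift binning with five samples each
bins_a = pd.IntervalIndex.from_breaks([0.1, 0.2, 0.3])
bins_b = pd.IntervalIndex.from_breaks([0.3, 0.4])
data_a, data_b = np.array([1.0, 2.0]), np.array([3.0])
samples_a = np.ones((5, 2))
samples_b = np.full((5, 1), 3.0)

# the number of samples must agree, otherwise rows cannot be paired up
if samples_a.shape[0] != samples_b.shape[0]:
    raise ValueError("number of samples differs")
# the merged binning must be contiguous: the second piece starts where the first ends
if bins_a.right[-1] != bins_b.left[0]:
    raise ValueError("binnings are not contiguous")

merged_bins = pd.IntervalIndex.from_arrays(
    np.concatenate([bins_a.left.to_numpy(), bins_b.left.to_numpy()]),
    np.concatenate([bins_a.right.to_numpy(), bins_b.right.to_numpy()]),
    closed=bins_a.closed,
)
merged_data = np.concatenate([data_a, data_b])  # bin axis is axis 0
merged_samples = np.concatenate([samples_a, samples_b], axis=1)  # bin axis is axis 1
```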
- covariance: NDArray#
Covariance matrix automatically computed from the resampled values.
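How a covariance matrix can be derived from resampled values is sketched below with plain numpy. The function `covariance_from_samples` is hypothetical, and the normalisations are the textbook jackknife and bootstrap estimators, which are assumed here and may differ in detail from what the container computes.

```python
import numpy as np


def covariance_from_samples(samples, method="jackknife"):
    """Hypothetical helper: covariance from samples of shape (n_samples, n_bins)."""
    samples = np.asarray(samples, dtype=float)
    n_samples = samples.shape[0]
    if method == "jackknife":
        # (N - 1) / N times the sum of outer products of the deviations
        cov = np.cov(samples, rowvar=False, ddof=0) * (n_samples - 1)
    elif method == "bootstrap":
        # ordinary sample covariance with 1 / (N - 1) normalisation
        cov = np.cov(samples, rowvar=False, ddof=1)
    else:
        raise ValueError(f"unknown resampling method: {method!r}")
    return np.atleast_2d(cov)  # keep a matrix even for a single bin


rng = np.random.default_rng(seed=0)
samples = rng.normal(1.0, 0.1, size=(5, 2))  # five samples, two bins
cov = covariance_from_samples(samples, method="jackknife")
print(cov.shape)
```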
- data: NDArray#
The data values, one for each redshift bin.
- property dz: ndarray[Any, dtype[float64]]#
Get the width of the redshift bins as array.
- property edges: ndarray[Any, dtype[float64]]#
Get the edges of the redshift bins as flat array.
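For orientation, the bin widths and edges can be derived directly from a pandas.IntervalIndex. This sketch uses only pandas and numpy and assumes a contiguous binning; it does not reproduce how the properties are implemented in the container.

```python
import numpy as np
import pandas as pd

bins = pd.IntervalIndex.from_breaks([0.1, 0.2, 0.3])

# widths: right edge minus left edge of every bin
dz = bins.length.to_numpy()

# flat edge array: all left edges plus the right edge of the last bin
edges = np.append(bins.left.to_numpy(), bins.right.to_numpy()[-1])

# bin centers, the analogue of the mids property
mids = bins.mid.to_numpy()

print(dz, edges, mids)
```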
- property error: NDArray#
The uncertainty (standard error) of the data.
- Returns:
NDArray
- get_binning() → IntervalIndex[source]#
Get the underlying, exact redshift bin intervals.
- Returns:
pandas.IntervalIndex
- get_correlation() → DataFrame[source]#
Get value correlation matrix as data frame with its corresponding redshift bin intervals as index and column labels.
- Returns:
pandas.DataFrame
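The relation between a covariance and a correlation matrix, labelled with the bin intervals, can be sketched as follows. The covariance values are made up for illustration, and the normalisation by the outer product of the standard deviations is the standard definition, assumed to match what the method returns.

```python
import numpy as np
import pandas as pd

bins = pd.IntervalIndex.from_breaks([0.1, 0.2, 0.3])
cov = np.array([[0.04, 0.03], [0.03, 0.09]])  # made-up covariance matrix

# divide each element by the product of the two standard deviations
sigma = np.sqrt(np.diag(cov))
corr = cov / np.outer(sigma, sigma)

# label rows and columns with the redshift bin intervals
corr_frame = pd.DataFrame(corr, index=bins, columns=bins)
print(corr_frame)
```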
- get_covariance() → DataFrame[source]#
Get value covariance matrix as data frame with its corresponding redshift bin intervals as index and column labels.
- Returns:
pandas.DataFrame
- get_error() → Series[source]#
Get value error estimate (diagonal of covariance matrix) as series with its corresponding redshift bin intervals as index.
- Returns:
pandas.Series
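A minimal sketch of an error estimate taken from the covariance diagonal is given below. It assumes that the standard error is the square root of the diagonal elements and uses a made-up covariance matrix.

```python
import numpy as np
import pandas as pd

bins = pd.IntervalIndex.from_breaks([0.1, 0.2, 0.3])
cov = np.array([[0.04, 0.01], [0.01, 0.09]])  # made-up covariance matrix

# standard error per bin: square root of the variances on the diagonal
error = pd.Series(np.sqrt(np.diag(cov)), index=bins)
print(error)
```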
- get_samples() → DataFrame[source]#
Get the data as pandas.DataFrame with the binning as index. The columns are labelled numerically and each represents one of the samples.
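The described layout, with bins as rows and one numbered column per sample, corresponds to the transpose of an array of shape (# samples, # bins). The sketch below builds such a frame from plain arrays and is only an assumption about the layout, not the method's implementation.

```python
import numpy as np
import pandas as pd

bins = pd.IntervalIndex.from_breaks([0.1, 0.2, 0.3])
samples = np.arange(10.0).reshape(5, 2)  # five samples, two bins

# transpose so that each row is a bin and each column one sample
frame = pd.DataFrame(samples.T, index=bins)
print(frame)
```

Selecting a column, for example `frame[0]`, then yields the values of one sample across all bins.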
- is_compatible(other: SampledData, require: bool = False) → bool[source]#
Check whether this instance is compatible with another instance.
Ensures that both objects are instances of the same class, that the redshift binning is identical, that the number of samples agrees, and that the resampling method is identical.
- Parameters:
other (BinnedQuantity) – Object instance to compare to.
require (bool, optional) – Raise a ValueError if any of the checks fail.
- Returns:
bool
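The listed checks can be sketched with a small stand-in class. `ToySampled` and the free function `is_compatible` below are hypothetical and only mirror the four checks described above; they are not the library's code.

```python
from dataclasses import dataclass

import numpy as np
import pandas as pd


@dataclass
class ToySampled:
    """Hypothetical stand-in holding only what the checks need."""

    binning: pd.IntervalIndex
    samples: np.ndarray  # shape (n_samples, n_bins)
    method: str


def is_compatible(a, b, require=False):
    """Return True if all checks pass; optionally raise on the first failure."""
    if type(a) is not type(b):
        reason = "different classes"
    elif not a.binning.equals(b.binning):
        reason = "different redshift binning"
    elif a.samples.shape[0] != b.samples.shape[0]:
        reason = "different number of samples"
    elif a.method != b.method:
        reason = "different resampling method"
    else:
        return True
    if require:
        raise ValueError(f"incompatible containers: {reason}")
    return False


bins = pd.IntervalIndex.from_breaks([0.1, 0.2, 0.3])
first = ToySampled(bins, np.ones((5, 2)), "jackknife")
second = ToySampled(bins, np.zeros((5, 2)), "jackknife")
other_method = ToySampled(bins, np.zeros((5, 2)), "bootstrap")

print(is_compatible(first, second))
print(is_compatible(first, other_method))
```

With `require=True` the second comparison raises a ValueError instead of returning False, which is convenient before arithmetic on two containers.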
- method: str#
The resampling method used.
- property mids: ndarray[Any, dtype[float64]]#
Get the centers of the redshift bins as array.
- property n_bins: int#
Get the number of redshift bins.
- property n_samples: int#
Number of samples used for error estimate.
- samples: NDArray#
Samples of the data values, shape (# samples, # bins).