# Copyright French Alternative Energies and Atomic Energy Commission
# Contributors: valjean developers
# valjean-support@cea.fr
#
# This software is a computer program whose purpose is to analyze and
# post-process numerical simulation results.
#
# This software is governed by the CeCILL license under French law and abiding
# by the rules of distribution of free software. You can use, modify and/ or
# redistribute the software under the terms of the CeCILL license as circulated
# by CEA, CNRS and INRIA at the following URL: http://www.cecill.info.
#
# As a counterpart to the access to the source code and rights to copy, modify
# and redistribute granted by the license, users are provided only with a
# limited warranty and the software's author, the holder of the economic
# rights, and the successive licensors have only limited liability.
#
# In this respect, the user's attention is drawn to the risks associated with
# loading, using, modifying and/or developing or reproducing the software by
# the user in light of its specific status of free software, that may mean that
# it is complicated to manipulate, and that also therefore means that it is
# reserved for developers and experienced professionals having in-depth
# computer knowledge. Users are therefore encouraged to load and test the
# software's suitability as regards their requirements in conditions enabling
# the security of their systems and/or data to be ensured and, more generally,
# to use and operate it in the same conditions as regards security.
#
# The fact that you are presently reading this means that you have had
# knowledge of the CeCILL license and that you accept its terms.
r'''
In our case we'll use the modified χ²-test widely used in physics for comparing
histograms. This test is also available in the **ROOT**
`software <https://root.cern.ch/>`_. For further explanations a good desciption
is provided by the documentation of ROOT's ``Chi2Test()`` method on
`TH1 page <https://root.cern.ch/doc/master/classTH1.html>`_ or in this
`article <https://arxiv.org/abs/physics/0605123>`_ by N. Gagunashvili.
The used formula is then:
.. math::
\chi^2_{obs} = \sum_{i=1}^p\frac{(x_{1i} - x_{2i})^2}
{\sigma_{1i}^2 + \sigma_{2i}^2}
where datasets (or distributions) 1 and 2 are compared, :math:`x_i` being the
values and :math:`\sigma_i` being the error in each bin.
In our case only the 'weighted' case is implemented.
Statistical interpretation
``````````````````````````
The null hypothesis for the χ²-test is:
* equal means in each bin
* gaussian means in each bin (so normal distribution in each bin)
* bins are independent
The interpretation of the test is based on the χ² value and on the p-value.
Usually it is expected that the χ²/ndf should be close to 1 (ndf = number of
degrees of freedom), see `PDG, statistics chapter
<https://pdg.lbl.gov/2018/reviews/rpp2018-rev-statistics.pdf>`_. The p-value
can also be calculated from the probability density function of the χ²
distribution, which is given by:
.. math::
pdf(x,k) = \frac{1}{2^{k/2}\Gamma(k/2)}x^{k/2-1}e^{-x/2}
for :math:`x > 0`, :math:`k` the number of degrees of freedom and
:math:`\Gamma` is Euler's Gamma function [#GammaFunc]_.
The cumulative distribution function, :math:`cdf`, used to calculated the
p-value is then:
.. math::
cdf(x,k) = \frac{\gamma(k/2,x/2)}{\Gamma(k/2)}
= \int_{-\infty}^{x}pdf(t,k)dt
where :math:`\gamma` is the lower incomplete gamma function
[#partialGammaFunc]_.
It is equivalent to calculate the probability :math:`P(x < \chi^2_{obs})` which
is equal to :math:`1 - cdf(x,k)`, so the integral from :math:`x` to
:math:`+\infty`. This probability is the p-value.
If the p-value is lower than the define level (1%, 5% for example), the null
hypothesis is rejected.
Possible biases can come from wrong estimation of uncertainties
(underestimated for example or only statistical while a systematic one
should be taken into account).
Code implementation
```````````````````
**SciPy** is used to calculate the p-value. The distribution
:obj:`scipy.stats.chi2` is imported as `schi2`` in order to name conflicts. The
p-value is obtained with the ``sf`` method. It is directly compared with the
significance level required by the user.
.. warning::
If some of your bins are not filled, if they are initialized to zero or if
their error is zero, the χ² calculation will be wrong and probably give
:math:`\infty`.
One option is proposed to remove these bins from the calculation,
``ignore_empty``. The use of this option implies a reduction of the number
of degrees of freedom by the number of removed bins (ndf = # bins). It has
to be considered with caution, this does not make necessarly sense,
depending on the user case.
Another option is to apply some rebinning in order to have no empty bins
(what usually means re-running to get consistent uncertainties).
Examples of the test
````````````````````
Let's consider a spectrum of 5 bins with their error and apply the χ²-test.
>>> from valjean.eponine.dataset import Dataset
>>> from valjean.gavroche.stat_tests.chi2 import TestChi2
>>> import numpy as np
>>> ds1 = Dataset(np.array([5.2, 5.3, 5.25, 5.4, 5.5]),
... np.array([0.2, 0.25, 0.1, 0.2, 0.3]))
>>> ds2 = Dataset(np.array([5.1, 5.6, 5.2, 5.3, 5.2]),
... np.array([0.1, 0.3, 0.05, 0.4, 0.3]))
>>> tchi2 = TestChi2(ds1, ds2, alpha=0.05, name="comp",
... description="Comparison using Chi2 test")
>>> tchi2_res = tchi2.evaluate()
>>> print(f'{tchi2_res.chi2_per_ndf[0]:.7f}')
0.3080328
>>> bool(tchi2_res)
True
>>> print(f'{tchi2_res.pvalue[0]:.7f}')
0.9083889
Test with empty bins
^^^^^^^^^^^^^^^^^^^^
One "empty bin" at the same position in the 2 compared datasets prevents the
test to work. An empty bin is defined as a bin with zero error. In Monte
Carlo codes a zero error can happen for example when only one batch has been
run, no variance can be calculated in that case (``n-1`` usually used with
``n`` the number of batches).
>>> ds3 = Dataset(np.array([5.2, 5.3, 5.25, 5.4, 5.5]),
... np.array([0.2, 0.25, 0., 0.2, 0.3]))
>>> ds4 = Dataset(np.array([5.1, 5.6, 5.2, 5.3, 5.2]),
... np.array([0.1, 0.3, 0., 0.4, 0.3]))
>>> tchi2 = TestChi2(ds3, ds4, alpha=0.05, name="comp",
... description="Comparison using Chi2 test")
>>> tchi2_res = tchi2.evaluate()
>>> tchi2_res.chi2
[inf]
>>> bool(tchi2_res)
False
>>> print(np.array2string(tchi2_res.test.nonzero_bins[0]))
[ True True True True True]
>>> tchi2_res.test.ndf[0]
5
>>> np.count_nonzero(ds4.error)
4
If the values are also at zero, we get a ``nan`` instead of an ``inf``.
To get a χ² evaluation based on the non-empty bins, use the ``ignore_empty``
argument of the :class:`TestChi2`. It recalculates the number of degrees of
freedom removing the empty bins and only use the non-zero ones in the χ²
calculation.
>>> tchi2 = TestChi2(ds3, ds4, alpha=0.05, ignore_empty=True,
... name="comp", description="Comparison using Chi2 test")
>>> tchi2_res = tchi2.evaluate()
>>> print(f'{tchi2_res.chi2[0]:.7f}')
1.3401639
>>> tchi2_res.test.ndf[0]
4
>>> tchi2_res.test.dsref.size
5
>>> print(np.array2string(tchi2_res.test.nonzero_bins[0]))
[ True True False True True]
>>> print(f'{tchi2_res.chi2_per_ndf[0]:.7f}')
0.3350410
>>> bool(tchi2_res)
True
Test with multiple dimensions datasets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> ds5 = Dataset(np.array([[5.2, 5.3, 5.25], [5.4, 5.5, 5.2]]),
... np.array([[0.2, 0.25, 0.1], [0.2, 0.3, 0.1]]))
>>> ds6 = Dataset(np.array([[5.1, 5.6, 5.2], [5.3, 5.2, 5.3]]),
... np.array([[0.1, 0.3, 0.05], [0.4, 0.3, 0.2]]))
>>> ds5.shape
(2, 3)
>>> ds5.size
6
>>> tchi2 = TestChi2(ds5, ds6, alpha=0.05, name="comp",
... description="Comparison using Chi2 test")
>>> tchi2_res = tchi2.evaluate()
>>> print(f'{tchi2_res.chi2_per_ndf[0]:.7f}')
0.2900273
>>> bool(tchi2_res)
True
>>> print(f'{tchi2_res.pvalue[0]:.7f}')
0.9419786
And when we have at least one empty bin, using the ``ignore_empty`` argument:
>>> ds7 = Dataset(np.array([[5.2, 5.3, 5.25], [5.4, 5.5, 5.2]]),
... np.array([[0.2, 0.25, 0.], [0.2, 0.3, 0.1]]))
>>> ds8 = Dataset(np.array([[5.1, 5.6, 5.2], [5.3, 5.2, 5.4]]),
... np.array([[0.1, 0.3, 0.], [0.4, 0.3, 0.]]))
>>> ds7.shape
(2, 3)
>>> ds7.size
6
>>> tchi2 = TestChi2(ds7, ds8, alpha=0.05, ignore_empty=True,
... name="comp", description="Comparison using Chi2 test")
>>> tchi2_res = tchi2.evaluate()
>>> print(f'{tchi2_res.chi2[0]:.7f}')
5.3401639
>>> tchi2_res.test.ndf[0]
5
>>> tchi2_res.test.dsref.size
6
>>> print(np.array2string(tchi2_res.test.nonzero_bins[0]))
[[ True True False]
[ True True True]]
>>> print(f'{tchi2_res.chi2_per_ndf[0]:.7f}')
1.0680328
>>> bool(tchi2_res)
True
.. note::
A RuntimeWarning is emitted if zero bins are used during calculation.
.. rubric:: Footnotes
.. [#GammaFunc] The Γ function is given by
.. math::
\Gamma(s) = \int_0^\infty t^{s-1}e^{-t}dt = (s-1)!
It also has an incomplete form (not starting at 0):
.. math::
\Gamma(s, x) = \int_x^\infty t^{s-1}e^{-t}dt
.. [#partialGammaFunc] The γ function is given by:
.. math::
\gamma(s, x) = \int_0^x t^{s-1}e^{-t}dt
'''
import numpy as np
from scipy.stats import chi2 as schi2
from ..test import TestDataset, TestResult
[docs]
class TestResultChi2(TestResult):
'''Result from the χ²-test.'''
[docs]
def __init__(self, test, chi2, pvalue):
'''Construct a TestResultChi2 object.
Members are lists which length corresponds to the number of datasets
compared to the reference dataset.
:param test: the χ² test object
:type test: :class:`~.TestChi2`
:param chi2: the value of the χ² statistic
:type chi2: :class:`list` (:obj:`numpy.generic` (:obj:`float`))
:param pvalue: the test p-value
:type pvalue: :class:`list` (:obj:`numpy.generic` (:obj:`float`))
'''
super().__init__(test)
self.chi2 = chi2
self.pvalue = pvalue
[docs]
def oracles(self):
'''Final test for the list of compared datasets.
:returns: :obj:`numpy.ndarray`
'''
return np.greater(self.pvalue, self.test.alpha)
def __bool__(self):
return all(self.oracles())
@property
def chi2_per_ndf(self):
'''Calculate the χ² per number of degrees of freedom.
:returns: χ²/ndf
:rtype: float
No parenthesis needed: this is a property.
'''
return [c2 / df for c2, df in zip(self.chi2, self.test.ndf)]
[docs]
class TestChi2(TestDataset):
'''Test class for χ², inheritate from :class:`~valjean.gavroche.test.Test`.
'''
[docs]
def __init__(self, dsref, *datasets, name, description='', labels=None,
alpha=0.01, ignore_empty=False):
# pylint: disable=too-many-arguments
'''Initialisation of :class:`TestChi2`
:param str name: local name of the test
:param str description: specific description of the test
:param dict labels: labels to be used for test classification in
reports (for example category, input file name,
type of result, ...)
:param dsref: reference dataset
:type dsref: :class:`~valjean.eponine.dataset.Dataset`
:param datasets: list of datasets to be compared to reference dataset
:type datasets: :class:`list` (:class:`~.dataset.Dataset`)
:param float alpha: probability to accept of not the test (p-value is
expected greater), i.e. significance level
:param bool ignore_empty: if the array contains zero bins, the test can
done not consider them if this option is
True; otherwise it will probably fail as the
array to sum will contain infinite terms.
Default is ``False``.
'''
super().__init__(dsref, *datasets,
name=name, description=description, labels=labels)
self.alpha = alpha
self.ignore_empty = ignore_empty
#: nonzero bins identification by True, False if zero
#: (:class:`list` (:obj:`numpy.ndarray` (:obj:`bool`)))
self.nonzero_bins = self._nonzero_bins()
#: number of degrees of freedom (:class:`list` (:obj:`int`))
self.ndf = [np.count_nonzero(nzb) for nzb in self.nonzero_bins]
def _nonzero_bins(self):
'''Identify nonzero bins.
A nonzero bin is considered as a bin where at least one of the datasets
has a nonzero error. Consequently if both bins have zero error this is
a zero bin. This identification is stored in ``nonzero_bins``: ``True``
for nonzero, else ``False``. This :obj:`numpy.ndarray` is used in the
χ²-test calculation.
:returns: :obj:`numpy.ndarray` (:obj:`bool`)
'''
if self.ignore_empty:
return [np.logical_or(self.dsref.error > 0, _ds.error > 0)
for _ds in self.datasets]
return [np.full_like(self.dsref.value, True, dtype=bool)
for _ in range(len(self.datasets))]
[docs]
@staticmethod
def pvalue(chi2, ndf):
'''Calculation of the p-value of the test.
:param chi2: observed χ²
:type chi2: :obj:`numpy.generic` (:obj:`float`)
:param ndf: number of degrees of freedom
:type ndf: int or :obj:`numpy.generic` (:obj:`int`)
:returns: :obj:`numpy.generic` (:obj:`float`)
'''
return schi2.sf(chi2, ndf)
[docs]
@staticmethod
def chi2_test(ds1, ds2, nonzero_bins=None):
r'''Compute the χ² value for the given datasets.
:param ds1: first dataset
:type ds1: :class:`~valjean.eponine.dataset.Dataset`
:param ds2: second dataset
:type ds2: :class:`~valjean.eponine.dataset.Dataset`
:param nonzero_bins: optional argument, booleans array of the same size
as ``ds1`` and ``ds2`` to identify zero bins and
possibly avoid them (see below)
:type nonzero_bins: :obj:`numpy.ndarray` (:obj:`bool`) or ``None``
(default)
:returns: χ² value
:rtype: :obj:`numpy.generic` (:obj:`float`)
It has to be noted that if in a bin the sum of the variances of the
test and the reference dataset is zero (so both datasets bin has a zero
error) the bin cannot be taken into account in the calculation of the
sum as the ratio will be infinite. This is why they can be removed via
the ``nonzero_bins`` argument at that step. This has another
consequence: the number of degrees of freedom is consequently reduced
by the number of bins removed.
'''
diff = ds1 - ds2
if nonzero_bins is None:
return np.sum(((diff.value / diff.error)**2))
return np.sum(((diff.value / diff.error)**2)[nonzero_bins])
[docs]
def evaluate(self):
'''Evaluate χ² test method.
:rtype: :class:`~.TestResultChi2`
'''
chi2 = []
for _ds, _nzb in zip(self.datasets, self.nonzero_bins):
chi2.append(self.chi2_test(self.dsref, _ds, _nzb))
pvalue = [self.pvalue(_chi2, _ndf)
for _chi2, _ndf in zip(chi2, self.ndf)]
return TestResultChi2(self, chi2, pvalue)
[docs]
def data(self):
'''Generator yielding objects supporting the buffer protocol that (as a
whole) represent a serialized version of `self`.'''
yield from super().data()
yield self.__class__.__name__.encode('utf-8')
yield float(self.alpha).hex().encode('utf-8')
n_bytes = (self.ignore_empty.bit_length() - 1) // 8 + 1
yield self.ignore_empty.to_bytes(n_bytes, 'little')