Welcome to the maxsmooth
documentation!¶
maxsmooth: Derivative Constrained Function Fitting¶
Introduction¶
maxsmooth: | Derivative Constrained Function Fitting |
---|---|
Author: | Harry Thomas Jones Bevins |
Version: | 1.2.1 |
Homepage: | https://github.com/htjb/maxsmooth |
Documentation: | https://maxsmooth.readthedocs.io/ |
Installation¶
In the following two sections we highlight the purpose of maxsmooth
and
show an example. To install the software follow these instructions:
The software can be pip installed from the PYPI repository like so,
pip install maxsmooth
or alternatively it can be installed from the git repository via,
git clone https://github.com/htjb/maxsmooth.git
cd maxsmooth
python setup.py install --user
Derivative Constrained Functions and maxsmooth
¶
maxsmooth
is an open source software, written in Python (supporting version 3 upwards),
for fitting derivative constrained
functions (DCFs) such as Maximally Smooth Functions
(MSFs) to data sets. MSFs are functions for which there are no zero
crossings in derivatives of order m >= 2 within the domain of interest.
More generally for DCFs the minimum
constrained derivative order, m can take on any value or a set of
specific high order derivatives can be constrained.
They are designed to prevent the loss of
signals when fitting out dominant smooth foregrounds or large magnitude signals that
mask signals of interest. Here “smooth” means that the foregrounds follow power
law structures in the band of interest.
In some cases DCFs can be used to
highlight systematics in the data.
maxsmooth
uses quadratic programming implemented with CVXOPT
to fit
data subject to a fixed linear constraint, Ga <= 0, where the product
Ga is a matrix of derivatives.
The constraint on an MSF are not explicitly
linear and each constrained derivative can be positive or negative.
maxsmooth
is, however, designed to test the <= 0 constraint multiplied
by a positive or negative sign. Where a positive sign in front of the mth
order derivative forces the derivative
to be negative for all x. For an Nth order polynomial maxsmooth
can test
every available sign combination but by default it implements a sign navigating algorithm.
This is detailed in the maxsmooth
paper (see citation), is summarized
below and in the software documentation.
The available sign combinations act as discrete parameter spaces all with
global minima and maxsmooth
is capable of finding the minimum of these global
minima by implementing a cascading algorithm which is followed by a directional
exploration. The cascading routine typically finds an approximate to the global
minimum and then the directional exploration is a complete search
of the sign combinations in the neighbourhood
of that minimum. The searched region is limited by factors
that encapsulate enough of the neighbourhood to confidently return the global minimum.
The sign navigating method is reliant on the problem being “well defined” but this
is not always the case and it is in these instances it is possible to run the code testing
every available sign combination on the constrained derivatives. For a definition of
a “well defined” problem and it’s counter part see the maxsmooth
paper and the
documentation.
maxsmooth
features a built in library of DCFs or
allows the user to define their own. The addition of possible inflection points
and zero crossings in higher order derivatives is also available to the user.
The software has been designed with these two
applications in mind and is a simple interface.
Example Fit¶
Shown below is an example MSF fit performed with maxsmooth
to data that
follows a y = x-2.5 power law with a randomly generated Gaussian
noise with a standard deviation 0.02. The top panel shows the data and the
bottom panel shows the residual
after subtraction of the MSF fit alongside the actual noise in the data.
The software using the default built-in DCF model is shown to be
capable of recovering the random noise.
Further examples can be found in the Documentation (https://maxsmooth.readthedocs.io/) and in the github repository in the files ‘example_codes/’ and ‘example_notebooks/’ (notebooks can also be accessed online here).
Licence and Citation¶
The software is free to use on the MIT open source license. However if you use
the software for academic purposes we request that you cite the maxsmooth
papers. They are detailed below.
MNRAS paper (referred to throughout the documentation as the maxsmooth
paper),
H. T. J. Bevins et al., maxsmooth: Rapid maximally smooth function fitting with applications in Global 21-cm cosmology, Monthly Notices of the Royal Astronomical Society, 2021;, stab152, https://doi.org/10.1093/mnras/stab152
Below is the BibTex citation,
@article{10.1093/mnras/stab152,
author = {Bevins, H T J and Handley, W J and Fialkov, A and Acedo, E de Lera and Greenhill, L J and Price, D C},
title = "{maxsmooth: rapid maximally smooth function fitting with applications in Global 21-cm cosmology}",
journal = {Monthly Notices of the Royal Astronomical Society},
year = {2021},
month = {01},
issn = {0035-8711},
doi = {10.1093/mnras/stab152},
url = {https://doi.org/10.1093/mnras/stab152},
note = {stab152},
eprint = {https://academic.oup.com/mnras/advance-article-pdf/doi/10.1093/mnras/stab152/35931358/stab152.pdf},
}
JOSS paper,
Bevins, H. T., (2020). maxsmooth: Derivative Constrained Function Fitting. Journal of Open Source Software, 5(54), 2596, https://doi.org/10.21105/joss.02596
and the BibTex,
@article{Bevins2020,
doi = {10.21105/joss.02596},
url = {https://doi.org/10.21105/joss.02596},
year = {2020},
publisher = {The Open Journal},
volume = {5},
number = {54},
pages = {2596},
author = {Harry T. j. Bevins},
title = {maxsmooth: Derivative Constrained Function Fitting},
journal = {Journal of Open Source Software}
}
Contributing¶
Contributions to maxsmooth
are welcome and can be made via:
- Opening an issue to purpose new features/report bugs.
- Making a pull request. Please consider opening an issue to discuss any proposals beforehand and ensure that your PR will be accepted.
An example contribution may be the addition of a basis function into the standard library.
Documentation¶
The documentation is available at: https://maxsmooth.readthedocs.io/
Alternatively, it can be compiled locally from the git repository and requires sphinx to be installed. You can do this via:
cd docs/
make SOURCEDIR=source html
or
cd docs/
make SOURCEDIR=source latexpdf
The resultant docs can be found in the docs/_build/html/ and docs/_build/latex/ respectively.
Requirements¶
To run the code you will need the following additional packages:
When installing via pip or from source using the setup.py file the above packages will also be installed if absent.
To compile the documentation locally you will need:
To run the test suit you will need:
Basin-hopping/Nelder-Mead Code¶
In the maxsmooth
MNRAS paper and JOSS paper we provide a comparison of
maxsmooth
to a Basin-hopping/Nelder-Mead approach for fitting DCFs. For
completeness we provide in this repo the code used to make this comparison
in the file ‘Basin-hopping_Nelder_Mead/’.
The code times_chis.py is used to call maxsmooth
and the Basin-hopping
methods (in the file ‘BHNM/’). It will plot the recorded times and objective
function evaluations.
The Basin-hopping/Nelder-Mead code is designed to fit MSFs and is not generalised to all types of DCF. It is also not documented, however there are minor comments in the script and it should be self explanatory. Questions on this are welcome and can be posted as an issue or by contacting the author.
maxsmooth
Theory and Algorithm¶
This section has been adapted from section 4 of the maxsmooth
paper
in order to explain how the algorithm works. What follows is a discussion of
the fitting problem and the
maxsmooth
algorithm. To state concisely the problem being fitted we have
where \({\mathbf{s}}\) are the maxsmooth
signs corresponding to the
signs on the derivatives. \({\mathbf{G}}\) is a matrix of prefactors on the derivatives,
\({\mathbf{a}}\) are the parameters we are optimising for and their
product gives the derivatives we are constraining with each fit.
\({\mathbf{Q}}\) is the dot product of the matrix of basis functions and
its transpose and \(\mathbf{q}\) is the negative of the transposed data,
\(\mathbf{y}\) dotted with the basis functions. For more details on this
equation see the maxsmooth
paper.
A `problem’ in this context is the combination of the data, order, basis
function and constraints on the DCF.
With maxsmooth
we can test all possible sign combinations on the constrained derivatives.
This is a
reliable method and, provided the problem can be solved with quadratic programming,
will always give the correct global minimum. When the problem we are interested
in is “well defined”, we can develop a quicker algorithm that searches or navigates
through the discrete maxsmooth
sign spaces to find the global minimum.
Each sign space is a discrete parameter space with its own global minimum.
Using quadratic programming on a fit with a specific sign combination will
find this global minimum, and we are interested in finding the minimum
of these global minima.
A “well defined” problem is one in which the discrete sign spaces have large variance in their minimum \({\chi^2}\) values and the sign space for the global minimum is easily identifiable. In contrast we can have an “ill defined” problem in which the variance in minimum \({\chi^2}\) across all sign combinations is small. This concept of “well defined” and “ill defined” problems is explored further in the following two sections.
Well Defined Problems and Discrete Sign Space Searches¶
The \({\chi^2}\) Distribution¶
We investigate the distribution of \({\chi^2}\) values, shown in the figure below, for a 10 \({^{th}}\) order y-log(x) space MSF fit to a \({y = x^{-2.5}}\) power law plus gaussian noise.
In the figure, a combination of all positive derivatives~(negative signs) and
all negative derivatives~(positive signs) corresponds to sign combination numbers
255 and 0 respectively. Specifically, the maxsmooth
signs, \({\mathbf{s}}\),
are related to the sign combination number by its \({C}\) bit binary representation,
here \({C = (N -2)}\). In binary the sign combination numbers run from
00000000 to 11111111. Each bit represents the sign on the \({m^{th}}\)
order derivative with a 1 representing a negative maxsmooth
sign.
The distribution appears to be composed of smooth steps or shelves; however,
when each shelf is studied closer, we find a series of peaks and troughs. This can
be seen in the subplot of the above figure which shows the distribution in the
neighbourhood of the global minimum found in the large or global’ well. This type
of distribution with a large variance in :math:`{chi^2} is characteristic of a “well defined”
problem. We use this example \({\chi^2}\) distribution to motivate the maxsmooth
algorithm outlined in the following section.
Ill Defined Problems and their Identification¶
We can illustrate an “ill defined” problem, with a small variation in
\({\chi^2}\) across the maxsmooth
sign spaces, by adding a non-smooth signal
of interest into the foreground model, \({x^{-2.5}}\) and fitting this with
a 10 \({^{th}}\) order log(y)-log(x) space MSF. We add an additional noise of
\({0.020}\) to the mock data. The resultant \({\chi^2}\) distribution with its
global minimum is shown in the top panel of the figure below.
The global minimum, shown as a black data point, cannot be found using the
maxsmooth
algorithm. The cascading algorithm may terminate in any of the
approximately equal minima and the directional exploration will then quickly
terminate because of the limits imposed.
If we repeat the above fit and perform it with a y-x space MSF we find that the
problem is well defined with a larger \({\chi^2}\) variation across sign
combinations. This is shown in the bottom panel of the above figure. The results,
when using the log(y)-log(x) space MSF, are significantly better than when using
y-x space MSF meaning it is important to be able to solve “ill defined” problems.
This can be done by testing all maxsmooth
signs but knowing when this is
necessary is important if you are expecting to run multiple DCF fits to the
same data set. We can focus on diagnosing whether a DCF fit to the data is
“ill defined” because a joint fit to the same data set of a DCF and signal
of interest will also feature an “ill defined” \({\chi^2}\) distribution.
We can identify an “ill defined” problem by producing the equivalent of
the above figure using maxsmooth
and visually assessing the \({\chi^2}\)
distribution for a DCF fit. Alternatively, we can use the parameter space plots,
detailed in the maxsmooth
paper and later in this documentation,
to identify whether the constraints are weak or not, and if a local minima is
returned from the sign navigating routine then the minimum in these plots
will appear off centre.
Assessment of the first derivative of the data can also help to identify an “ill defined” problem. For the example problem this is shown in the figure below where the derivatives have been approximated using \({\Delta y/ \Delta x}\). Higher order derivatives of the data will have similarly complex or simplistic structures in the respective spaces. There are many combinations of parameters that will provide smooth fits with similar \({\chi^2}\) values in logarithmic space leading to the presence of local minima. This issue will also be present in any data set where the noise or signal of interest are of a similar magnitude to the foreground in y - x space.
maxsmooth
Example Codes¶
This section is designed to introduce the user to the software and the form in which it is run. It provides basic examples of data fitting with a built in MSF model and a user defined model.
There are also examples of functions that can be used pre-fitting and post-fitting for various purposes including; determination of the best DCF model from the built in library for the problem being fitted, analysis of the \({\chi^2}\) distribution as a function of the discrete sign spaces and analysis of the parameter space surrounding the optimum results.
The data used for all of this examples is available here.
The example codes can be found here and corresponding Jupyter Notebooks are provided here.
Simple Example code¶
In order to run the maxsmooth
software using the built
in DCF models for a simple fit the user can follow the simple structure detailed here.
An important point to make is that by default maxsmooth
fits a
Maximally Smooth Function or MSF to the data. An MSF, as stated in
the introduction to the documentation, is a function which has
derivatives of order \({m \geq 2}\) constrained so that they do not cross
0. This means that they do not have inflection points or non smooth
structure produced by higher order derivatives. More generally a DCF
follows the constraint,
for every constrained order \({m}\). The set of \({m}\) can be any set of derivative orders as long as those derivatives exist for the function.
This means we can use maxsmooth
to produce different DCF
models. MSFs are one of two special cases of DCF and we can also
have a Completely Smooth Function (CSF) with orders \({m \geq 1}\)
constrained. Alternatively we can have Partially Smooth Functions
(PSF) which are much more general and can have arbitrary sets of
derivatives constrained. We illustrate how this is implemented
towards the end of this example but we begin with the default case
fitting a MSF.
The user should begin by importing the smooth class from maxsmooth.DCF.
from maxsmooth.DCF import smooth
The user should then import the data they wish to fit.
import numpy as np
x = np.load('Data/x.npy')
y = np.load('Data/y.npy') + np.random.normal(0, 0.02, len(x))
and define the polynomial orders they wish to fit.
N = [3, 4, 5, 6, 7, 8, 9, 10, 11]
for i in range(len(N)):
`act on N[i]`
or for example,
N = 15
We can also plot the data to illustrate what is happening. Here the data is a scaled \({x^{-2.5}}\) power law and I have added gaussian noise in with a standard deviation of 0.02.
import matplotlib.pyplot as plt
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.show()
smooth can be called as is shown below. It takes the x and y data as standard inputs as well as the order of the fit. There are a set of keyword arguments also available that change the type of function being fitted and these are detailed in the documentation.
result = smooth(x, y, N)
and it’s resulting attributes can be accessed by writing
result.attribute_name
. For example printing the outputs is done like
so,
print('Objective Funtion Evaluations:\n', result.optimum_chi)
print('RMS:\n', result.rms)
print('Parameters:\n', result.optimum_params)
print('Fitted y:\n', result.y_fit)
print('Sign Combinations:\n', result.optimum_signs)
print('Derivatives:\n', result.derivatives)
plt.plot(x, y - result.y_fit)
plt.xlabel('x', fontsize=12)
plt.ylabel(r'$\delta y$', fontsize=12)
plt.tight_layout()
plt.show()
To fit the data with a CSF we can use the ‘constraints’ keyword argument in smooth(). ‘constraints’ sets the minimum constrained derivative for the function which for a CSF we want to be one.
res = smooth(x, y, N, constraints=1)
Note in the printed results the number of constrained derivatives has increased by 1 and the only derivative that is allowed to cross through 0 (Zero Crossings Used?) is the the \({0^{th}}\) order i.e. the data.
plt.plot(x, y - res.y_fit)
plt.xlabel('x', fontsize=12)
plt.ylabel(r'$\delta y$', fontsize=12)
plt.tight_layout()
plt.show()
A Partially Smooth Function can have derivatives constrained via \({m \geq a}\) where \({a}\) is any order above 2 or it can have a set of derivatives that are allowed to cross zero. For the first case we can once again use the ‘constraints’ keyword argument. For example we can constrain derivatives with orders \({\geq 3}\) which will allow the \({1^{st}}\) and \({2^{nd}}\) order derivatives to cross zero. This is useful when our data features an inflection point we want to model with our fit.
res = smooth(x, y, N, constraints=3)
plt.plot(x, y - res.y_fit)
plt.xlabel('x', fontsize=12)
plt.ylabel(r'$\delta y$', fontsize=12)
plt.tight_layout()
plt.show()
To allow a particular set of derivatives to cross zero we use the ‘zero_crossings’ keyword. In the example below we are lifting the constraints on the \({3^{rd}}\), \({4^{th}}\) and \({5^{th}}\) order derivatives but our minimum constrained derivative is still set at the default 2. Therefore this PSF has derivatives of order \({m = [2, 6, 7, 8, 9]}\) constrained via the condition at the begining of this example code.
res = smooth(x, y, N, zero_crossings=[3, 4, 5])
plt.plot(x, y - res.y_fit)
plt.xlabel('x', fontsize=12)
plt.ylabel(r'$\delta y$', fontsize=12)
plt.tight_layout()
plt.show()
While PSFs can seem like an attractive way to improve the quality of fit they are less ‘smooth’ than a MSF or CSF and consequently they can introduce additional turning points in to your residuals obscuring any signals of intrest.
Turning Points and Inflection Points¶
This example will walk the user through implementing DCF fits to data sets with
turning points and inflection points. It builds on the details in the
‘Simple Example Code’ and uses the ‘constraints’ keyword argument introduced
there. The ‘constraints’ keyword argument is used to adjust the type of DCF that
is being fitted. Recall that by default maxsmooth
implements a Maximally
Smooth Function or MSF with constraints=2 i.e. derivatives of order \({m \geq 2}\)
constrained so that they do not cross zero. This allows for turning points in the
DCF as illustrated below.
We start by generating some noisy data that we know will include a turning point and defining the order of the DCF we would like to fit.
import numpy as np
x = np.linspace(-10, 10, 100)
noise = np.random.normal(0, 0.02, 100)
y = x**(2) + noise
N = 10
We can go ahead and plot this data just to double check it features a turning point.
import matplotlib.pyplot as plt
plt.plot(x, y)
plt.xlabel('x', fontsize=12)
plt.ylabel('y', fontsize=12)
plt.show()
As already stated maxsmooth
does not constrain the first derivative of the
DCF by default so we can go ahead and fit the data.
from maxsmooth.DCF import smooth
res = smooth(x, y, N)
If we than plot the resultant residuals we will see that despite the data having a turning point present we have recovered the Gaussian noise.
plt.plot(x, y- res.y_fit, label='Recovered Noise')
plt.plot(x, noise, label='Actual Noise')
plt.ylabel(r'$\delta y$', fontsize=12)
plt.xlabel('x', fontsize=12)
plt.legend()
plt.show()
To illustrate what happens when there is an inflection point in the data we can define some sinusoidal data as so.
x = np.linspace(1, 5, 100)
noise = np.random.normal(0, 0.02, 100)
y = np.sin(x) + noise
N = 10
plt.plot(x, y)
plt.xlabel('x', fontsize=12)
plt.ylabel('y', fontsize=12)
plt.show()
If we proceed to fit this with smooth() in its default settings we will get a poor fit as by default the second derivative is constrained. We need to lift this constraint to allow for the prominent inflection point to be modelled. We do this by setting the keyword argument constraints=3 creating a Partially Smooth Function or PSF.
res_msf = smooth(x, y, N)
res_psf = smooth(x, y, N, constraints=3)
plt.plot(x, y, label='Data')
plt.plot(x, res_msf.y_fit, label=r'MSF fit, $m \geq 2$')
plt.plot(x, res_psf.y_fit, label=r'PSF fit, $m \geq 3$')
plt.xlabel('x', fontsize=12)
plt.ylabel('y', fontsize=12)
plt.legend()
plt.show()
Finally, we can plot the residuals to further see that by lifting the constraint on the second derivative we have allowed an inflection point in the data.
plt.plot(x, y- res_psf.y_fit, label='Recovered Noise')
plt.plot(x, noise, label='Actual Noise')
plt.ylabel(r'$\delta y$', fontsize=12)
plt.xlabel('x', fontsize=12)
plt.legend()
plt.show()
New Basis Example¶
This example code illustrates how to define your own basis function for the DCF model. It implements a modified version of the built in normalized polynomial model but the structure is the same for more elaborate models.
As always we need to import the data, define an order \({N}\) and import the function fitting routine, smooth().
import numpy as np
from maxsmooth.DCF import smooth
x = np.load('Data/x.npy')
y = np.load('Data/y.npy')
N=10
There are several requirements needed to define a new basis function completely
for maxsmooth
to be able to fit it. They are as summarized below and then
examples of each are given in more detail,
args: Additional non-standard arguments needed in the definition of the basis. The standard arguments are the data (x and y), the order of the fit N, the pivot point about which a model can be fit, the derivative order \({m}\) and the params. While the pivot point is not strictly needed it is a required argument for the functions defining a new basis to help the user in their definition.
basis_functions: This function defines the basis of the DCF model, \({\phi}\) where the model can be generally defined as,
\[y = \sum_{k = 0}^N a_k \phi_k(x)\]where \({a_k}\) are the fit parameters.
model: This is the function described by the equation above.
derivative: This function defines the \({m^{th}}\) order derivative.
derivative_pre: This function defines the prefactors, \({\mathbf{G}}\) on the derivatives where
CVXOPT
, the quadratic programming routine used, evaluates the constraints as,\[\mathbf{Ga} \leq \mathbf{h}\]where \({\mathbf{a}}\) is the matrix of parameters and \({\mathbf{h}}\) is the matrix of constraint limits. For more details on this see the
maxsmooth
paper.
We can begin defining our new basis function by defining the additional arguments needed to fit the model as a list,
arguments = [x[-1]*10, y[-1]*10]
The next step is to define the basis functions \({\phi}\). This needs to be done in a function that has the arguments (x, y, pivot_point, N, *args). ‘args’ is optional but since we need them for this basis we are passing it in.
The basis functions, \({\phi}\), should be an array of dimensions len(x) by N and consequently evaluated at each N and x data point as shown below.
def basis_functions(x, y, pivot_point, N, *args):
phi = np.empty([len(x), N])
for h in range(len(x)):
for i in range(N):
phi[h, i] = args[1]*(x[h]/args[0])**i
return phi
We can define the model that we are fitting in a function like that shown below. This is used for evaluating \({\chi^2}\) and returning the optimum fitted model once the code has finished running. It requires the arguments (x, y, pivot_point, N, params, *args) in that order and again where ‘args’ is optional. ‘params’ is the parameters of the fit, \({\mathbf{a}}\) which should have length \({N}\).
The function should return the fitted estimate of y.
def model(x, y, pivot_point, N, params, *args):
y_sum = args[1]*np.sum([
params[i]*(x/args[0])**i
for i in range(N)], axis=0)
return y_sum
Next we have to define a function for the derivatives of the model which takes arguments (m, x, y, N, pivot_point, params, *args) where \({m}\) is the derivative order. The function should return the \({m^{th}}\) order derivative evaluation and is used for checking that the constraints have been met and returning the derivatives of the optimum fit to the user.
def derivative(m, x, y, N, pivot_point, params, *args):
mth_order_derivative = []
for i in range(N):
if i <= m - 1:
mth_order_derivative.append([0]*len(x))
for i in range(N - m):
mth_order_derivative_term = args[1]*np.math.factorial(m+i) / \
np.math.factorial(i) * \
params[int(m)+i]*(x)**i / \
(args[0])**(i + 1)
mth_order_derivative.append(
mth_order_derivative_term)
return mth_order_derivative
Finally we have to define \({\mathbf{G}}\) which is used by CVXOPT
to
build the derivatives and constrain the functions. It takes arguments
(m, x, y, N, pivot_point, *args) and should return the prefactor on the
\({m^{th}}\) order derivative. For a more thorough definition of the
prefactor on the derivative and an explanation of how the problem is
constrained in quadratic programming see the maxsmooth
paper.
def derivative_pre(m, x, y, N, pivot_point, *args):
mth_order_derivative = []
for i in range(N):
if i <= m - 1:
mth_order_derivative.append([0]*len(x))
for i in range(N - m):
mth_order_derivative_term = args[1]*np.math.factorial(m+i) / \
np.math.factorial(i) * \
(x)**i / \
(args[0])**(i + 1)
mth_order_derivative.append(
mth_order_derivative_term)
return mth_order_derivative
With our functions and additional arguments defined we can pass these
to the maxsmooth
smooth() function as is shown below. This overwrites the
built in DCF model but you are still able to modify the fit type i.e. testing all
available sign combinations or sampling them.
result = smooth(x, y, N,
basis_functions=basis_functions, model=model,
derivatives=derivative, der_pres=derivative_pre, args=arguments)
The output of the fit can be accessed as before,
print('Objective Funtion Evaluations:\n', result.optimum_chi)
print('RMS:\n', result.rms)
print('Parameters:\n', result.optimum_params)
print('Fitted y:\n', result.y_fit)
print('Sign Combinations:\n', result.optimum_signs)
print('Derivatives:\n', result.derivatives)
Best Basis Example¶
This function can be used to identify which of the built in DCFs fits the data best before running joint fits.
To use it we begin by loading in the data,
import numpy as np
x = np.load('Data/x.npy')
y = np.load('Data/y.npy')
and then importing the basis_test() function.
from maxsmooth.best_basis import basis_test
To call the function we use,
basis_test(x, y, base_dir='examples/', N=np.arange(3, 16, 1))
The function only requires the data but we can provide it with a base directory,
fit type and range of DCF orders to test. By defualt it uses the sign navigating
algorithm and tests \({N = 3 - 13}\). Here we test the range
:math:{N = 3 - 15}
.
The resultant graph is saved in the
base directory and the example generated here is shown below.
The graph shows us which basis is the optimum for solving this problem from the
built in library (that which can reach the minimum :math:{\chi^2}
). If we
were to go to higher N we would also find that the :math:{\chi^2}
value
would stop decreasing in value. The value of N for which this occurs at is the
optimum DCF order. (See the maxsmooth
paper for a real world application
of this concept.)
We can also provide this function with additional arguments such as the
fit type, minimum constrained derivative, directional exploration limits
ect. (see the maxsmooth
Functions section).
\({\chi^2}\) Distribution Example¶
This example will show you how to generate a plot of the \({\chi^2}\) distribution as a function of the discrete sign combinations on the constrained derivatives.
First you will need to import your data and fit this using maxsmooth
as
was done in the simple example code.
import numpy as np
x = np.load('Data/x.npy')
y = np.load('Data/y.npy')
from maxsmooth.DCF import smooth
N = 10
result = smooth(x, y, N, base_dir='examples/',
data_save=True, fit_type='qp')
Here we have used some additional keyword arguments for the ‘smooth’ fitting
function. ‘data_save’ ensures that the files containing the tested sign combinations
and the corresponding objective function evaluations exist in the base directory
which we have changed to ‘base_dir=’examples/’’. These files are essential for
the plotting the \({\chi^2}\) distribution and are not saved by maxsmooth
without ‘data_save=True’. We have also set the ‘fit_type’ to ‘qp’ rather than the
default ‘qp-sign_flipping’. This ensures that all of the available sign
combinations are tested rather than a sampled set giving us a full picture of the
distribution when we plot it. We have used the default DCF model to fit this data.
We can import the ‘chi_plotter’ like so,
from maxsmooth.chidist_plotter import chi_plotter
and produce the fit which gets placed in the base directory with the following code,
chi_plotter(N, base_dir='examples/', fit_type='qp')
We pass the same ‘base_dir’ as before so that the plotter can find the correct output files. We also give the function the same ‘fit_type’ used for the fitting which ensures that the files can be read.
The resultant plot is shown below and the yellow star shows the global minimum. This can be used to determine how well the sign sampling approach using a descent and directional exploration can find the global minimum. If the distribution looks like noise then it is unlikely the sign sampling algorithm will consistently find the global minimum. Rather it will likely repeatedly return the local minima found after the descent algorithm and you should use the ‘qp’ method testing all available sign combinations in any future fits to the data with this DCF model.
Parameter Plotter Example¶
We can assess the parameter space around the optimum solution
found using maxsmooth
with the param_plotter() function.
This can help us identify how well a problem can be solved using the
sign navigating approach employed by maxsmooth
or simply
be used to identify correlations between the foreground parameters.
For more details on this see the maxsmooth
paper.
We begin by importing and fitting the data as with the chi_plotter() function illustrated above.
import numpy as np
x = np.load('Data/x.npy')
y = np.load('Data/y.npy')
from maxsmooth.DCF import smooth
N = 5
result = smooth(x, y, N, base_dir='examples/', fit_type='qp')
We have changed the order of the fit to 5 to illustrate that for order \({N \leq 5}\) and fits with derivatives \({m \geq 2}\) constrained the function will plot each region of the graph corresponding to different sign combinations in a different colourmap. Recall that by default the function smooth() fits a maximally smooth function (MSF) with derivatives of order \({m \geq 2}\). If the constraints are different or the order is greater than 5 then the viable regions will have a single colourmap. Invalid regions are plotted as black shaded colourmaps and the contour lines are contours of \({\chi^2}\).
Specifically, invalid regions violate the condition
where \({m}\) represents the derivative order, \({y}\) is the dependent
variable and \({x}\) is the independent variable. Violation of the
condition means that one or more of the constrained derivatives crosses 0 in the
band of interest. For an MSF, as mentioned, \({m \geq 2}\) and the sign \({\pm_m}\)
applies to specific derivative orders. For this specific example there are
3 constrained derivatives, \({m = 2, 3, 4}\) and consequently 3 signs to
optimise for alongside the parameters \({a_k}\). The coloured valid regions
therefore correspond to a specific combination of \({\pm_m}\) for the problem.
\({\pm_m}\) is also referred to as \({\mathbf{s}}\) in the theory
section and the maxsmooth
paper.
We can import the function like so,
from maxsmooth.parameter_plotter import param_plotter
and access it using,
param_plotter(result.optimum_params, result.optimum_signs,
x, y, N, base_dir='examples/')
The function takes in the optimum parameters and signs found after the fit as well as the data and order of the fit. There are a number of keyword arguments detailed in the following section and the resultant fit is shown below. The function by default samples the parameter ranges 50% either side of the optimum and calculates 50 samples for each parameter. In each panel the two labelled parameters are varied while the others are maintained at their optimum values.
We are also able to plot the data, fit and residuals alongside the parameter plot and this can be done by setting data_plot=True. We can also highlight the central region in each panel of the parameter space by setting center_plot=True.
param_plotter(result.optimum_params, result.optimum_signs,
x, y, N, base_dir='examples/', data_plot=True, center_plot=True)
which gives us the graph below.
maxsmooth
Functions¶
This section details the specifics of the built in functions in maxsmooth
including
the relevant keyword arguments and default parameters for all. Where keyword arguments
are essential for the functions to run this is stated.
smooth()¶
smooth, as demonstrated in the examples section, is used to call the fitting routine. There are a number of \({^{**}}\) kwargs that can be assigned to the function which change how the fit is performed, the model that is fit and various other attributes. These are detailed below.
-
class
maxsmooth.DCF.
smooth
(x, y, N, **kwargs)[source]¶ Parameters:
- x: numpy.array
- The x data points for the set being fitted.
- y: numpy.array
- The y data points for fitting.
- N: int
- The number of terms in the DCF.
Kwargs:
- fit_type: Default = ‘qp-sign_flipping’
- This kwarg allows the user to switch between sampling the available discrete sign spaces (default) or testing all sign combinations on the derivatives which can be accessed by setting to ‘qp’.
- model_type: Default = ‘difference_polynomial’
- Allows the user to access default Derivative Constrained Functions built into the software. Available options include the default, ‘polynomial’, ‘normalised_polynomial’, ‘legendre’, ‘log_polynomial’, ‘loglog_polynomial’ and ‘exponential’. For more details on the functional form of the built in basis see the
maxsmooth
paper.
pivot_point: Default = len(x)//2 otherwise an integer between -len(x) and len(x)
Some of the built in models rely on pivot points in the data sets which by defualt is set as the middle index. This can be altered via this kwarg which can occasionally lead to a better quality fit.- base_dir: Default = ‘Fitted_Output/’
- The location of the outputted data from
maxsmooth
. This must be a string and end in ‘/’. If the file does not exist thenmaxsmooth
will create it. By default the only outputted data is a summary of the best fit but additional data can be recorded by setting the keyword argument ‘data_save = True’. - data_save: Default = False
- By setting this to True the algorithm will save every tested set of parameters, signs and objective function evaluations into files in base_dir. Theses files will be over written on repeated runs but they are needed to run the ‘chidist_plotter’.
- print_output: Default = 1
- The parameter can take a value of 0, 1, 2. If set to 2 this outputs to the terminal every fit performed by the algorithm. By default the parameter has a value of 1 and the only output is the optimal solution once the code is finished. Setting this to 0 will prevent any output to the terminal. This is useful when running the code inside a nested sampling loop for example.
- cvxopt_maxiter: Default = 10000 else integer
- This shouldn’t need changing for most problems however if
CVXOPT
fails with a ‘maxiters reached’ error message this can be increased. Doing so arbitrarily will however increase the run time ofmaxsmooth
. - initial_params: Default = None else list of length N
- Allows the user to overwrite the default initial parameters used by
CVXOPT
.
constraints: Default = 2 else an integer less than or equal to N - 1
The minimum constrained derivative order which is set by default to 2 for a Maximally Smooth Function.- zero_crossings: Default = None else list of integers
- Allows you to specify if the conditions should be relaxed on any of the derivatives between constraints and the highest order derivative. e.g. a 6th order fit with just a constrained 2nd and 3rd order derivative would have zero_crossings = [4, 5].
- cap: Default = (len(available_signs)//N) + N else an integer
- Determines the maximum number of signs explored either side of the minimum \({\chi^2}\) value found after the decent algorithm has terminated.
- chi_squared_limit: Default = 2 else float or int
- The prefactor on the maximum allowed increase in \({\chi^2}\) during the directional exploration which is defaulted at 2. If this value multiplied by the minimum \({\chi^2}\) value found after the descent algorithm is exceeded then the exploration in one direction is stopped and started in the other. For more details on this and ‘cap’ see the
maxsmooth
paper.
The following Kwargs can be used by the user to define their own basis function and will overwrite the ‘model_type’ kwarg.
basis_function: Default = None else function with parameters (x, y, pivot_point, N)
This is a function of basis functions for the quadratic programming. The variable pivot_point is the index at the middle of the datasets x and y by default but can be adjusted.model: Default = None else function with parameters (x, y, pivot_point, N, params)
This is a user defined function describing the model to be fitted to the data.der_pres: Default = None else function with parameters (m, x, y, N, pivot_point)
This function describes the prefactors on the mth order derivative used in defining the constraint.derivatives: Default = None else function with parameters (m, x, y, N, pivot_point, params)
User defined function describing the mth order derivative used to check that conditions are being met.- args: Default = None else list
- Extra arguments for smooth to pass to the functions detailed above.
Output
- .y_fit: numpy.array
- The fitted array of y data from smooth().
- .optimum_chi: float
- The optimum \({\chi^2}\) value for the fit calculated by,\[{X^2=\sum(y-y_{fit})^2}.\]
- .optimum_params: numpy.array
- The set of parameters corresponding to the optimum fit.
- .rms: float
- The rms value of the residuals \({y_{res}=y-y_{fit}}\) calculated by,\[{rms=\sqrt{\frac{\sum(y-y_{fit})^2}{n}}}\]
where \(n\) is the number of data points.
- .derivatives: numpy.array
- The \(m^{th}\) order derivatives.
- .optimum_signs: numpy.array
- The sign combinations corresponding to the optimal result. The nature of the constraint means that a negative
maxsmooth
sign implies a positive \({m^{th}}\) order derivative and visa versa.
best_basis()¶
As demonstrated, this function allows you to test the built in basis and their ability to fit the data. It produces a plot that shows chi squared as a function of \({N}\) for the 7 built in models and saves the figure to the base directory.
-
class
maxsmooth.best_basis.
basis_test
(x, y, **kwargs)[source]¶ Parameters:
- x: numpy.array
- The x data points for the set being fitted.
- y: numpy.array
- The y data points for fitting.
Kwargs:
- fit_type: Default = ‘qp-sign_flipping’
- This kwarg allows the user to switch between sampling the available discrete sign spaces (default) or testing all sign combinations on the derivatives which can be accessed by setting to ‘qp’.
- base_dir: Default = ‘Fitted_Output/’
- The location of the outputted graph from function. This must be a string and end in ‘/’. If the file does not exist then the function will create it.
N: Default = [3, .., 13] in steps of 1 else list or numpy array of integers
The DCF orders to test each basis function with. In some instances the basis function may fail for a given \({N}\) and higher orders due to overflow/underflow errors orCVXOPT
errors.pivot_point: Default = len(x)//2 otherwise an integer between -len(x) and len(x)
Some of the built in models rely on pivot points in the data sets which by defualt is set as the middle index. This can be altered via this kwarg which can occasionally lead to a better quality fit.constraints: Default = 2 else an integer less than or equal to N - 1
The minimum constrained derivative order which is set by default to 2 for a Maximally Smooth Function.- zero_crossings: Default = None else list of integers
- Allows you to specify if the conditions should be relaxed on any of the derivatives between constraints and the highest order derivative. e.g. a 6th order fit with just a constrained 2nd and 3rd order derivative would have zero_crossings = [4, 5].
- cap: Default = (len(available_signs)//N) + N else an integer
- Determines the maximum number of signs explored either side of the minimum \({\chi^2}\) value found after the decent algorithm has terminated.
- chi_squared_limit: Default = 2 else float or int
- The prefactor on the maximum allowed increase in \({\chi^2}\) during the directional exploration which is defaulted at 2. If this value multiplied by the minimum \({\chi^2}\) value found after the descent algorithm is exceeded then the exploration in one direction is stopped and started in the other. For more details on this and ‘cap’ see the
maxsmooth
paper. - cvxopt_maxiter: Default = 10000 else integer
- This shouldn’t need changing for most problems however if
CVXOPT
fails with a ‘maxiters reached’ error message this can be increased. Doing so arbitrarily will however increase the run time ofmaxsmooth
.
chidist_plotter()¶
This function allows the user to produce plots of the chi squared
distribution as a function of the available discrete sign spaces for the
constrained derivatives. This can be used to identify whether or not the
problem is ill defined, see the maxsmooth
paper for a definition,
and if it can be solved using the sign sampling approach.
It can also be used to determine whether or not the ‘cap’ and maximum allowed increase on the value of chi squared during the directional exploration are sufficient to identify the global minimum for the problem.
The function is reliant on the output of the maxsmooth
smooth() function.
The required outputs can be saved when running smooth()
using the ‘data_save = True’ kwarg.
-
class
maxsmooth.chidist_plotter.
chi_plotter
(N, **kwargs)[source]¶ Parameters:
- N: int
- The number of terms in the DCF.
Kwargs:
- fit_type: Default = ‘qp-sign_flipping’
- This kwarg is the same as for the smooth() function. Here it allows the files to be read from the base directory.
- base_dir: Default = ‘Fitted_Output/’
- The location of the outputted data from
maxsmooth
. This must be a string and end in ‘/’ and must contain the files ‘Output_Evaluations/’ and ‘Output_Signs/’ which can be obtained by running smooth() with data_save=True. - chi: Default = None else list or numpy array
- A list of chi squared evaluations. If provided then this is used over outputted data in the base directory. It must have the same length as the ouputted signs in the file ‘Output_Signs/’ in the base directory. It must also be ordered correctly otherwise the returned graph will not be correct. A correct ordering is one for which each entry in the array corresponds to the correct sign combination in ‘Output_Signs/’. Typically this will not be needed but if the chi squared evaluation in ‘Output_Evaluations/’ in the base directory is not in the desired parameter space this can be useful. For example the built in logarithmic model calculates chi squared in logarithmic space. To plot the distribution in linear space we can calculate chi squared in linear space using a function for the model and the tested parameters which are found in ‘Output_Parameters/’ in the base directory.
constraints: Default = 2 else an integer less than or equal to N - 1
The minimum constrained derivative order which is set by default to 2 for a Maximally Smooth Function. Used here to determine the number of possible sign combinations available.- zero_crossings: Default = None else list of integers
- Allows you to specify if the conditions should be relaxed on any of the derivatives between constraints and the highest order derivative. e.g. a 6th order fit with just a constrained 2nd and 3rd order derivative would have a zero_crossings = [4, 5]. Again this is used in determining the possible sign combinations available.
- plot_limits: Default = False
- Determines whether the limits on the directional exploration are plotted on top of the chi squared distribution.
- cap: Default = (len(available_signs)//N) + N else an integer
- Determines the maximum number of signs explored either side of the minimum chi squared value found after the decent algorithm has terminated when running smooth(). Here it is used when plot_limits=True.
- chi_squared_limit: Default = 2 else float or int
- The prefactor on the maximum allowed increase in chi squared during the directional exploration which is defaulted at 2. If this value multiplied by the minimum chi squared value found after the descent algorithm is exceeded then the exploration in one direction is stopped and started in the other. For more details on this and ‘cap’ see the
maxsmooth
paper. Again this is used here when plot_limits=True.
parameter_plotter()¶
This function allows you to plot the parameter space around the optimum
solution found when running maxsmooth
and visualise the constraints with
contour lines given by chi squared.
-
class
maxsmooth.parameter_plotter.
param_plotter
(best_params, optimum_signs, x, y, N, **kwargs)[source]¶ Parameters:
- best_params: numpy.array
- The optimum parameters found when running a DCF fit to the data.
- optimum_signs: numpy.array
- The optimum signs for the DCF fit which are used when the derivatives are equal to 0 across the band.
- x: numpy.array
- The x data points.
- y: numpy.array
- The y data points.
- N: int
- The number of terms in the DCF.
Kwargs:
- model_type: Default = ‘difference_polynomial’
- The functional form of the model being plotted. If a the user has defined their own basis they can supply this with the Kwargs below and this will be overwritten.
- base_dir: Default = ‘Fitted_Output/’
- The location in which the parameter plot is saved.
constraints: Default = 2 else an integer less than or equal to N - 1
The minimum constrained derivative order which is set by default to 2 for a Maximally Smooth Function. Used here to determine the number of possible sign combinations available.- zero_crossings: Default = None else list of integers
- Allows you to specify if the conditions should be relaxed on any of the derivatives between constraints and the highest order derivative. e.g. a 6th order fit with just a constrained 2nd and 3rd order derivative would have an zero_crossings = [4, 5]. Again this is used in determining the possible sign combinations available.
- samples: Default = 50
- The sampling rate across the parameter ranges defined with the optimum solution and width.
- width: Default = 0.5
- The range of each parameter to explore. The default value of 0.5 means that the \({\chi^2}\) values for parameters ranging 50% either side of the optimum result are tested.
- warnings: Default = True
- Used to highlight when a derivative is 0 across the band and that in these instances the optimum signs are assumed for the colourmap if \({N \leq 5}\), constraints=2 and the zero_crossings is empty.
- girdlines: Default = False
- Plots gridlines showing the central value for each parameter in each panel of the plot.
- center_plot: Default = False
- Setting this equal to True will highlight the central region with a white circle.
- data_plot: Default = False
- Setting to True will plot the data, fitted model and the residuals, \({y - y_{fit}}\), alongside the parameter graph.
The following Kwargs are used to plot the parameter space for a user defined basis function and will overwrite the ‘model_type’ kwarg.
basis_function: Default = None else function with parameters (x, y, pivot_point, N)
This is a function of basis functions for the quadratic programming. The variable pivot_point is the index at the middle of the datasets x and y by default but can be adjusted.model: Default = None else function with parameters (x, y, pivot_point, N, params)
This is a user defined function describing the model to be fitted to the data.der_pres: Default = None else function with parameters (m, x, y, N, pivot_point)
This function describes the prefactors on the mth order derivative used in defining the constraint.derivatives: Default = None else function with parameters (m, x, y, N, pivot_point, params)
User defined function describing the mth order derivative used to check that conditions are being met.- args: Default = None else list
- Extra arguments for smooth to pass to the functions detailed above.
Change Log¶
Unreleased changes are not yet included in the pip install but are pushed to the github.
Unrealeased¶
Version 1.1.0¶
- Two bug fixes in param_plotter()
- Extension of param_plotter() function to plot data, fit and residuals alongside the parameter space if required.
- Extension of param_plotter() to allow for highlighting of central regions in each panel if required.
- Inclusion of some theory into the documentation
Version 1.2.0¶
- Minor bug fix in param_plotter()
- Extension of the basis_test() function to allow users to compare different types of DCF not just MSFs.