Fitting
In FiberSim, fitting a model means trying to find the set of parameter values that yields the optimal match to some target data.
In principle, fitting sounds relatively easy - few readers of this documentation would struggle to fit a straight line to data, which implies finding the best combination of m and c for y = m x + c.
In practice, fitting FiberSim models is typically more challenging, and sometimes feels more like an art than a science.
There are several reasons:
- FiberSim models take a long time (typically tens of seconds) to run, so testing each potential combination of parameters is time-consuming.
- There are lots of parameters one could adjust - how does one decide which parameters to try?
- FiberSim is stochastic, so each simulation has some random uncertainty associated with it. The uncertainty can be reduced by averaging over more filaments, but this adds to the computational time.
- FiberSim is complicated so each test involves generating, or at least organizing, different combinations of model, protocol, and options files.
- How should the fit between the simulation and the target data be evaluated?
This set of demonstrations provides some examples. If you want to fit your own models, it’s likely that you are involved in some sort of active research. Please reach out if you need more help with a specific idea.
Overview
The basic sequence is as follows.
Defining the fitting process
Fitting procedures are defined and initialized by a setup file. An example is shown below. Several components are identical to those used for parameter adjustments.

    "model":
    {
        "relative_to": "this_file",
        "options_file": "sim_options.json",
        "fitting":
        {
            "base_model": "model.json",
            "generated_folder": "../generated",
            "working_folder": "../working",
            "progress_folder": "../progress",
            "Python_objective_call": "../Python_code/return_fit.py",
            "optimizer": "particle_swarm",
            "initial_guess": [0.5, 0.5, 0.5],
            "single_run": "False",
            "adjustments":
            [
                {
                    "variable": "m_kinetics",
                    "isotype": 1,
                    "state": 2,
                    "transition": 1,
                    "parameter_number": 1,
                    "factor_bounds": [-1, 1],
                    "factor_mode": "log"
                },
                {
                    "variable": "m_kinetics",
                    "isotype": 1,
                    "state": 3,
                    "transition": 1,
                    "parameter_number": 1,
                    "factor_bounds": [-1, 1],
                    "factor_mode": "log"
                },
                {
                    "class": "thin_parameters",
                    "variable": "a_k_on",
                    "output_type": "float",
                    "factor_bounds": [0.5, 1.5]
                }
            ]
        }
    }

The fitting section is as follows.
Parameter | Explanation |
---|---|
base_model | Path to the original model which will be adjusted in successive iterations to optimize the fit to the target data |
generated_folder | Path to a folder that will contain the ‘adjusted’ files |
working_folder | Path to another folder that contains files which define the current iteration |
progress_folder | Path to a folder that contains files that describe how the fit is evolving and the best fit obtained to date |
Python_objective_call | Path to a Python script that compares the output from a simulation to the target data and returns a single numerical value that quantifies the fit. In most cases, this will be some sort of least-squares error. Note that unless the code from a demo can be repurposed, users need to provide their own Python code. |
optimizer | particle_swarm or any method supported by scipy.optimize.minimize |
initial_guess | (optional) vector of p_values - see below |
single_run | True - runs a single trial and stops (useful for trouble-shooting) - or False to run a fit |
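
As an illustration of the kind of single number an objective script might return, here is a minimal sketch of a least-squares error between a simulated trace and a target trace. The function name `normalized_mean_squared_error`, the scaling by the range of the target data, and the sample values are all assumptions made for this example. It does not show how FiberSim passes file paths to the script or collects the result, so check the script in a demo for the actual interface.

```python
def normalized_mean_squared_error(simulated, target):
    """Return a single number quantifying the mismatch between two traces.

    Both arguments are equal-length sequences of floats sampled at the
    same time-points (an assumption of this sketch).
    """
    if len(simulated) != len(target):
        raise ValueError("simulated and target must have the same length")
    if len(target) == 0:
        raise ValueError("traces must not be empty")

    # Scale by the range of the target data so that the error does not
    # depend on the units of the measurement
    span = max(target) - min(target)
    if span == 0:
        span = 1.0

    total = 0.0
    for s, t in zip(simulated, target):
        total += ((s - t) / span) ** 2
    return total / len(target)


# Hypothetical force values (arbitrary units)
target_force = [0.0, 10.0, 20.0, 30.0]
good_fit = [0.0, 11.0, 19.0, 30.0]
poor_fit = [5.0, 5.0, 5.0, 5.0]

print(normalized_mean_squared_error(good_fit, target_force))
print(normalized_mean_squared_error(poor_fit, target_force))
```

A lower value means a closer match, so an optimizer would prefer the parameters that produced the first trace over those that produced the second.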
Adjustments and p vectors
Models are adjusted for successive iterations using a p_vector and an array of parameter adjustments.
The length of the p_vector is the same as the number of adjustments. Each element of the vector is constrained between 0 and 1. The mapping between the p_vector element and the parameter value in the model is defined by the adjustment.
Fitting thus requires finding the p_vector that produces the optimal fit to the experimental data.
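
The two constraints on a p_vector (one element per adjustment, each element between 0 and 1) can be checked before starting a long fit. The sketch below uses a hypothetical helper, `check_p_vector`, which is not part of FiberSim, and a trimmed-down dictionary standing in for the fitting section of a setup file. Whether FiberSim performs similar checks itself is not assumed here.

```python
def check_p_vector(p_vector, adjustments):
    """Raise ValueError if p_vector is inconsistent with the adjustments."""
    # One element of the p_vector is needed for each adjustment
    if len(p_vector) != len(adjustments):
        raise ValueError(
            "p_vector has %d elements but there are %d adjustments"
            % (len(p_vector), len(adjustments)))

    # Each element must lie between 0 and 1 (inclusive)
    for i, p in enumerate(p_vector):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p_vector[%d] = %g is outside [0, 1]" % (i, p))
    return True


# A trimmed-down stand-in for the "fitting" section of a setup file
fitting = {
    "initial_guess": [0.5, 0.5, 0.5],
    "adjustments": [
        {"variable": "m_kinetics", "factor_bounds": [-1, 1],
         "factor_mode": "log"},
        {"variable": "m_kinetics", "factor_bounds": [-1, 1],
         "factor_mode": "log"},
        {"variable": "a_k_on", "factor_bounds": [0.5, 1.5]},
    ],
}

check_p_vector(fitting["initial_guess"], fitting["adjustments"])
```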
The adjustments are best explained by example.

    {
        "variable": "m_kinetics",
        "isotype": 1,
        "state": 2,
        "transition": 1,
        "parameter_number": 1,
        "factor_bounds": [-1, 1],
        "factor_mode": "log"
    }

As described in parameter adjustments, this section focuses on
- m_kinetics
- isotype = 1
- state = 2
- transition = 1
- parameter_number = 1
factor_bounds is a two-element array that defines the minimum and maximum scaling of the value in the base_model.
factor_mode is optional. When it is log, the factor_bounds are interpreted in log10 space, so the scaling factor is 10 raised to the interpolated value.
p = 0 maps to the lower bound. p = 1 maps to the upper bound. Intermediate values are interpolated linearly.
Assuming the parameter value in the base_model is 100:
p | Value in new model |
---|---|
0 | 10^-1 * 100 = 10 |
0.5 | 10^0 * 100 = 100 |
1 | 10^1 * 100 = 1000 |
Additional example
{
"class": "thin_parameters",
"variable": "a_k_on",
"output_type": "float",
"factor_bounds": [0.5, 2.5]
}
Assuming the parameter value in base_model is 1e7:
p | Value in new model |
---|---|
0 | 0.5 * 1e7 = 5e6 |
0.5 | 1.5 * 1e7 = 1.5e7 |
1 | 2.5 * 1e7 = 2.5e7 |
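
Both tables above can be reproduced with a few lines of Python. This is a minimal sketch of the mapping as described in this section, not FiberSim's own code. The function name `p_to_value` is hypothetical, and the sketch assumes that when factor_mode is absent, the interpolated value is used directly as the scaling factor.

```python
def p_to_value(p, base_value, factor_bounds, factor_mode=None):
    """Map a p value in [0, 1] to a parameter value in the new model."""
    lower, upper = factor_bounds

    # p = 0 gives the lower bound, p = 1 the upper bound,
    # with linear interpolation in between
    interpolated = lower + p * (upper - lower)

    if factor_mode == "log":
        # Bounds are exponents of 10
        factor = 10.0 ** interpolated
    else:
        # Assumption: without factor_mode, the bounds are
        # the scaling factors themselves
        factor = interpolated

    return factor * base_value


# First example: log mode, base value of 100
for p in (0, 0.5, 1):
    print(p, p_to_value(p, 100, [-1, 1], "log"))

# Second example: linear mode, base value of 1e7
for p in (0, 0.5, 1):
    print(p, p_to_value(p, 1e7, [0.5, 2.5]))
```

Under this mapping, p = 0.5 returns the unmodified base value for log bounds of [-1, 1] and for linear bounds of [0.5, 1.5], so an initial_guess of [0.5, 0.5, 0.5] with those bounds starts the fit at the base model.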