Update tutorials #79
| "$$\n", | ||
| "\n", | ||
| "$$\\frac{\\mathrm{d} \\text{predator}}{\\mathrm{d} t} = \\text{NN}(\\text{prey}, \\text{predator})[1] - \\delta \\cdot \\text{predator}$$" | ||
| "Measurements of both `prey` and `predator` are assumed. The goal of this tutorial is to set up a PEtab-SciML problem for estimating both mechanistic parameters (`alpha`, `delta`) and neural-network parameters (`theta`)." |
Could it be misleading to mention measurements here? It is the model state of prey and predator that are inputs to the neural network, rather than the measurement data.
I think it is good to mention early which data we estimate the parameters from. I made it a bit more explicit that measurements of prey and predator are used for estimation:
Time-series measurements of the model states prey and predator are available. The goal of this tutorial is to set up a PEtab-SciML problem to estimate both mechanistic parameters (alpha, delta) and neural-network parameters (theta) from these measurements.
| "$$\n", | ||
| "\n", | ||
| "$$\\frac{\\mathrm{d} \\text{predator}}{\\mathrm{d} t} = \\text{NN}(\\text{prey}, \\text{predator})[1] - \\delta \\cdot \\text{predator}$$" | ||
| "Measurements of both `prey` and `predator` are assumed. The goal of this tutorial is to set up a PEtab-SciML problem for estimating both mechanistic parameters (`alpha`, `delta`) and neural-network parameters (`theta`)." |
Inline symbols for alpha, delta and theta might be nice, to clearly link them to the equations above.
This one is a bit tricky. In the SBML model and in all tables we use `alpha` instead of $\alpha$ in the entire text (including equations).
I have updated the PR to this end, but happy to discuss this further :)
| "# Machine-learning models in observables\n", | ||
| "\n", | ||
| "This guide covers how to include a machine learning (ML) model in the observable formula, which links the model output to the observed measurement data. We assume some familiarity with the getting started tutorial, which examines an entire PEtab SciML problem, while this guide focuses on the parts that are relevant to the observable use case. As a case study we will use the Lotka-Volterra ODE system:\n", | ||
| "Sometimes mechanistic models can be misspecified, or the mapping from model states to measurements may be only partially known. Both scenarios can be addressed by augmenting the observable formula in the PEtab problem with a neural network.\n", |
Can we frame "misspecified" in a different way? Perhaps referencing that mechanistic models are by necessity coarse grained?
I like "misspecified" here (but I am also a non-native speaker :)), because the model can be wrong simply because we are wrong (very common), but it might also be too coarse-grained. So I think it captures most scenarios.
What more specifically might be problematic with misspecified?
dilpath left a comment
Thanks! Some comments apply to all notebooks, e.g. naming of inputs/outputs in the mapping table.
| "The environment and example PEtab files to run this notebook are provided in the PEtab SciML repo." | ||
| "This introductory tutorial shows how to set up a PEtab-SciML problem using [AMICI](https://amici.readthedocs.io/en/latest/index.html). It walks through the main PEtab-SciML problem files and focuses on creating a problem where a neural network enters the model dynamics, which is often called a universal differential equation (UDE) (also referred to as a grey-box model or hybrid Neural ODE). Familiarity with the PEtab v2 format is assumed; see the [PEtab tutorial](https://petab.readthedocs.io/en/latest/v2/tutorial/tutorial.html).\n", | ||
| "\n", | ||
| "The tutorial is provided as a notebook, available [here](https://github.com/PEtab-dev/petab_sciml/blob/main/doc/examples/getting_started/getting_started.ipynb), and the corresponding PEtab-SciML problem files can be downloaded [here](https://github.com/PEtab-dev/petab_sciml/tree/main/doc/examples/getting_started)." |
This could be combined with the Environment section and simplified to e.g.
All files required to reproduce the results on this page are provided here. In particular, there is the Python 3 Jupyter notebook that generated this page, and the Python dependencies in requirements.txt.
I believe this entire section will be dropped when AMICI is updated, so I will leave this comment as a reminder.
| "PEtab v2 (and, by extension, PEtab-SciML) accepts dynamic models in common exchange formats (e.g. SBML, CellML, BioNetGen). In this tutorial, an SBML model is used since it is widely supported across PEtab-SciML importers.\n", | ||
| "\n", | ||
| "$$\\frac{\\mathrm{d} \\text{prey}}{\\mathrm{d} t} = \\alpha \\cdot \\text{prey} - \\beta$$\n", | ||
| "In PEtab-SciML, neural-network outputs are linked to the dynamic model by assigning them to parameters in the model file. Therefore, the parts of the equations to be learned must be represented as parameters. In this example, the interaction terms are replaced by the parameters `beta` and `gamma`, which are later mapped to the network outputs to form a UDE. Thus, the model file corresponds to:\n", |
`gamma` is a prior distribution in PEtab v2, so maybe change it to `gamma_`? `beta` also used to be an issue, I think, because sympy can convert it to the beta function when imported with AMICI, so I usually use `beta_` as well just to be safe, but I guess it's OK now...
This is an interesting point I did not think about. But as alpha, beta, gamma and delta are the canonical parameters of this LV system, let's hope no problems arise with AMICI.
Actually I was wrong. gamma is not a reserved keyword in PEtab v2 even though it's a prior distribution keyword, which makes sense because if gamma appears in the priorDistribution column then it's clear it's the distribution, and if it appears anywhere else then it's clear that it's the parameter.
So, not even an issue I think.
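For readers following this thread, the placeholder mechanism described in the diff above can be sketched roughly as follows. This is only an illustration, not how AMICI or any PEtab-SciML importer implements it: the function names, the `toy_nn` stand-in, the numeric values, and the assumption that `beta` receives NN output `[0]` and `gamma` receives NN output `[1]` are all hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple


def mechanistic_rhs(prey: float, predator: float, p: Dict[str, float]) -> Tuple[float, float]:
    """Right-hand side as written in the model file: beta and gamma are plain parameters."""
    d_prey = p["alpha"] * prey - p["beta"]
    d_predator = p["gamma"] - p["delta"] * predator
    return d_prey, d_predator


def ude_rhs(
    prey: float,
    predator: float,
    p: Dict[str, float],
    nn: Callable[[float, float], Sequence[float]],
) -> Tuple[float, float]:
    """Overwrite the placeholder parameters with NN outputs evaluated at the current state."""
    out = nn(prey, predator)
    # Assumption: output 0 replaces beta, output 1 replaces gamma.
    hybrid = dict(p, beta=out[0], gamma=out[1])
    return mechanistic_rhs(prey, predator, hybrid)


def toy_nn(prey: float, predator: float) -> Tuple[float, float]:
    """Hypothetical stand-in for a trained network (here: the classic LV interaction terms)."""
    return 0.9 * prey * predator, 0.8 * prey * predator


params = {"alpha": 1.3, "delta": 1.8, "beta": 0.0, "gamma": 0.0}
d_prey, d_predator = ude_rhs(2.0, 1.0, params, toy_nn)
```

The point of the sketch is that the model file never needs to know about the network: `beta` and `gamma` stay ordinary parameter names, which is also why their spelling only matters to the importer.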
| "metadata": {}, | ||
| "source": [ | ||
| "Note that where any specific network layers or parameters are referenced in the ``mapping.tsv``, it should refer to them by the layer ids in this file." | ||
| "Here, `nn_model_id` is the unique neural-network model ID, which is used throughout the PEtab-SciML problem to refer to this neural network (e.g. in the mapping table and problem yaml file)." |
Replace neural-network, neural network, net, and network everywhere with NN? Just like we use ODE for the mechanistic model everywhere.
This is a v1 conditions table.
In v2 (and probably also v1), the targetId in a v2 condition table cannot appear as a parameterId in a v2 parameter table.
Hence, we could add a note that only one of the two options presented here is possible (either in the parameter table for all conditions, or in the condition table for condition-specific... or in the array data file).
Thanks, somehow I completely missed this section when updating the tutorials
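The constraint raised in this thread (a `targetId` in the condition table must not also be a `parameterId` in the parameter table) could be checked with something like the sketch below. It uses only the standard library and is not a PEtab linter call; the example rows, the ID `net1_input0`, and every column other than `targetId` and `parameterId` are made up for illustration.

```python
import csv
import io
from typing import Dict, List


def read_tsv(text: str) -> List[Dict[str, str]]:
    """Parse a tab-separated table into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def overlapping_ids(
    condition_rows: List[Dict[str, str]], parameter_rows: List[Dict[str, str]]
) -> List[str]:
    """Return IDs that appear both as a condition targetId and as a parameterId."""
    targets = {row["targetId"] for row in condition_rows}
    parameters = {row["parameterId"] for row in parameter_rows}
    return sorted(targets & parameters)


# Hypothetical tables: net1_input0 is (incorrectly) defined in both places.
condition_tsv = "conditionId\ttargetId\ttargetValue\ncond1\tnet1_input0\t1.0\n"
parameter_tsv = (
    "parameterId\testimate\tnominalValue\n"
    "alpha\ttrue\t1.3\n"
    "net1_input0\tfalse\t1.0\n"
)

conflicts = overlapping_ids(read_tsv(condition_tsv), read_tsv(parameter_tsv))
```

A non-empty `conflicts` list would indicate that the same input is being set in both tables, which is the situation the note in the tutorial should rule out.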
| "Let's load the PEtab problem so that we can examine the contents of the relevant PEtab tables." | ||
| "## Defining ML models in observable formulas\n", | ||
| "\n", | ||
| "An ML model is used in an observable formula by (1) mapping the neural-network output to a PEtab identifier in the mapping table, (2) referencing that mapped output in the observables table, and (3) specifying the neural-network inputs in the hybridization table.\n", |
No, for this kind of hybridization the input must be provided in the hybridization table (we do not want to change input equation depending on condition)
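As a rough illustration of the three steps in the diff above and of the point made in this reply, the sketch below checks that every mapped NN output is referenced in some observable formula and that every mapped NN input has a row in the hybridization table. The tables are represented as plain dictionaries, and all IDs (`net1_input0`, `net1_output0`, `obs_prey`) and the `net1.inputs[...]` / `net1.outputs[...]` naming are assumptions made for this example, not a statement of the PEtab-SciML specification.

```python
import re
from typing import Dict, List


def find_wiring_problems(
    mapping: Dict[str, str],
    observables: Dict[str, str],
    hybridization: Dict[str, str],
) -> List[str]:
    """List NN outputs unused in observables and NN inputs missing from hybridization."""
    used = set()
    for formula in observables.values():
        # Collect identifier-like tokens from each observable formula.
        used.update(re.findall(r"[A-Za-z_]\w*", formula))

    problems = []
    for petab_id, nn_entity in mapping.items():
        if ".outputs" in nn_entity and petab_id not in used:
            problems.append(f"output {petab_id} is not used in any observable formula")
        if ".inputs" in nn_entity and petab_id not in hybridization:
            problems.append(f"input {petab_id} has no row in the hybridization table")
    return problems


# Hypothetical tables: PEtab ID -> NN entity, observable ID -> formula, target ID -> value.
mapping = {
    "net1_input0": "net1.inputs[0][0]",
    "net1_input1": "net1.inputs[0][1]",
    "net1_output0": "net1.outputs[0][0]",
}
observables = {"obs_prey": "net1_output0"}
hybridization = {"net1_input0": "prey", "net1_input1": "predator"}

problems_complete = find_wiring_problems(mapping, observables, hybridization)
problems_missing = find_wiring_problems(mapping, observables, {"net1_input0": "prey"})
```

In the second call the input `net1_input1` is left out of the hybridization table, so the check reports it, mirroring the requirement that inputs for this kind of hybridization are given there rather than per condition.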
Co-authored-by: BSnelling <branwen.snelling@crick.ac.uk> Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@           Coverage Diff           @@
##             main      #79   +/-   ##
=======================================
  Coverage   94.01%   94.01%
=======================================
  Files           6        6
  Lines         301      301
=======================================
  Hits          283      283
  Misses         18       18
Thanks for the feedback! I have now implemented it. Let's now wait for AMICI to update, then we can update the code, and finally merge this PR.
Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com>
…_sciml into update_tutorials
This PR updates the PEtab-SciML tutorials to:
data.
The tutorial text is ready for review. The code snippets and accompanying PEtab files are
currently outdated and will be updated once the linter is in place. This can be done at a
later stage, and the PR should not be merged until then.
This PR is related to completing #23