Journal articles

Interpretable hierarchical symbolic regression for safety-critical systems with an application to highway crash prediction

Thomas Veran 1, 2, Pierre-Edouard Portier 1, François Fouquet
1 DRIM - Distribution, Recherche d'Information et Mobilité
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
2 BD - Base de Données
LIRIS - Laboratoire d'InfoRmatique en Image et Systèmes d'information
Abstract : We introduce a framework to discover interpretable regression models for high-stakes decision making in the context of safety-critical systems. The core of our proposal is a multi-objective hierarchical symbolic regression algorithm able to compute cluster-specific rankings of regression models ordered by increasing complexity. We discover the hierarchical structure by clustering the feature importances of a post-hoc explainability framework (viz., SHAP) applied to a highly flexible predictive model (viz., XGBoost). We rely on a symbolic regression algorithm based on the simulated annealing meta-heuristic to infer sparse linear models which may include non-linear effects (e.g., log-transforms, multiplicative interactions). This search is guided by two objectives: maximizing predictive performance and minimizing complexity. It ends with a list of Pareto-optimal models that fosters a dynamic interpretative process: the user navigates from the least to the most complex model and decides which ones he can trust depending on whether he understands them, and whether he is satisfied by the quantified uncertainty of their parameters and predictions. Our approach achieves promising results when compared to more than ten other interpretable or black-box predictive models on eleven public regression datasets of various volumes, dimensionalities or domains, and on a proprietary dataset for highway crash prediction. On this last dataset, we demonstrate the usefulness of our new ranking-by-complexity of inherently interpretable models.
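The abstract describes a search guided by two objectives that ends with a list of Pareto-optimal models ordered by increasing complexity. A minimal sketch of that final filtering and ranking step is given below. It is not the authors' implementation: the `Candidate` type, the `pareto_front` helper, the model formulas, and the error and complexity values are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record for one candidate model (illustrative only)."""
    formula: str
    error: float      # predictive error, lower is better
    complexity: int   # e.g. number of terms, lower is better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates.

    A candidate is dominated when another one is at least as good on both
    objectives and strictly better on at least one of them.
    The result is sorted from the least to the most complex model.
    """
    front = []
    for c in candidates:
        dominated = any(
            o.error <= c.error
            and o.complexity <= c.complexity
            and (o.error < c.error or o.complexity < c.complexity)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: (c.complexity, c.error))


# Made-up candidates; the numbers are not results from the paper.
candidates = [
    Candidate("y ~ x1", 0.40, 1),
    Candidate("y ~ x1 + log(x2)", 0.25, 2),
    Candidate("y ~ x1 + x2", 0.30, 2),
    Candidate("y ~ x1 + log(x2) + x1*x3", 0.20, 4),
    Candidate("y ~ x1 + x2 + x3 + x4", 0.27, 4),
]

ranking = pareto_front(candidates)
for model in ranking:
    print(model.complexity, model.error, model.formula)
```

In this toy setting, a reader would walk the printed list from top to bottom and stop at the most complex model they still understand and trust, which mirrors the interpretative process the abstract outlines.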

https://hal-cnrs.archives-ouvertes.fr/hal-03819953
Contributor : Pierre-Edouard Portier
Submitted on : Tuesday, October 18, 2022 - 4:45:55 PM
Last modification on : Saturday, October 22, 2022 - 5:28:37 AM

File

 Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until : 2023-10-19


Identifiers

  • HAL Id : hal-03819953, version 1

Citation

Thomas Veran, Pierre-Edouard Portier, François Fouquet. Interpretable hierarchical symbolic regression for safety-critical systems with an application to highway crash prediction. Engineering Applications of Artificial Intelligence, 2022. ⟨hal-03819953⟩
