Data-driven site characterization - Focus on small-strain stiffness

Revision as of 14:42, 6 June 2024

Abstract

Non-linear soil behaviour complicates the selection of accurate parameters for numerical modelling. One of these parameters is the small-strain shear stiffness, which depends strongly on the soil mass density and the shear wave velocity; the latter can be determined through in-situ or laboratory tests. This paper focuses on training machine learning models to predict shear wave velocity from raw cone penetration test data. Three decision-tree-based algorithms are considered: XGBRegressor, HistGradientRegressor, and RandomForest. Several data preprocessing approaches, including noise removal and outlier identification, are investigated to assess their impact on model performance. The results indicate that the choice of preprocessing approach leads to significant differences in model performance. When applied to unseen raw data from a sand site of the Norwegian GeoTest Site, the model demonstrates promising predictive capabilities and is in good agreement with well-known correlations. This study underlines the importance of data quality and preprocessing for reliable machine learning models. To enhance transparency and reproducibility, a GitHub repository containing all files used is available online.
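To make the link between shear wave velocity and small-strain stiffness concrete, the following is a minimal sketch and not the code from the paper's repository. It assumes the standard elastic relation G0 = ρ·Vs² and illustrates one possible outlier-identification step, a median absolute deviation (MAD) filter. The function names, the MAD approach, and the threshold of 3.5 are illustrative assumptions; the abstract does not state which preprocessing methods were used.

```python
from statistics import median


def small_strain_shear_modulus(density_kg_m3, vs_m_s):
    """Return the small-strain shear modulus G0 in Pa, using G0 = rho * Vs**2.

    density_kg_m3: soil mass density in kg/m^3
    vs_m_s: shear wave velocity in m/s
    """
    if density_kg_m3 <= 0 or vs_m_s <= 0:
        raise ValueError("density and shear wave velocity must be positive")
    return density_kg_m3 * vs_m_s ** 2


def flag_outliers_mad(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold`.

    Hypothetical preprocessing step: the modified z-score is
    0.6745 * |x - median| / MAD. The threshold is an assumed default,
    not a value taken from the paper.
    """
    med = median(values)
    deviations = [abs(v - med) for v in values]
    mad = median(deviations)
    if mad == 0:
        # All deviations are (nearly) identical; nothing can be flagged.
        return [False] * len(values)
    return [0.6745 * d / mad > threshold for d in deviations]


# Example: a soil with density 1900 kg/m^3 and Vs = 200 m/s.
g0_pa = small_strain_shear_modulus(1900.0, 200.0)
g0_mpa = g0_pa / 1e6

# Example: made-up cone resistance readings (MPa) with one obvious spike.
readings = [5.0, 5.2, 4.9, 5.1, 50.0, 5.0]
flags = flag_outliers_mad(readings)
cleaned = [r for r, is_outlier in zip(readings, flags) if not is_outlier]
```

Because G0 scales with the square of Vs, a relative error in the predicted shear wave velocity roughly doubles in the derived stiffness, which is one reason the quality of the training data matters.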

Revision by JSanchez, 14:42, 6 June 2024
