ISFOG2020 data science challenge

Data science techniques are rapidly transforming businesses in a broad range of sectors. While marketing and social applications have received the most attention to date, geotechnical engineering can also benefit from data science tools, which are now readily available.

There is a lot of buzz around machine learning and AI. Some posts claim it will completely transform the way we work. But what I had not yet seen was an in-depth discussion between members of the community on how we can benefit from these methods and how they can be applied in practice. Before believing that data science will fundamentally change my job, I had to see for myself what machine learning algorithms are capable of and how they can be put to good use.

In August 2020, the 4th International Symposium on Frontiers in Offshore Geotechnics will take place in Austin, TX. Since data science is clearly a frontier area, the organising committee believed it would be a good idea to launch a community-driven prediction exercise. The example chosen for this is a regression problem in which the number of hammer blows required for pile driving needs to be predicted. The machine learning algorithm is trained on a number of locations where the observed blowcount is available and is then used to make predictions on unseen data.

Image source: Cathie Group

The competition is hosted on Kaggle and is open to anyone, including novice users of data science techniques. A cleaned dataset is provided and a tutorial using linear regression modelling is made available to introduce the concepts and get people started quickly.
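To give a feel for the workflow, here is a minimal sketch of a linear regression trained on some locations and tested on an unseen one. It is not the competition tutorial: the data are synthetic, the single feature (depth), the location names and the helper functions `fit_line` and `rmse` are all assumptions made up for illustration, and the real dataset contains other inputs.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(y_true, y_pred):
    """Root mean squared error between observations and predictions."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5


# Synthetic, made-up records: (location, depth, blowcount).
# The linear trend and the noise level are assumptions, not real pile data.
rng = random.Random(0)
records = []
for location in ["A", "B", "C", "D", "E"]:
    for i in range(1, 21):
        depth = 1.5 * i
        blowcount = 10.0 + 2.5 * depth + rng.gauss(0.0, 3.0)
        records.append((location, depth, blowcount))

# Train on four locations, keep the fifth one unseen for testing.
train = [r for r in records if r[0] != "E"]
test = [r for r in records if r[0] == "E"]

slope, intercept = fit_line([r[1] for r in train], [r[2] for r in train])
predicted = [slope * r[1] + intercept for r in test]
test_rmse = rmse([r[2] for r in test], predicted)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, test RMSE={test_rmse:.2f}")
```

Holding out a complete location, rather than random rows, mimics the situation in the exercise where predictions are needed at positions the model has never seen.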

The most important aspect of this exercise is to get discussion going on the following questions:

·     How confident can we be in our predictions when using machine learning models?

·     How well do machine learning models perform when trained with limited data (which is often the case in offshore geotechnical engineering)?

·     Are these models just black boxes or can they also be used to learn more about the physics of the problem?

·     Can we somehow feed our engineering knowledge into these models?
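One possible way to start exploring the first two questions is the bootstrap: refit the model many times on resampled versions of a small dataset and look at how much the prediction moves. The sketch below is only an illustration under assumptions. The data are synthetic, the single depth feature and the function names `fit_line` and `bootstrap_predictions` are hypothetical, and the resulting range reflects uncertainty in the fitted trend only, not the scatter of individual observations.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_predictions(x, y, x_new, n_resamples, rng):
    """Refit on resampled data and collect the predictions at x_new, sorted."""
    n = len(x)
    preds = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # skip degenerate resamples with a single distinct x value
        slope, intercept = fit_line(xs, ys)
        preds.append(slope * x_new + intercept)
    return sorted(preds)


# A deliberately small synthetic dataset (12 made-up points).
rng = random.Random(1)
depth = [2.0 * i for i in range(1, 13)]
blowcount = [10.0 + 2.5 * z + rng.gauss(0.0, 3.0) for z in depth]

preds = bootstrap_predictions(depth, blowcount, 20.0, 1000, rng)

# Approximate 5th and 95th percentiles of the bootstrap predictions.
lower = preds[len(preds) * 5 // 100]
upper = preds[len(preds) * 95 // 100]

print(f"prediction at depth 20.0: between {lower:.1f} and {upper:.1f}")
```

Rerunning this with fewer points should widen the range, which gives a rough, hands-on impression of what limited data does to a model's predictions.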

The competition attracted broad interest from motivated individuals around the globe. Initial results look promising, with the best predictions still making use of the engineering knowledge on pile driving that was developed before machine learning toolboxes were readily available.

Example of observed (blue) and predicted blowcount (orange)

The results of the competition will be presented at the ISFOG2020 conference in Austin, TX.

Have fun predicting!

Bruno

About Snakesonabrain

Working at the crossroads of engineering, software development and data science. I love efficiency and clean, transparent calculations. I fell in love with Python and like to share knowledge, so have a look around and let me know what you think.
