When do neural nets outperform boosted trees on tabular data?

Otherwise, tree ensembles continue to outperform neural networks. The decision tree in the figure shows the winner among the top five methods.

Now, the background:

I explored the why of this question before, but didn’t get very far. This may be expected, given the black-box and data-driven nature of these methods.

This is another study, this time testing larger tabular datasets. By comparing 19 methods on 176 datasets, this paper shows that ๐—ณ๐—ผ๐—ฟ ๐—ฎ ๐—น๐—ฎ๐—ฟ๐—ด๐—ฒ ๐—ป๐˜‚๐—บ๐—ฏ๐—ฒ๐—ฟ ๐—ผ๐—ณ ๐—ฑ๐—ฎ๐˜๐—ฎ๐˜€๐—ฒ๐˜๐˜€, ๐—ฒ๐—ถ๐˜๐—ต๐—ฒ๐—ฟ ๐—ฎ ๐˜€๐—ถ๐—บ๐—ฝ๐—น๐—ฒ ๐—ฏ๐—ฎ๐˜€๐—ฒ๐—น๐—ถ๐—ป๐—ฒ ๐—บ๐—ฒ๐˜๐—ต๐—ผ๐—ฑ ๐—ฝ๐—ฒ๐—ฟ๐—ณ๐—ผ๐—ฟ๐—บ๐˜€ ๐—ฎ๐˜€ ๐˜„๐—ฒ๐—น๐—น ๐—ฎ๐˜€ ๐—ฎ๐—ป๐˜† ๐—ผ๐˜๐—ต๐—ฒ๐—ฟ ๐—บ๐—ฒ๐˜๐—ต๐—ผ๐—ฑ, ๐—ผ๐—ฟ ๐—ฏ๐—ฎ๐˜€๐—ถ๐—ฐ ๐—ต๐˜†๐—ฝ๐—ฒ๐—ฟ๐—ฝ๐—ฎ๐—ฟ๐—ฎ๐—บ๐—ฒ๐˜๐—ฒ๐—ฟ ๐˜๐˜‚๐—ป๐—ถ๐—ป๐—ด ๐—ผ๐—ป ๐—ฎ ๐˜๐—ฟ๐—ฒ๐—ฒ-๐—ฏ๐—ฎ๐˜€๐—ฒ๐—ฑ ๐—ฒ๐—ป๐˜€๐—ฒ๐—บ๐—ฏ๐—น๐—ฒ ๐—บ๐—ฒ๐˜๐—ต๐—ผ๐—ฑ ๐—ถ๐—บ๐—ฝ๐—ฟ๐—ผ๐˜ƒ๐—ฒ๐˜€ ๐—ฝ๐—ฒ๐—ฟ๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐—ป๐—ฐ๐—ฒ ๐—บ๐—ผ๐—ฟ๐—ฒ ๐˜๐—ต๐—ฎ๐—ป ๐—ฐ๐—ต๐—ผ๐—ผ๐˜€๐—ถ๐—ป๐—ด ๐˜๐—ต๐—ฒ ๐—ฏ๐—ฒ๐˜€๐˜ ๐—ฎ๐—น๐—ด๐—ผ๐—ฟ๐—ถ๐˜๐—ต๐—บ.

This project also comes with a great resource. This time it comes with a ready-to-use codebase and testbed along with the paper.

Source