Nicolas MARTIN
Apr 20, 2021

--

Thanks a lot Tivadar Danka for this article.

The softmax output shouldn't be regarded as a probability distribution anymore.

My guess is that it was initially inspired by statistical physics, but without taking into account the energy behaviour that allows effective differentiation.

A recent paper applies a new kind of softmax based on energy functions (which is more natural) and reports very good results, though only in the out-of-distribution context. Do you think we should replace softmax with this new energy-based function?

https://arxiv.org/pdf/2010.03759.pdf
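
To make the question concrete, here is a minimal sketch of the energy score as I understand it from the linked paper, E(x) = -T * log(sum_i exp(f_i(x) / T)), next to a plain softmax. The function names, the toy logits, and the default temperature are my own assumptions for illustration and are not taken from the paper's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * log(sum_i exp(f_i(x) / T)), via a stable log-sum-exp."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp


# Toy logits (hypothetical): the second vector is the first one shifted by -9.
confident = [10.0, 0.0, 0.0]
shifted = [z - 9.0 for z in confident]

# Softmax is invariant to a constant shift of the logits...
print(softmax(confident))
print(softmax(shifted))

# ...while the energy score moves by exactly that shift.
print(energy_score(confident))
print(energy_score(shifted))
```

In this toy case both logit vectors give the same softmax probabilities, but their energies differ by the size of the shift, so the energy keeps information about the overall magnitude of the logits that softmax discards. As far as I can tell, that is the property the paper relies on in the out-of-distribution setting.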

--

Nicolas MARTIN

Full Stack Data Scientist. Topics: Deep learning, mathematics, manufacturing engineering, history. Creator of https://www.airoomstyles.com