Definition
Wasserstein distance measures how far one probability distribution is from another by the minimum cost of “moving mass” from one shape to the other — geometry-aware, where KL divergence is not. It’s the metric of choice for many generative-modeling and distribution-matching problems.