Definition
A simple kind of two-way multiplication that depends on both inputs together.
A function linear in each of two arguments separately; in attention, queries and keys interact bilinearly via dot product, capturing pairwise structure that linear features alone cannot represent.