The Rényi entropies are a family of entropy measures that includes the well-known Shannon entropy along with a bunch of other more obscure ones that sometimes crop up. (I got interested in this when a funny entropy measure, the collision entropy, popped up in a paper I was reading.)

For simplicity I'm going to use an example of a system with four states. The probabilities of being in each state are labelled \((p_1, p_2, p_3, p_4)\).

Entropy is connected to how 'spread out' these probabilities are. For example, \((\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4})\) is maximally spread out over the four states and has the maximum entropy. At the other end of the scale, \((1, 0, 0, 0)\) is fully concentrated on one state and will have a low entropy.

The Rényi entropies all use a measure of concentration called majorisation, which works in the following way:

First, reorder the elements of the vector in descending order and take the running partial sums: the largest element, then the sum of the two largest, and so on. If all the partial sums of one vector \(x\) are bigger than those of another vector \(y\), we say that \(x\) majorises \(y\). For the example vectors above, \((1, 0, 0, 0)\) majorises the others because each partial sum is higher than for the other vectors - it gets to 1 immediately and stays there. Whereas \((\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4})\) is majorised by everything - it's the slowest way to get to 1.

If vector \(x\) majorises vector \(y\), it's more concentrated and should have lower entropy - writing the entropy as \(H\), you want \(H(x) < H(y)\). Functions of this sort are called Schur concave.

The Rényi entropies all have this same Schur concave behaviour. They depend on a free parameter \(p\), and roughly speaking the larger \(p\) is the more weight it gives to more probable states. More precisely, the Rényi entropy is related to the \(p\)-norm (actually let's call it the \(\alpha\)-norm as we already have \(p\) for probability knocking around). For a vector of probabilities \(P = (p_1, p_2, \ldots, p_n)\) the Rényi entropy \(H_\alpha\) can be written

\[H_\alpha = \frac{1}{1-\alpha} \log \left( \sum_{i=1}^n p_i^\alpha \right),\]

or equivalently as \(\frac{\alpha}{1-\alpha} \log \|P\|_\alpha\) in terms of the \(\alpha\)-norm. Either way the formula messes up for \(\alpha = 1\), where the \(\frac{1}{1-\alpha}\) prefactor blows up. The limiting case for \(\alpha \rightarrow 1\) turns out to be the Shannon entropy.

As ever, people mostly just care about \(0\), \(1\), \(2\) and \(\infty\). The \(\alpha \rightarrow 0\) max-entropy case weights everything nonzero equally. The \(\alpha \rightarrow \infty\) min-entropy only cares about the highest probability event and ignores the rest.

The \(\alpha = 2\) case is an interesting one. It's called the collision entropy, and because it squares the probabilities it ends up weighting the most probable events more highly than the Shannon entropy would. I can see some vague link to collisions, as you're taking the square of each probability, so they're something like interaction terms.
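To make the pieces above concrete, here's a minimal Python sketch (my own illustration, not from the paper I mentioned; it uses natural logarithms, and the distribution \((0.5, 0.25, 0.15, 0.1)\) at the end is just made up for the demonstration). It checks majorisation via partial sums, computes \(H_\alpha\) with the \(\alpha = 0, 1, \infty\) limiting cases handled separately, and estimates by sampling that the \(\alpha = 2\) collision entropy is \(-\log\) of the probability that two independent draws land on the same state.

```python
import numpy as np

def majorises(x, y):
    """True if x majorises y: every partial sum of x (sorted descending)
    is at least the corresponding partial sum of y (sorted descending)."""
    cx = np.cumsum(np.sort(x)[::-1])
    cy = np.cumsum(np.sort(y)[::-1])
    return bool(np.all(cx >= cy))

def renyi_entropy(p, alpha):
    """Rényi entropy H_alpha of a probability vector p (natural log)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # zero-probability states contribute nothing
    if alpha == 1:                    # limiting case: Shannon entropy
        return -np.sum(p * np.log(p))
    if alpha == 0:                    # max-entropy: counts the nonzero states equally
        return np.log(len(p))
    if np.isinf(alpha):               # min-entropy: only the most probable state matters
        return -np.log(np.max(p))
    return np.log(np.sum(p ** alpha)) / (1 - alpha)

# The two example distributions over four states from the post
uniform = [0.25, 0.25, 0.25, 0.25]   # maximally spread out
peaked  = [1.0, 0.0, 0.0, 0.0]       # fully concentrated on one state

print(majorises(peaked, uniform), majorises(uniform, peaked))  # True, False

for alpha in [0, 1, 2, np.inf]:
    print(alpha, renyi_entropy(uniform, alpha), renyi_entropy(peaked, alpha))
# uniform gives log(4) ≈ 1.386 for every alpha; peaked gives 0 for every alpha,
# so the more concentrated (majorising) vector always has the lower entropy.

# Collision reading of alpha = 2: sum(p_i^2) is the probability that two
# independent draws from p land on the same state, so H_2 = -log(collision prob).
p = np.array([0.5, 0.25, 0.15, 0.1])  # arbitrary illustrative distribution
rng = np.random.default_rng(0)
draws = rng.choice(4, size=(200_000, 2), p=p)
collision_prob = np.mean(draws[:, 0] == draws[:, 1])
print(renyi_entropy(p, 2), -np.log(collision_prob))  # these should roughly agree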