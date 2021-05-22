This article describes an optimization method for entropy coding that applies to a source of independent and identically distributed (i.i.d.) random variables. The algorithm can be explained with the following example. Consider a source of i.i.d. random variables X with a discrete uniform distribution over an alphabet of cardinality 10. With this source, we generate messages of length 1000, which will be encoded in base 10. We call XG the set, of cardinality 10^1000, containing all messages that the source can generate. According to Shannon’s first theorem (the source coding theorem), if the average entropy of X per symbol, calculated over the set XG and expressed in base 10, is H(X)≈0.9980, then the average length of the encoded messages will be 1000∙H(X)=998. Now we increase the message length by one and calculate the average entropy of the 10% of the sequences of length 1001 that have the lowest entropy. We call this set, also of cardinality 10^1000, XG10. The average entropy of X10, calculated over the set XG10, is H(X10)≈0.9964; consequently, the average length of the encoded messages will be 1001∙H(X10)≈997.4. The difference between the average lengths of the encoded sequences belonging to the two sets (XG and XG10) is 998−997.4=0.6. Therefore, using the XG10 set reduces the average length of the encoded message by 0.6 base-10 symbols. Consequently, the average information per symbol becomes (1001∙0.9964)/1000≈0.9974, which is less than the average entropy of X, H(X)≈0.998. We can use the XG10 set in place of the XG set because a one-to-one correspondence can be established between all the sequences that the source can generate, of which there are 10^1000, and the 10% of the sequences of length 1001 with the lowest entropy; indeed, 10^1001∙0.1=10^1000.
In this article, we will show that this transformation can be performed by applying random variations to the sequences generated by the source.
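The figures quoted in the example can be checked approximately with a small Monte Carlo experiment. The sketch below is not the method of this article. It only samples sequences from a uniform source over 10 symbols and averages their entropy. It assumes that "entropy" of a sequence means its empirical (plug-in) entropy in base 10, computed from the symbol frequencies of that sequence. The function names `empirical_entropy`, `sample_entropies` and `estimate` are hypothetical and chosen for illustration only.

```python
import math
import random
from collections import Counter


def empirical_entropy(seq, base=10):
    """Plug-in entropy of a sequence, from its symbol frequencies (assumed definition)."""
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log(c / n, base) for c in counts.values())


def sample_entropies(length, trials, alphabet=10, seed=0):
    """Empirical entropies of `trials` random sequences from a uniform source."""
    rng = random.Random(seed)
    return [
        empirical_entropy([rng.randrange(alphabet) for _ in range(length)], alphabet)
        for _ in range(trials)
    ]


def estimate(trials=2000, seed=0):
    """Return (average entropy at length 1000,
    average entropy of the lowest-entropy 10% at length 1001)."""
    all_1000 = sample_entropies(1000, trials, seed=seed)
    h_all = sum(all_1000) / len(all_1000)

    # Sort ascending so the lowest-entropy sequences come first, then keep 10%.
    all_1001 = sorted(sample_entropies(1001, trials, seed=seed + 1))
    lowest = all_1001[: trials // 10]
    h_low = sum(lowest) / len(lowest)
    return h_all, h_low


if __name__ == "__main__":
    h_all, h_low = estimate()
    print(f"average entropy, length 1000:            {h_all:.4f}")
    print(f"average entropy, lowest 10%, length 1001: {h_low:.4f}")
    print(f"average encoded length, XG:   {1000 * h_all:.1f}")
    print(f"average encoded length, XG10: {1001 * h_low:.1f}")
```

Because the estimate comes from a finite random sample, the printed values only approximate the averages over the full sets XG and XG10, and they vary slightly with the seed and the number of trials.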