Resumen
We describe an algorithm computing an optimal prefix free code for n unsorted positive weights in time within ??(??(1+lg??))???(??lg??)
O
(
n
(
1
+
lg
a
)
)
?
O
(
n
lg
n
)
, where the alternation ???[1..??-1]
a
?
[
1
.
.
n
-
1
]
approximates the minimal amount of sorting required by the computation. This asymptotical complexity is within a constant factor of the optimal in the algebraic decision tree computational model, in the worst case over all instances of size n and alternation ??
a
. Such results refine the state of the art complexity of T(??lg??)
T
(
n
lg
n
)
in the worst case over instances of size n in the same computational model, a landmark in compression and coding since 1952. Beside the new analysis technique, such improvement is obtained by combining a new algorithm, inspired by van Leeuwen?s algorithm to compute optimal prefix free codes from sorted weights (known since 1976), with a relatively minor extension of Karp et al.?s deferred data structure to partially sort a multiset accordingly to the queries performed on it (known since 1988). Preliminary experimental results on text compression by words show ??
a
to be polynomially smaller than n, which suggests improvements by at most a constant multiplicative factor in the running time for such applications.