Idea: Use a linked list to store elements but, every time an
element is sought, move it to the front of the list.
Intuition: Frequently-sought, or high-probability, elements will
tend to stay near the front of the list.
For example, searching for C in the list A → B → C → D moves C to the front, leaving C → A → B → D.
Searching in a Move-to-Front (MTF) list costs only a few pointer
assignments more than searching in a normal list.  It's also very
easy to implement.
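To make "easy to implement" concrete, here is a minimal sketch in Python (the class and method names are mine, not from the notes): a singly linked list whose search splices the found node to the front with a few pointer assignments.

```python
class Node:
    """A singly-linked list node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class MTFList:
    """A linked list that applies the move-to-front heuristic on every search."""
    def __init__(self, values):
        self.head = None
        for v in reversed(values):
            self.head = Node(v, self.head)

    def search(self, value):
        """Return True if value is present; if found, splice its node to the front."""
        prev, node = None, self.head
        while node is not None:
            if node.value == value:
                if prev is not None:        # only a few extra pointer assignments:
                    prev.next = node.next   # unlink the found node ...
                    node.next = self.head   # ... and splice it to the front
                    self.head = node
                return True
            prev, node = node, node.next
        return False

    def to_list(self):
        """Return the current ordering, front to back."""
        out, node = [], self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out
```

For instance, after `MTFList(['A', 'B', 'C', 'D']).search('C')`, the ordering becomes C, A, B, D.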
But is the move-to-front heuristic really useful?  We must somehow
measure the quality of this heuristic.  To do so, we'll
compare the expected search time in an MTF list with that in an optimal
list of the same elements.
Note that we can almost never construct the optimal list, since the
probability distribution of queries is almost always unknown.  So
we're hoping that the MTF heuristic gives an expected search time
close to that of the optimal list.
 
Given $n$ elements, name them by their position in the optimal list:
$x_1, x_2, \ldots, x_n$, with corresponding probabilities
$p_1, p_2, \ldots, p_n$, where $p_i = P(x_i)$.
Then (from the optimal list theorem) $p_1 \ge p_2 \ge \cdots \ge p_n$.
Let $\mathit{opt}$ be the expected search time in the optimal list of
these elements.
Let $\mathit{mtf}$ be the expected search time in the MTF list of the
same elements.
Clearly, $\mathit{opt} \le \mathit{mtf}$, since the optimal list has
minimum expected search time of all possible lists.
Theorem
 $\mathit{mtf} < 2\,\mathit{opt}$.
Discussion
The theorem says that the expected search time of the MTF list is
at most twice optimal.  This is a surprising result, since we don't
even know the probability distribution!
Consider, for example, these elements and their associated probabilities:
| x | A | B | C | D | 
| P( x ) | 0.1 | 0.1 | 0.8 | 0.0 | 
The optimal list is C, A, B, D, with expected search time
$\mathit{opt} = 1(0.8) + 2(0.1) + 3(0.1) + 4(0.0) = 1.3$.
The theorem states that $\mathit{mtf} < 2(1.3) = 2.6$.
Since we don't know $P(x)$, we might have built an arbitrary,
static list, like A → B → C → D.
Had we done so, we would have been stuck forever with an expected
search time of $1(0.1) + 2(0.1) + 3(0.8) + 4(0.0) = 2.7$.  Clearly, the MTF list is better, even
though it's not optimal.  When we don't know the probability
distribution, it's best to use an MTF list rather than risk choosing a
bad ordering in a static list.
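The arithmetic in this example is easy to check directly. This is a sketch, where `expected_search_time` is an illustrative helper name, using the probabilities from the table (optimal order C, A, B, D; an arbitrary static order A, B, C, D):

```python
def expected_search_time(probs):
    """Expected search time of a fixed list: sum over positions
    (counted from 1) of position * probability of the element there."""
    return sum(pos * p for pos, p in enumerate(probs, start=1))

# Probabilities in optimal (decreasing) order: C, A, B, D.
opt = expected_search_time([0.8, 0.1, 0.1, 0.0])     # 1(0.8)+2(0.1)+3(0.1)+4(0.0) = 1.3
# The same probabilities in an arbitrary static order: A, B, C, D.
static = expected_search_time([0.1, 0.1, 0.8, 0.0])  # 1(0.1)+2(0.1)+3(0.8)+4(0.0) = 2.7
```

The theorem's guarantee, 2(1.3) = 2.6, beats the static list's 2.7.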
 
Proof of Theorem
 
In the MTF list of these elements, let's perform enough searches so
that the ordering of the MTF list is unaffected by its initial
ordering.  Each element will have been sought at least once.
We'll take a ``snapshot'' of the list at this point and will
analyse the expected search time.
Let $\mathit{pos}(i)$ be the position of $x_i$ in this snapshot.  The expected
value of $\mathit{pos}(i)$ is (from our definition of expectation)
$E[\mathit{pos}(i)] = \sum_{k=1}^{n} k \cdot P(\mathit{pos}(i) = k)$.
For example, suppose we have
| position of $x_i$ | percentage of time | 
| 1 | 50 % | 
| 2 | 10 % | 
| 3 | 20 % | 
| 4 | 20 % | 
Then $E[\mathit{pos}(i)] = 1(0.5) + 2(0.1) + 3(0.2) + 4(0.2) = 2.1$,
and $x_i$ is, on average, in position 2.1.
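This expectation is a one-line computation; the variable names below are mine:

```python
# Fraction of time x_i spends at each position, from the table above.
position_dist = {1: 0.50, 2: 0.10, 3: 0.20, 4: 0.20}

# E[pos(i)] = sum over positions k of k * P(pos(i) = k).
expected_pos = sum(k * frac for k, frac in position_dist.items())
```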
 
If we knew $E[\mathit{pos}(i)]$, we would know $\mathit{mtf}$, since
$\mathit{mtf} = \sum_{i=1}^{n} p_i \cdot E[\mathit{pos}(i)]$.
So what is $E[\mathit{pos}(i)]$?
 
An element's position is one more than the number of elements preceding it:
$\mathit{pos}(i) = 1 + \sum_{j \ne i} \mathbf{1}[x_j \text{ precedes } x_i]$.
In the summation above, if $x_j$ precedes $x_i$, it adds
one to the number of preceding elements.  Taking expectations,
$E[\mathit{pos}(i)] = 1 + \sum_{j \ne i} P(x_j \text{ precedes } x_i)$.
So what is $P(x_j \text{ precedes } x_i)$?
In the snapshot, $x_j$ precedes $x_i$ exactly when the more recent of the
searches for these two elements was a search for $x_j$.  Generalising,
$P(x_j \text{ precedes } x_i) = \dfrac{p_j}{p_i + p_j}$.
In the example above, $P(A \text{ precedes } C) = \dfrac{0.1}{0.1 + 0.8} = \dfrac{1}{9}$.
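This key fact can be checked by simulation. The sketch below runs an MTF list on queries drawn from the example's distribution and measures how often A precedes C; the seed and step counts are arbitrary choices of mine:

```python
import random

random.seed(1)
elems = ['A', 'B', 'C', 'D']
weights = [0.1, 0.1, 0.8, 0.0]   # P(A), P(B), P(C), P(D) from the example

lst = list(elems)                # current MTF ordering, front first
hits = 0
steps, warmup = 200_000, 1_000
for t in range(steps):
    x = random.choices(elems, weights=weights)[0]   # draw one query
    lst.remove(x)
    lst.insert(0, x)             # move the sought element to the front
    if t >= warmup:              # skip the initial transient
        hits += lst.index('A') < lst.index('C')

empirical = hits / (steps - warmup)
predicted = 0.1 / (0.1 + 0.8)    # p_A / (p_A + p_C) = 1/9
```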
Thus
$\mathit{mtf} = \sum_{i=1}^{n} p_i \Bigl( 1 + \sum_{j \ne i} \dfrac{p_j}{p_i + p_j} \Bigr)
= 1 + \sum_{i=1}^{n} \sum_{j \ne i} \dfrac{p_i \, p_j}{p_i + p_j}$.
Note the symmetry in $i$ and $j$!
 
Therefore $\mathit{mtf} = 1 + 2 \sum_{i < j} \dfrac{p_i \, p_j}{p_i + p_j}$.
Note that $\dfrac{p_i}{p_i + p_j} \le 1$, so $\dfrac{p_i \, p_j}{p_i + p_j} \le p_j$.
Using this fact:
$\mathit{mtf} \le 1 + 2 \sum_{j=1}^{n} (j-1)\,p_j
= 1 + 2 \sum_{j=1}^{n} j\,p_j - 2 \sum_{j=1}^{n} p_j
= 1 + 2\,\mathit{opt} - 2 = 2\,\mathit{opt} - 1$,
since $x_j$ sits at position $j$ in the optimal list, so $\mathit{opt} = \sum_j j\,p_j$, and $\sum_j p_j = 1$.
That's it!  We've proven that $\mathit{mtf} \le 2\,\mathit{opt} - 1 < 2\,\mathit{opt}$.
Note that we had to get the double summation over all $i \ne j$ in
order to exploit the symmetry.
The key was to produce the $\dfrac{p_i \, p_j}{p_i + p_j}$ term.
It's great that we can prove that the MTF heuristic is
actually useful.