CS 840 Assignment 1
Due
Instructor:
1.
Consider
the idea of using a hash function in the van Emde
Boas “priority queue” to reduce the space requirements. Assume that hashing a
key (or any other value) to a location takes O(1) time
and space for the hash table is
proportional to the size of the data entries and keys,. Take the scheme for
hashing as a “black box”. How can you use such hashing to implement the van Emde Boas method efficiently?
a.
Outline
your scheme, perhaps by showing how it differs from the standard version.
b.
Assuming
the hashing works in constant time for an insert,
delete or search, how much time is taken for each of the operations: insert,
delete and successor.
c.
What is
the space requirement of your method? Assume you have a universe of u elements
and a word size of lg u bits. n
elements are currently stored in the data structure. Clearly this should depend
on u as well as n.
2.
We have
discussed, in great detail, the Move to Front heuristic for improving linear
search. This question deals with the very simple notion of counting accesses to
elements and maintaining the list in order by Frequency Count (FC).
a.
Outline
a method for keeping frequency counts and maintaining the elements of the list
in decreasing order by their counts. This method should require only O(1) time to update a count and to restore and to move such
an updated element to its appropriate place in the list.
b.
If
elements are drawn from a fixed (and independent) distribution, it can be shown
that the expected behaviour of FC is close to that of
the optimal. You don’t have to prove this. We will concentrate on the amortized
(worst case) behaviour, and compare FC with (i) the static optimal (Sopt)
which keeps elements in decreasing order of frequency of the entire sequence
and (ii) the dynamic offline optimal.
i.
Prove
that the amortized case behaviour of the FC heuristic
is within a constant factor of that of Sopt. (As in
class, assume FC starts with an empty list and adds an element to the end of
the list when it is first accessed)
ii.
Consider
the model of Sleator and Tarjan
in which a list may be rearranged by moving the requested element toward the
front of the list at no charge, or swapping two adjacent elements at a cost of
1. Prove that under this model, FC is not necessarily within a constant factor
of the off line optimal on an amortized basis.
3.
We
briefly discussed an approximation scheme for the optimal binary search tree.
The root of a tree on a given range is node xi, such that the
maximum of the probability of a search in the range either before xi or after xi is minimized (break ties
arbitrarily). This question deals with the time and space required to compute
this tree.
a.
Sketch
an efficient algorithm for performing the task. Your method should take O(n lg n) time. (Hint: computing
the cumulative probabilities Pi=
SUMj<=i
pi +qi in a preprocessing phase
will help a lot)
b.
Improve
the method to take linear time. (OK … if
you just do part b) you really don’t have to do a)) (Hint: the amount you are
willing to spend on finding the root in range will have to depend on where the
root actually is .. be willing to pay more for an “even split” than for a
really skewed one))
c.
An
easier version of b): Rather than recursively finding an element in the
“middle” of each subrange we take a more global
approach. The root of the entire tree is the one that straddles the cumulative
.5 mark (or comes as close as possible) as before. The children of the root are
now at the .25 and .75 marks respectively, and their
children at .125, .375, .625 and .875. Children at lower levels are assigned in
a similar manner. How can you determine this tree in linear time?