- Unary encoding, which is the encoding of the number
*n*by the string containing*n*ones, satisfies that numeric sorting and lexicographic sorting coincide for it. If we concatenate a "0" suffix to all strings under this encoding, we add the further property that no two of the resulting strings are prefixes of each other. - If
*E*(*n*) is an encoding scheme for*n*where numeric and lexicographic ordering coincide, and if the encoding of no two numbers under*E*are prefixes of each other, then*E*(len(str(*n*)))+str(*n*) is another such encoding. (where "str(*n*)" is the standard decimal representation of*n*, "len" is the digit length of a string and "+" is string concatenation).

def encode(n): return "1"*len(str(n))+"0"+str(n) def decode(s): return int(s[s.index("0")+1:])

However, these functions still don't provide an adequate solution, because
the "length" component, encoded in unary, is of order O(log(*n*)), meaning
that *R*(*n*) is O(log(*n*)), which is still too large. To
solve the problem completely, we will re-utilize the second encoding idea,
this time using the functions above as *E*(*n*):

def encode(n): return "1"*len(str(len(str(n))))+"0"+str(len(str(n)))+str(n) def decode(s): return int(s[2*s.index("0")+1:])

This reduces *R*(*n*) to O(log(log(*n*))), and solves the
problem.

For the bonus part, we want to prove that this complexity is the theoretical minimum. What follows is a proof that shows this for binary digit strings, but it is straightforward to extend it to any base, including to decimal representation.

Let us
consider how many integers we can encode before we require one where
*R*(*n*)>*c*, for some integer *c*.

Suppose all the numbers up to a certain digit length *l* can be encoded
with *R*(*n*)≤*c*. Consider an arbitrary *k*≤*l*.

There are 2^{k-1} numbers of length exactly *k* in standard
binary representation, all of which
must be represented by strings of length between 0 and
*k*+*c*. Consider the prefixes of length *c*+2 of these
strings (where strings of length less than *c*+2 are their own prefixes).
There are at most 2^{k-1}-1 such strings that share
any given prefix, for which reason at least two distinct prefixes must be
used for the encoding of the numbers of length *k*.

Of these prefixes, none but the lexicographically largest can be
the prefix of the encoding of any larger number. Therefore, each length
*k* can be said to "exhaust" at least one prefix. There are only
2^{c+3}-1 such prefixes, so after the first
2^{c+3}-1 lengths they will all be exhausted. *l* must be
smaller than 2^{c+3}. The first *n* that requires
*R*(*n*)>*c* is less than 2^{2^(c+3)}.
Changing the
formula around, for any *n* there is at least one *m*≤*n*,
such that *R*(*m*)≥log_{2}(log_{2}(*n*))-3,
which is equal to log_{2}(log_{256}(*n*)), and therefore
*R*(*n*)≥O(log(log(*n*)).

Q.E.D.