This answer is similar in spirit to Douglas Leeder’s, with the following changes:
- It doesn’t use actual Base64, so there’s no padding characters
-
Instead of converting the number first to a byte-string (base 256), it converts it directly to base 64, which has the advantage of letting you represent negative numbers using a sign character.
import string ALPHABET = string.ascii_uppercase + string.ascii_lowercase + \ string.digits + '-_' ALPHABET_REVERSE = dict((c, i) for (i, c) in enumerate(ALPHABET)) BASE = len(ALPHABET) SIGN_CHARACTER = '$' def num_encode(n): if n < 0: return SIGN_CHARACTER + num_encode(-n) s = [] while True: n, r = divmod(n, BASE) s.append(ALPHABET[r]) if n == 0: break return ''.join(reversed(s)) def num_decode(s): if s[0] == SIGN_CHARACTER: return -num_decode(s[1:]) n = 0 for c in s: n = n * BASE + ALPHABET_REVERSE[c] return n
>>> num_encode(0)
'A'
>>> num_encode(64)
'BA'
>>> num_encode(-(64**5-1))
'$_____'
A few side notes:
- You could (marginally) increase the human-readibility of the base-64 numbers by putting string.digits first in the alphabet (and making the sign character ‘-‘); I chose the order that I did based on Python’s urlsafe_b64encode.
- If you’re encoding a lot of negative numbers, you could increase the efficiency by using a sign bit or one’s/two’s complement instead of a sign character.
- You should be able to easily adapt this code to different bases by changing the alphabet, either to restrict it to only alphanumeric characters or to add additional “URL-safe” characters.
- I would recommend against using a representation other than base 10 in URIs in most cases—it adds complexity and makes debugging harder without significant savings compared to the overhead of HTTP—unless you’re going for something TinyURL-esque.