Creating a new short URL service
This website features the ability to receive notifications over SMS. If you enable it in account settings, you’ll get updates like these:
SMS has space constraints and short URLs are a necessity. Short URL services became popular about a decade ago for use on Twitter, which then had a 140 character limit to fit into SMS’s 160 char limit (20 chars reserved for username and other metadata). Twitter later introduced its own t.co service, forcing others to either pivot to offering marketing analytics, or fall into disuse.
When we came looking for short URL services in 2020, we found we had two options:
- Overpriced commercial services. Bitly has a minimum plan of US $29 for 1500 branded links, which works out to ₹1.43 per link, far more than the cost of the SMS itself. Others are similarly priced.
- Free services with no guarantees. We tried a few across some months before giving up.
So we built our own. In the process, we learnt that we had two distinct requirements.
A second type of link
A typical message links to public content with a shareable link, and sharing is good.
However, commercial SMSes in India come from alphanumeric sender IDs that you can’t reply to. There is no way to communicate with the sender to change your preferences. It felt right to include a preferences link in every message. This type of link has some characteristics that set it apart from a regular link:
- This link is private. It should not be shared, and it should be obvious that it’s not meant to be shared.
- It should expire, to reduce the risk from accidental sharing.
- Some SMS apps have link previews. This preview should not cause an action. It should require your explicit confirmation.
- The link should be sufficiently random as to be unguessable. You should not be able to set preferences for other people by guessing their links.
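The expiry and confirmation requirements can be sketched as a small decision function. This is an illustration and not the service's actual handler: the function name, the returned labels and the one-week validity constant are all assumptions made for the example. It relies on link previews fetching a page with a GET request, so GET never changes anything.

```python
VALIDITY_SECONDS = 7 * 24 * 3600  # assumed one-week validity window


def handle_preferences_request(method, link, now):
    """Decide how to respond to a request for a private link.

    `link` is the stored record (a dict with a "timestamp" key),
    or None if the link is unknown. `now` is the current Unix time.
    """
    if link is None or now - link["timestamp"] > VALIDITY_SECONDS:
        return "expired"             # unknown or expired: show an error page
    if method == "GET":
        return "show_confirmation"   # previews land here, nothing changes
    if method == "POST":
        return "apply_change"        # only an explicit form submit acts
    return "method_not_allowed"
```

A preview fetch and a real visit both receive the confirmation page. Only submitting the form on that page results in a change.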
It helps if this type of link has a distinct identity. We arrived at these two domains:
- has.gy -- the main domain for short links
- bye.li -- the “bye link” for unsubscribing or changing preferences.
Making them short
A URL constructed with either of these domains looks like this: https://bye.li/ABCD. This is 19 characters long. With two links in every message, that’s 38 characters taken, leaving a little over 100 for text content. While joined SMSes can have up to 2000 characters, current telecom regulation in India allows a usable space of only about 150 characters, a topic for another update. Can we cut the size any further? Maybe:
- Switch from https:// to http://, saving a char. Combined with HSTS preloading, recipients will still get the benefit of secure links.
- Drop the https:// prefix entirely (8 chars!), but this is risky. Google Messages on Android automatically recognizes links with both domains, but iOS 14 Messages only recognizes bye.li links. has.gy still needs the prefix. Other SMS apps on Android may fail to recognize both domains.
- Get a shorter domain, like Twitter’s t.co, except 💸
- Reduce the number of characters in the “back half”, the ABCD in the above example.
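The savings from the prefix options can be tallied with a short script. This is only a sketch: the helper name link_length is made up for the example, and the 150-character budget is the approximate figure mentioned above, not an exact limit.

```python
# Illustrative helper (not from the actual service): count the characters
# a short link occupies for a given scheme prefix, domain and back half.
def link_length(prefix: str, domain: str, back_half: str) -> int:
    return len(prefix) + len(domain) + len("/") + len(back_half)


USABLE_SMS_CHARS = 150  # approximate usable space; an assumption for this sketch

options = {
    "https:// prefix": "https://",
    "http:// prefix": "http://",
    "no prefix": "",
}

for label, prefix in options.items():
    one_link = link_length(prefix, "bye.li", "ABCD")
    # Every message carries two links: one public, one for preferences.
    remaining = USABLE_SMS_CHARS - 2 * one_link
    print(f"{label}: {one_link} chars per link, {remaining} left for text")
```

Both domains are six characters long, so the same counts apply to has.gy.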
Choosing characters in a URL
A URL can contain most of the characters you see on your keyboard, but not all. Some – like the question mark ? and ampersand & – have special meanings. Others – like round brackets () – are allowed, but some apps will get confused if they appear at the end of a URL. The ecosystem has arrived at a smaller subset of characters that are considered completely safe: alphabets A-Z in upper and lower case, digits 0-9, and the hyphen - and underscore _. This is a total of 64 characters, and formally defined as the URL-safe Base64 alphabet.
The length of the “back half” will then determine how many unique combinations are possible – and therefore unique links:
| Length | Formula | Count |
|--------|---------|-------|
| 1 | 64^1 | 64 |
| 2 | 64^2 | 4096 |
| 3 | 64^3 | 262,144 |
| 4 | 64^4 | 16,777,216 |
| 5 | 64^5 | 1,073,741,824 |
As you can see, this grows at an exponential rate. The links we’re generating are meant to expire. If we use 4 characters and generate a few thousand links every week -- with an expiry window of one week -- it’ll be very hard for you to stumble on someone else’s preferences link by guessing it. Should our volumes grow to hundreds of thousands every week, adding just one more character will blow up the available space to a billion.
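These counts, and the odds of guessing a live link, can be reproduced in a few lines of Python. The figure of 5,000 live links is an assumed stand-in for “a few thousand links every week”, and the function names are made up for this sketch.

```python
import string

# The 64 characters of the URL-safe Base64 alphabet.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"
assert len(ALPHABET) == 64


def combinations(length: int) -> int:
    """Number of distinct back halves of the given length."""
    return len(ALPHABET) ** length


def guess_probability(live_links: int, length: int) -> float:
    """Chance that a single random guess lands on some live link."""
    return live_links / combinations(length)


for n in range(1, 6):
    print(n, f"{combinations(n):,}")

# Assumed volume: 5,000 unexpired links at any given time.
print(f"{guess_probability(5_000, 4):.4%}")
```

Under that assumption a single guess at a 4-character back half hits a live link about 0.03% of the time, and a fifth character divides that chance by 64.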
The actual process of generating and handling a link is fairly straightforward. We make an associative array (dict in Python) containing the recipient, the notification type and event, and a timestamp, and post it to a Redis cache with a TTL of 14 days. The actual validity period is only 7 days in the current implementation, but the longer TTL ensures the same link isn’t reassigned to someone else. Link back halves are generated using a random number generator, which typically gives them wide distribution across the available space. However, as with any true random generator, numbers get reused every once in a while and must be checked against the used pool.
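A minimal sketch of this flow follows, under several assumptions. The names create_link, new_back_half, is_valid and FakeCache are hypothetical, and the payload field names are guesses at what such a dict might hold. FakeCache is an in-memory stand-in so the example runs without a server; it mimics the set(name, value, ex=..., nx=True) call of the redis-py client, which a real deployment could pass in instead. The sketch also uses Python's secrets module as its random source, which is a choice made here and not something the description above specifies.

```python
import json
import secrets
import time

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_"
)
TTL_SECONDS = 14 * 24 * 3600       # how long the back half stays reserved
VALIDITY_SECONDS = 7 * 24 * 3600   # how long the link is honoured


class FakeCache:
    """In-memory stand-in for a Redis client, for illustration only.

    Mimics set(..., ex=..., nx=True) and get(); expiry is not enforced.
    """

    def __init__(self):
        self._data = {}

    def set(self, name, value, ex=None, nx=False):
        if nx and name in self._data:
            return None  # key already taken, as with Redis SET ... NX
        self._data[name] = value
        return True

    def get(self, name):
        return self._data.get(name)


def new_back_half(length: int = 4) -> str:
    """Pick `length` random characters from the URL-safe alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_link(cache, recipient, notification_type, event, max_tries=10):
    payload = json.dumps({
        "recipient": recipient,
        "notification_type": notification_type,
        "event": event,
        "timestamp": time.time(),
    })
    for _ in range(max_tries):
        back_half = new_back_half()
        # nx=True combines the "already used?" check and the write in one step.
        if cache.set(back_half, payload, ex=TTL_SECONDS, nx=True):
            return "https://bye.li/" + back_half
    raise RuntimeError("could not find an unused back half")


def is_valid(payload, now: float) -> bool:
    """True while the link is within its validity period."""
    return now - json.loads(payload)["timestamp"] <= VALIDITY_SECONDS
```

Calling create_link(FakeCache(), "user-1", "update", "event-1") returns a 19-character URL. Because the key outlives the validity period by a week, a back half that has stopped working still cannot be handed to someone else until the TTL runs out.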
In the next update we’ll discuss how we approached non-expiring short links. Got a question or a comment? Head over to the Comments section and post. We don’t have comments directly on updates yet, but maybe we should.