Build our own Discord
Oh hello again! Nice to see you come by!
So far, we’ve gone through a lot of stuff. But at the same time, we still have a lot to learn!
In the first chapters, I mentioned that working with stateful requests is not scalable. But sometimes, you don’t have a good choice.
Now, I still stand by that point - stateful architecture is weird. But how would you create a chat service without one? How would we be able to receive info from the backend if the backend can’t push anything to us?
Now, bear in mind this is different from notifications - those use different tools than our own app. SMS, email, all of these are something different than simply receiving messages from the server.
So, how would we create a chat system?
We have quite a bunch of different chat apps. Furthermore, we have a bunch of different types of chats! Consider the following:
So, to know how to approach something, what are the requirements for this app? Let’s assume the following:
So, those are our requirements.
So, what are the basics for a high level design?
Well, basically:
Now, in here, we’ll explore another network protocol. So far, we’ve been working with HTTP.
However, if we read the definition more closely:
A typical flow over HTTP involves a client machine making a request to a server, which then sends a response message
Now, that’s important! The client makes a request to the server! We don’t want that with a chat application.
Or we might! If we wanted that, we could go with polling.
Now, those are very vague descriptions. Let me quote the Stack Overflow post:
Short polling:
00:00:00 C-> Is the cake ready?
00:00:01 S-> No, wait.
00:00:01 C-> Is the cake ready?
00:00:02 S-> No, wait.
00:00:02 C-> Is the cake ready?
00:00:03 S-> Yes. Have some lad.
00:00:03 C-> Is the other cake ready? ..
Long polling:
00:00:00 C-> Is the cake ready?
00:00:03 S-> Yes. Have some lad.
00:00:03 C-> Is the cake ready?
Specifically, note the timestamps.
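To make the difference in request counts concrete, here is a minimal sketch that replays both styles against a simulated clock. It assumes replies are instantaneous and that the cake is ready at second 3; the function names are made up for illustration.

```python
def short_poll(ready_at, interval=1):
    """Ask every `interval` seconds; each question is a separate request."""
    now, requests = 0, 0
    while True:
        requests += 1                 # C-> Is the cake ready?
        if now >= ready_at:           # S-> Yes.
            return now, requests
        now += interval               # S-> No, wait. (the client retries later)


def long_poll(ready_at):
    """Ask once; the server holds the request open until it has an answer."""
    requests = 1                      # C-> Is the cake ready?
    now = ready_at                    # the server replies only once the cake is ready
    return now, requests


print(short_poll(ready_at=3))  # (3, 4)
print(long_poll(ready_at=3))   # (3, 1)
```

Both clients learn about the cake at the same time, but the short-polling client needed four requests to get there, while the long-polling one needed a single request that stayed open.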
But, let’s say we don’t want to use polling. Well, with HTTP, we have no other reasonable way to create a chat. In come WebSockets.
Now, with WebSockets, or ws for short, only the initial call is HTTP. All the other ws calls are actually messages sent over a persistent connection, so the server can push data to the client without being asked. SignalR and Socket.IO are popular libraries that build on them.
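That initial HTTP call is an upgrade handshake. As a small sketch using only the standard library, this is how a server could compute the `Sec-WebSocket-Accept` header it has to send back, following RFC 6455. The `handshake_response` helper is just an illustration and leaves out everything a real server would also validate.

```python
import base64
import hashlib

# Fixed GUID defined by the WebSocket protocol (RFC 6455).
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Value the server returns in the Sec-WebSocket-Accept header."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key: str) -> str:
    """The single HTTP response that opens a WebSocket connection."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key taken from RFC 6455.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

After this one response, both sides stop speaking HTTP and exchange WebSocket frames over the same TCP connection.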
So, we can finally consider our system:
Now, since there is no heavy lifting done on the regular servers, we COULD fit them all into one server. However, as was mentioned multiple times, a single server means a single point of failure. Therefore, as always, we’ll put them behind a load balancer.
Now, finally, it could look something like this:
All the data will be stored in key-value stores, as they are a very solid choice for massive amounts of data.
Let’s stick with storage a little longer though. We’ve mentioned a lot of daily active users. Furthermore, we want the messages to be stored forever. Now, that’s a long time, isn’t it?
Well, let’s look at statistics. Messenger and WhatsApp process 60 billion messages per day. That’s A LOT. Imagine that you’re saving every character as 2 bytes (which is what UTF-16 uses for most characters).
Now, of course, we’ve all used messages. We can easily send twenty 3-char messages per minute. Each one is tiny, but added up, it’s a lot of space.
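To get a feeling for the numbers, here is a rough back-of-the-envelope sketch. The message count and the 2 bytes per character come from the text above; the average message length of 100 characters is purely my assumption, so treat the result as an order of magnitude, not a measurement.

```python
MESSAGES_PER_DAY = 60_000_000_000   # figure quoted above
BYTES_PER_CHAR = 2                  # figure quoted above
AVG_CHARS_PER_MESSAGE = 100         # assumption, not a measured value


def content_bytes_per_day(messages=MESSAGES_PER_DAY,
                          chars=AVG_CHARS_PER_MESSAGE,
                          bytes_per_char=BYTES_PER_CHAR):
    """Raw bytes of message text per day, ignoring IDs, timestamps and replication."""
    return messages * chars * bytes_per_char


per_day = content_bytes_per_day()
per_year = per_day * 365
print(f"{per_day / 1e12:.0f} TB per day")    # 12 TB per day
print(f"{per_year / 1e15:.2f} PB per year")  # 4.38 PB per year
```

And that is only the message text itself. Metadata, indexes and replicas would push the number higher.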
So, all in all, we have to account for A LOT of storage space. How would we do that?
Well, again, we need to know how the data will be handled. Looking at how I recently worked with chats:
Because of that, key-value stores are a really solid choice!
So, finally, the data model could look something like:
{
message_id: "number";
message_from: "number";
message_to: "number";
content: "string";
created_at: "timestamp";
}
Now, for message_id, we’d probably reuse what we built when designing a unique ID generator. However, there can be quite a lot of messages created at one time with chats. Therefore, we’ll have an ID that also works as a sequence number.
With group messages, it’d be pretty much the same. The main difference here would be message_to being replaced by a channel ID. It would look something like:
{
message_id: "number";
user_id: "number";
channel_id: "number";
content: "string";
created_at: "timestamp";
}
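As a sketch, the two shapes above could be written as typed records. The field names are taken from the models above; the `storage_key` helper and its key layout are hypothetical, just one way to group messages by conversation in a key-value store.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DirectMessage:
    message_id: int
    message_from: int   # sender's user ID
    message_to: int     # recipient's user ID
    content: str
    created_at: datetime


@dataclass(frozen=True)
class GroupMessage:
    message_id: int
    user_id: int        # sender's user ID
    channel_id: int     # channel the message was posted to
    content: str
    created_at: datetime


def storage_key(msg) -> tuple:
    """Hypothetical key layout: conversation first, then message_id for ordering."""
    if isinstance(msg, GroupMessage):
        return ("channel", msg.channel_id, msg.message_id)
    # Sort the two user IDs so A->B and B->A land in the same conversation.
    a, b = sorted((msg.message_from, msg.message_to))
    return ("dm", a, b, msg.message_id)


now = datetime.now(timezone.utc)
dm = DirectMessage(1, message_from=7, message_to=3, content="hi", created_at=now)
gm = GroupMessage(1, user_id=7, channel_id=10, content="hi all", created_at=now)
print(storage_key(dm))  # ('dm', 3, 7, 1)
print(storage_key(gm))  # ('channel', 10, 1)
```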
Finally, what would the message ID look like?
I’ve mentioned multiple times that autoIncrement is insufficient because of scaling. However, this is a very specific case where it actually may be good enough! Kind of. Key-value stores don’t have autoIncrement.
So, we’d use local unique ID generation, which could work very similarly - we’d just have to generate it ourselves. Now, why would it work here?
Because users are already part of some channels. So the IDs need to be unique only within the context of the data. If you have a message between user A and user B, then if the id clashes with a message between user B and user C, it’s not an issue. You can still identify it!
Nonetheless, it is still possible to create our own unique distributed ID for this as well.
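To illustrate the local variant, here is a minimal sketch of a counter kept per conversation. It is an in-memory toy with made-up names; in a real system the counter would have to live somewhere shared and be incremented atomically.

```python
from collections import defaultdict


class LocalIdGenerator:
    """Hands out 1, 2, 3, ... separately for every conversation."""

    def __init__(self):
        self._last = defaultdict(int)   # conversation key -> last ID handed out

    def next_id(self, conversation):
        self._last[conversation] += 1
        return self._last[conversation]


ids = LocalIdGenerator()
print(ids.next_id(("A", "B")))  # 1
print(ids.next_id(("A", "B")))  # 2
print(ids.next_id(("B", "C")))  # 1
```

The last ID clashes with the first one, but since it belongs to a different conversation, the pair (conversation, ID) is still unique. As a bonus, the IDs are ordered within each conversation, so they double as sequence numbers.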
So, we’ve established how our storage works, what’s being used, how chats work. Now, one of the things I mentioned is that we are using WebSockets to connect to the chat server. However, WHICH chat server? Remember, there will probably be a lot of them.
So, how would we connect 2 users to a specific server? Well, we’ll have to use what’s called Service Discovery.
Apache ZooKeeper is a popular open-source tool that we can use. From its description:
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
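To show the idea without pulling in ZooKeeper itself, here is a toy in-memory registry. It is not ZooKeeper’s API, and picking the server with the fewest connections is only my assumption about a sensible criterion; location or capacity would work just as well.

```python
class ServiceRegistry:
    """Toy stand-in for a service discovery component (not ZooKeeper's API)."""

    def __init__(self):
        self._connections = {}          # server name -> open connection count

    def register(self, server):
        self._connections.setdefault(server, 0)

    def deregister(self, server):
        self._connections.pop(server, None)

    def pick_server(self):
        """Return the registered server with the fewest connections."""
        if not self._connections:
            raise LookupError("no chat servers registered")
        server = min(self._connections, key=self._connections.get)
        self._connections[server] += 1
        return server


registry = ServiceRegistry()
registry.register("chat-1")
registry.register("chat-2")
print(registry.pick_server())  # chat-1
print(registry.pick_server())  # chat-2
print(registry.pick_server())  # chat-1
```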
The flow would then look like:
So, provided that the user is already on the chat server, the following happens:
Now, the potential problem here is synchronization between multiple devices. Imagine you’re on Facebook and you’re logged in on your desktop and mobile simultaneously.
To achieve some synchronization here, we might keep track of the latest message ID. How’d that work?
Well, each device would keep something like:
user_id
last_message_id
If last_message_id is lower than the ID of the newest message in the KV store, then new messages are fetched.
Group chats are a little more complicated. We can view them as direct messages, except the message is sent to multiple users. So how would that work?
Well, first, we need to understand that when we send a message to a group, we basically send it to multiple users. So, we could take the design from before.
Consider the previous design. What do we need to add a user? Well, we’ll just add another link from the message queue to another chat server.
We could do that, and that would work. However, if we want to be a little more scalable, we could add more message queues. We could have a message queue for each user connected to the chat server.
That’d be beneficial for two reasons:
So, the final flow could be like:
sending messages
receiving messages
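The idea of one queue per user can be sketched roughly like this. Everything here lives in memory and the class and method names are made up; a real setup would put an actual message queue between the chat servers.

```python
from collections import defaultdict, deque


class GroupChat:
    """Each user has an inbox queue; a group message is copied into each one."""

    def __init__(self):
        self.members = defaultdict(set)     # channel_id -> user IDs
        self.inboxes = defaultdict(deque)   # user_id -> pending messages

    def join(self, channel_id, user_id):
        self.members[channel_id].add(user_id)

    def send(self, channel_id, sender_id, content):
        for user_id in self.members[channel_id]:
            if user_id != sender_id:        # the sender already has the message
                self.inboxes[user_id].append((channel_id, sender_id, content))

    def receive(self, user_id):
        """Drain and return everything waiting in the user's inbox."""
        inbox = self.inboxes[user_id]
        messages = list(inbox)
        inbox.clear()
        return messages


chat = GroupChat()
for user in (1, 2, 3):
    chat.join(channel_id=10, user_id=user)
chat.send(channel_id=10, sender_id=1, content="hello")
print(chat.receive(2))  # [(10, 1, 'hello')]
print(chat.receive(1))  # []
```

Sending does the fan-out work once, so receiving is cheap: every user only ever reads their own queue.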
Now, the last problem we need to deal with is online presence. An indicator is often shown - a green dot if a user is online, for example.
How would that work? Well, let’s first think about how we could do that.
So, that’s a lot of requests. To not slow down our regular servers, let’s have some Presence services.
Finally, users would be subscribed to message queues on these servers, depending on whether those are group chats or not.
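Here is one way a presence service could look, as a rough sketch. The heartbeat mechanism and the 30-second timeout are my assumptions for illustration, and the `notifications` list stands in for the message queues that subscribed users would read from.

```python
from collections import defaultdict


class PresenceService:
    TIMEOUT = 30  # seconds without a heartbeat before a user counts as offline (assumed)

    def __init__(self):
        self.last_seen = {}                    # user_id -> time of last heartbeat
        self.subscribers = defaultdict(set)    # user_id -> users watching them
        self.notifications = []                # (watcher, user, status) events

    def subscribe(self, watcher, user_id):
        self.subscribers[user_id].add(watcher)

    def is_online(self, user_id, now):
        seen = self.last_seen.get(user_id)
        return seen is not None and now - seen <= self.TIMEOUT

    def heartbeat(self, user_id, now):
        was_online = self.is_online(user_id, now)
        self.last_seen[user_id] = now
        if not was_online:                     # offline -> online transition
            for watcher in sorted(self.subscribers[user_id]):
                self.notifications.append((watcher, user_id, "online"))


presence = PresenceService()
presence.subscribe(watcher=2, user_id=1)
presence.heartbeat(user_id=1, now=0)
print(presence.notifications)         # [(2, 1, 'online')]
print(presence.is_online(1, now=20))  # True
print(presence.is_online(1, now=45))  # False
```

Only status changes are published to watchers, so a steady stream of heartbeats from an online user does not turn into a steady stream of notifications.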
So, here we are again! At the end of the road for chat service. Let’s look at what we’ve learned and how we’ve designed it. Going back:
So, how did we do?
So, we are missing one thing: end-to-end encryption. So let’s have a quick look at that!
End-to-end encryption basically means that we, as a provider, can’t read the messages. So, what would happen is we basically need to encrypt the data:
Now, with group chats, that’s a little more complicated. We have a lot of options.
One we could use is just PGP. The gist of it is:
And that’s it! We’ve created a chat system!