
Learning from Discord’s Approach — Request Coalescing with Golang

By Mohammad Hoseini Rad, ITNEXT

As you might have seen previously, Discord published a valuable article last year discussing how they successfully managed to store trillions of messages. While there are numerous YouTube videos and articles about that post, I think one section of it, titled “Data Services Serving Data,” didn’t receive enough attention. In this article, we discuss Discord’s approach to data services and explore how we can leverage Golang’s concurrency features to reduce database load in certain scenarios.

As you know, messaging and channels are the most used components of Discord. Let’s imagine a scenario where an admin of a channel with 500k members mentions @everyone. What would happen? Thousands of simultaneous requests are directed to the same database partition, all aiming to retrieve the same message. This pattern repeats until the partition can no longer respond to other requests.

Discord introduced an intermediary service that sits between the Python API and the database cluster, which they call a data service. This service contains roughly one gRPC endpoint per query and no business logic. Its most important feature for Discord is request coalescing.

As discussed above, numerous identical requests hit the same database partition whenever there is a mention in a huge channel. With request coalescing, if multiple users request the same database row at the same time, we can merge those requests into a single SELECT query and run that instead.

Placing a data service in front of the database, instead of connecting to it directly, also lets us implement other useful features, such as bulk queries, that can significantly reduce database overhead and improve the mean and especially the 99th-percentile latency of queries.

Like numerous other companies, Discord uses Python as its primary backend language.
Whether a microservice or a monolith, backend services are usually connected directly to a data source for making queries. While Python is indeed a versatile language, it falls short in concurrency. Implementing concurrent, high-throughput services in Python can be somewhat challenging, and performance tends to be lower than that of similar services written in compiled languages such as C++, Rust, and Golang.

Before doing anything, let’s simulate the situation described above. Let’s imagine the service receives a total of 5k requests with a concurrency of 1k.

I built a simple database model for the Message table with GORM and then filled the table with 100 dummy messages.

I made a simple endpoint that simulates a SELECT query for a random id between 0 and 100. Now we can benchmark this endpoint to simulate what would happen in this scenario. If we had a 10-second timeout policy, around 2% of the requests would not get a response.

Now let’s change the code. Go has a package called singleflight. It lives in golang.org/x/sync/singleflight, one of the Go project’s supplementary modules, rather than in the standard library. This package provides a duplicate function call suppression mechanism. In general, you give it a key and a function, and instead of running that function multiple times, singleflight holds the other calls until the first call has completed and then hands them the same result.

func (g *Group) Do(key string, fn func() (interface{}, error)) (v interface{}, err error, shared bool)

Do executes and returns the results of the given function, making sure that only one execution is in-flight for a given key at a time. If a duplicate comes in, the duplicate caller waits for the original to complete and receives the same results.
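
To make the mechanism behind Do concrete, here is a minimal sketch that uses only the standard library. It is a simplified stand-in for singleflight.Group, not the real package's implementation: the Group type below returns a string instead of interface{}, omits panic handling, and the key "message:42", the waiting helper, and the runBurst function are hypothetical names invented for this illustration. The fake query is held open until every duplicate caller has joined, so the effect of coalescing is easy to observe.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// call tracks one in-flight execution and the result shared with duplicates.
type call struct {
	done chan struct{} // closed once fn has returned
	val  string
	err  error
	dups int // duplicate callers that joined this call (guarded by Group.mu)
}

// Group coalesces concurrent calls that share a key.
// Simplified stand-in for singleflight.Group (assumption: no panic handling).
type Group struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fn at most once per key at a time; duplicates wait and reuse the result.
func (g *Group) Do(key string, fn func() (string, error)) (string, error, bool) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		c.dups++
		g.mu.Unlock()
		<-c.done // wait for the original caller to finish
		return c.val, c.err, true
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fn() // only the first caller reaches the "database"

	g.mu.Lock()
	delete(g.calls, key) // later calls for this key start a fresh execution
	shared := c.dups > 0
	g.mu.Unlock()
	close(c.done)
	return c.val, c.err, shared
}

// waiting reports how many duplicate callers are currently parked on key.
func (g *Group) waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.dups
	}
	return 0
}

// runBurst fires n concurrent lookups for the same message and returns
// how many times the fake database query actually ran.
func runBurst(n int) int64 {
	var g Group
	var queries int64
	release := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("message:42", func() (string, error) {
				atomic.AddInt64(&queries, 1)
				<-release // keep the "query" open until all callers have joined
				return "hello @everyone", nil
			})
		}()
	}
	for g.waiting("message:42") < n-1 {
		runtime.Gosched() // let the remaining callers register as duplicates
	}
	close(release)
	wg.Wait()
	return atomic.LoadInt64(&queries)
}

func main() {
	fmt.Println("queries executed:", runBurst(1000)) // prints "queries executed: 1"
}
```

With the real package, the call site has the same shape as the signature quoted above: the handler would pass a key derived from the requested row and a closure that performs the SELECT, then type-assert the returned interface{} value.
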
The return value shared indicates whether v was given to multiple callers.

Now let’s rerun the simulation and compare the results. In this benchmark, just using this simple technique decreased the 99th-percentile latency by 14 seconds, and the new approach supported 7.6 times more requests per second.

We have been using data services at my company for around three years, and since then we have noticed that there is a lot of potential to improve the overall performance of an application just by optimizing database queries. While the approach we discussed is situational, Discord has been using it for over a year, and it has helped them a lot.

You should be aware that if you use data services, you will face other complications. For instance, if you have multiple data service instances, your Python API must have a mechanism for sending similar requests to the same instance.
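
The routing concern mentioned above can be illustrated with a small sketch. It assumes plain modulo hashing over a fixed list of instances, which is a simplification chosen for brevity and not necessarily what any production system uses; the pickInstance function, the instance names, and the "channel:…" keys are all hypothetical. In the setup described in this article the logic would live in the API layer; it is written in Go here only to match the other example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickInstance maps a request key to one instance so that identical keys
// always reach the same instance, where they can be coalesced.
// Assumption: plain modulo hashing; it reshuffles most keys whenever the
// instance list changes, which a consistent-hashing scheme would avoid.
func pickInstance(key string, instances []string) string {
	if len(instances) == 0 {
		return "" // nothing to route to
	}
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return instances[int(h.Sum32()%uint32(len(instances)))]
}

func main() {
	// Hypothetical instance names, for illustration only.
	instances := []string{"data-service-0", "data-service-1", "data-service-2"}
	for _, key := range []string{"channel:1", "channel:2", "channel:1"} {
		fmt.Printf("%s -> %s\n", key, pickInstance(key, instances))
	}
}
```

Both lookups for "channel:1" resolve to the same instance, so the duplicate requests meet in one process and can share a single in-flight query.
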



This post first appeared on VedVyas Articles; please read the original post: here
