July 24th 2023

Posted on Jul 23

Note: I've added a table of contents to previous installments so they'll hopefully be easier to navigate. Thanks to derlin for the nifty TOC generation tool!

So far we've seen the basic communication pattern for networking and the three low-level protocols that make up the core of Internet communication. Now we're going to look at how servers operate. Servers are anything from the Bulletin Board Systems (BBS) of back in the day to modern web servers hosting millions of clients. This article starts with a simple Python server and slowly adds more functionality to how it serves data.

The code here does not guard against malicious attacks carried out by manipulating how client data is sent. It's only meant to show the basics of how each type of server works. If you're working with a public-facing service you should really have a reverse proxy and even a firewall in front of it to handle such attacks. I generally prefer doing it at that layer since it's easier to handle network hardening in easy-to-update software than trying to handle it across who knows how many codebases. So basically:

Don't use any of this in production

Most servers have a workflow of:

So we'll start with an echo server that simply replies back to the client with what it was sent. Here is some example code from the Python documentation:

And the client:

The results are:

Before continuing with this I'd like to take a moment to discuss port bind permissions.

One interesting thing to note is that, per the IANA well-known ports listing, port number 7 is specifically designated for an echo server. If I try to bind this in Windows as a non-privileged user:

It happily complies with the request (though you may need a one-time Windows firewall exception). Linux on the other hand:

The port bind gets rejected. This will occur on most any *NIX-like system.
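To make the echo exchange described above concrete, here is a minimal sketch of a one-shot echo server and its client in a single script. It is not the listing from the Python documentation that the article refers to. The helper names (serve_one_client, echo_roundtrip), the loopback address, and the use of port 0 (which asks the OS for any free unprivileged port) are all assumptions made for illustration.

```python
import socket
import threading

HOST = "127.0.0.1"  # loopback only; an assumption for this sketch


def serve_one_client(listener: socket.socket) -> None:
    """Accept a single connection and echo back whatever arrives."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:  # empty bytes: the client closed its sending side
                break
            conn.sendall(data)


def echo_roundtrip(message: bytes) -> bytes:
    """Start a one-shot echo server, send `message`, and return the reply."""
    # Port 0 asks the OS for any free (unprivileged) port.
    with socket.create_server((HOST, 0)) as listener:
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_one_client, args=(listener,))
        server.start()

        with socket.create_connection((HOST, port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
            chunks = []
            while True:
                chunk = client.recv(1024)
                if not chunk:  # server closed the connection
                    break
                chunks.append(chunk)

        server.join()
    return b"".join(chunks)


reply = echo_roundtrip(b"Hello, world")
print("Received", repr(reply))
```

Swapping port 0 for the well-known echo port 7 is where the permission difference shows up: as a non-root user on Linux, the bind would typically fail with a PermissionError.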
Now we could just run it as root to solve the problem:

But in general running services as root is not desirable, since if someone manages to exploit the server they could potentially gain full control over the system. To get around this we can utilize os.setuid and os.setgid. The code then becomes something like this:

This will drop permissions to a specific user and group, with "nobody" and "nogroup" by default. The pwd.getpwnam call obtains the entry for the user in the UNIX password database (most of the time this will be /etc/passwd) and grp.getgrnam does the same for the UNIX group database (most of the time this will be /etc/group). After running this we can see the port is bound, but the process is running as nobody:

umask is related to permissions for files and directories created by the process. The 007 I have set allows user and group to have full access to the files, while all other users are blocked from access. This means I could change the process group to something like "serveradmin" and users in that group would be able to interact with the server's files. Alex Juarez has a good article on permissions in general. There is also a Stack Overflow answer with an interesting look at the nuances of how umask operates.

Now the problem with the existing server is that it exits right away and only handles one connection. This is not practical for something like a web server, which needs to constantly serve clients. Now we could make some modifications to have it continually serve connections:

But it's fairly low level and not very extensible as-is. Thankfully Python has the socketserver module to provide an abstraction for setting up a server. The Python docs also have an example for a socket server:

While the server creation is still somewhat imperative in nature, client connections are now handled via an object which inherits from socketserver.BaseRequestHandler.
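As a rough illustration of the socketserver approach, the sketch below defines a handler class and serves two short-lived clients one after the other. It is a reconstruction under stated assumptions rather than the example from the Python docs: the class name EchoHandler, the loopback address, port 0, and running serve_forever() in a background thread so the same script can act as the client are all choices made for this demo.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler name; echoes one chunk of data per connection."""

    def handle(self) -> None:
        # For TCP servers, self.request is the connected client socket.
        data = self.request.recv(1024)
        self.request.sendall(data)


# Port 0 lets the OS pick a free port (an assumption to keep the demo portable).
with socketserver.TCPServer(("127.0.0.1", 0), EchoHandler) as server:
    host, port = server.server_address

    # serve_forever() blocks, so it runs in a background thread here.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()

    replies = []
    for message in (b"first", b"second"):
        # Each client connects, sends one message, and reads the echo.
        with socket.create_connection((host, port), timeout=5) as client:
            client.sendall(message)
            replies.append(client.recv(1024))

    server.shutdown()  # ask serve_forever() to return
    thread.join()

print(replies)
```

The server object owns the accept loop, while everything specific to a client lives in the handler class derived from socketserver.BaseRequestHandler.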
Inheriting from socketserver.BaseRequestHandler requires the implementing class to define a handle() method, which for TCP will expose self.request to hold a socket referencing the connection. Now to show multiple connections working I'll utilize the Apache HTTP server benchmarking tool (ab). This is easily available in Ubuntu via sudo apt-get install apache2-utils:

This will make several short HTTP requests to the server (I've adjusted the code to bind to the proper IP address). 20 requests will be executed against the server, which we can see the result of here:

Being a benchmarking tool we also get some nice statistics, mostly:

This is compared to the simple server infinite loop version:

Now while the code layout has improved there's still the issue of handling multiple clients at once. One interesting way to handle this is to separate the socket acceptance from the actual client handler.

Threads are one way of approaching this issue. The short story is that the GIL makes threading in Python less performant than in a language without one that uses native threads. The long story is another full article. socketserver has a threaded server wrapper to help with this:

While you do have to deal with the GIL, it's really not that bad for a basic sized server.

When you run a server it is identified by the system as a process. A process can in turn run another process (the top of the chain is the init process for most operating systems). These are often known as child processes, and the process that spawned them is the parent process. The multiprocessing module in the Python standard library is able to manage such child processes. Using this method a client is attached to a process for handling:

So this will create 20 worker processes which will be listening for connections to the main server (yes, you can have multiple accept() calls). Looking at the processes:

Indeed we see that there are 21 Python processes: the main parent process and the 20 worker processes.
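The thread-per-connection wrapper mentioned above can be sketched as follows. This is an illustrative demo, not the article's original listing: the class names, the loopback address, port 0, and the "slow"/"fast" client pair are assumptions chosen to show that one idle client no longer holds up another.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler; echoes one chunk of data per connection."""

    def handle(self) -> None:
        data = self.request.recv(1024)  # blocks only this client's thread
        self.request.sendall(data)


class ThreadedEchoServer(socketserver.ThreadingTCPServer):
    # Let handler threads die with the process instead of blocking exit.
    daemon_threads = True


with ThreadedEchoServer(("127.0.0.1", 0), EchoHandler) as server:
    host, port = server.server_address
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # `slow` connects first but stays silent; its handler thread waits in recv().
    slow = socket.create_connection((host, port), timeout=5)
    fast = socket.create_connection((host, port), timeout=5)

    # With a plain (non-threaded) TCPServer this recv() would stall behind
    # `slow`; here each connection gets its own thread, so it is answered.
    fast.sendall(b"fast")
    fast_reply = fast.recv(1024)

    slow.sendall(b"slow")
    slow_reply = slow.recv(1024)

    slow.close()
    fast.close()
    server.shutdown()

print(fast_reply, slow_reply)
```

Each handler thread still blocks on its own client, so this trades one waiting process for many waiting threads. That is usually fine for a small server, which matches the point about the GIL above.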
Now the issue here is that while we've split the workload up, each worker process is still bound to finishing the client communication before moving on to the next. What if we could remove some of the barriers in waiting for client communication?

It turns out that IO has a concept of blocking and non-blocking. Socket communication is blocking by default, meaning you have to wait for work like receiving data from a connection to be done before moving on. To get around this we can set socket communication to non-blocking via socket.setblocking. This means the usual socket methods will return right away. Unfortunately this has two inherent issues with a standard setup:

To work around this there are several calls that deal with a connection being available, which are supported by the selectors module. By using DefaultSelector, the most optimal implementation for your operating system is chosen. As an example:

Now behind the scenes the selectors module has a few options available as to how it's doing things. In the end though the process is:

Which is pretty much how the loop goes. The main ways you'll generally deal with this are select(), poll(), and epoll(). All of these have their own Selector() implementation. Using DefaultSelector generally picks the most optimal. In general, select() is not the best performer because it is typically limited to 1024 sockets that it can check (though it does work on Windows). poll() is an enhanced version that still stays somewhat portable. Both select() and poll() essentially keep a list of sockets to look at and go through them each time. epoll(), on the other hand, is more reactive, which allows it to handle a large number of sockets more efficiently than select() and poll(). That said, it's only available on Linux, which limits portability (not a huge issue given how easy it is to get a Linux server these days). Handling a large number of connections efficiently is often referred to as the C10k problem (or some variant of k).
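A selector-based echo server along these lines might look like the sketch below. It is a reconstruction for illustration, not the author's original listing. The function names (accept_client, service_client, run_server), the `sent` counter, the stop event, and the rule that a message is complete once it ends with a newline or the client closes are all assumptions of this sketch.

```python
import io
import selectors
import socket
import threading
import types


def accept_client(sel, server_sock):
    """Accept a new connection and start watching it for readable data."""
    conn, addr = server_sock.accept()
    conn.setblocking(False)
    # Per-connection state travels with the socket via the `data` slot.
    state = types.SimpleNamespace(addr=addr, outb=io.BytesIO(), sent=0)
    sel.register(conn, selectors.EVENT_READ, data=state)


def service_client(sel, key, mask):
    """Collect incoming bytes, then echo them back and close."""
    conn, state = key.fileobj, key.data
    if mask & selectors.EVENT_READ:
        chunk = conn.recv(1024)
        if chunk:
            state.outb.write(chunk)
        # Assumption: a trailing newline (or the client closing) ends the message.
        if not chunk or chunk.endswith(b"\n"):
            sel.modify(conn, selectors.EVENT_WRITE, data=state)
    elif mask & selectors.EVENT_WRITE:
        payload = state.outb.getvalue()
        # send() may write only part of the data on a non-blocking socket.
        state.sent += conn.send(payload[state.sent:])
        if state.sent == len(payload):
            sel.unregister(conn)
            conn.close()


def run_server(server_sock, stop):
    """Event loop: data=None marks the listening socket."""
    with selectors.DefaultSelector() as sel:
        server_sock.setblocking(False)
        sel.register(server_sock, selectors.EVENT_READ, data=None)
        while not stop.is_set():
            for key, mask in sel.select(timeout=0.1):
                if key.data is None:
                    accept_client(sel, key.fileobj)
                else:
                    service_client(sel, key, mask)


# Demo: loopback address and OS-chosen port (assumptions for this sketch).
server_sock = socket.create_server(("127.0.0.1", 0))
stop = threading.Event()
loop = threading.Thread(target=run_server, args=(server_sock, stop))
loop.start()

with socket.create_connection(server_sock.getsockname(), timeout=5) as client:
    client.sendall(b"ping\n")
    reply = client.recv(1024)

stop.set()
loop.join()
server_sock.close()
print(reply)
```

A single thread services every connection here. No call waits on one particular client, because the selector only hands back sockets that are already ready for the registered event.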
Looking at the code now:

Here we have a normal socket bind and listen for the server. The server's socket is set to be non-blocking and registered into the list of sockets we're interested in.

Now we have a main event loop. For the server socket the data property is set to None. If this is the case we run the client socket accept handler. Otherwise we're dealing with an existing connection that needs to be handled.

This will accept our connection and also set it to non-blocking. The next thing it does is set up a types.SimpleNamespace. It will be attached to the socket as a way to keep state when dealing with it. This allows for interaction between readers and writers. outb is set to a BytesIO type, which is very performant when working with byte concatenation, which we'll be doing to keep track of data read in.

Now is the interesting part. The code will check if this is a read or write event. By default, the only thing that's being checked is whether a socket is ready for reading. When everything is done we need to echo back, so we switch to writing mode. Then on the writing side we simply send all the data we have gathered back, close the socket, and remove it from the list of sockets we're interested in. The epoll() version gives a nice count on requests per second:

You can force a specific selector by changing DefaultSelector to:

I will say that this article to me is mostly showing different server types. If you're really in need of true performance it might be better to consider a language built for that (such as Go, especially since it has an emphasis on networking) or to have dedicated software that deals with all the nuances of network communication. In fact, most of the time you won't need to deal with this much in the modern cloud computing world. Load balancers, containerized microservices, and many managed services handle much of this for you.
If you really just want to work with one to test things out, the blocking threaded socketserver is good enough in my opinion. Now that we've seen different types of servers, the next installment will be looking at a specialized type of server: HTTP.

This post first appeared on VedVyas Articles.
