Some fundamental but important questions about web development?

I have developed some web applications so far using PHP, Python and Java. But some fundamental, yet very important, questions are still beyond my grasp, so I made this post to get help and clarification from you guys.

Suppose I use some programming language as my backend language (PHP / Python / .Net / Java, etc.) and I deploy my application with a web server (Apache / Lighttpd / Nginx / IIS, etc.). And suppose that at time T one of my pages receives 100 simultaneous requests from different users. So my questions are:

  • How does my web server handle such 100 concurrent requests? Will the web server generate one process / thread for each request? (if so, process or thread?)
  • How does the backend interpreter work? How does it process the request and generate the correct html? Will the interpreter generate a process / thread for each request? (If so, process or thread?)
  • If the interpreter generates a process / thread for each request, what about these processes (threads)? Do they share some code space? Will they communicate with each other? How are global variables in the backend code handled? Or are they independent processes (threads)? How long does the process / thread last? Will they be destroyed when the request is handled and the response is returned?
  • Suppose a web server can only support 100 concurrent requests, but now it receives 1000 concurrent requests. How does it deal with this situation? Will it queue them and handle each request when capacity becomes available? Or some other approach?
  • These days I have been reading some articles about Comet. And I found that a long-lived connection may be a good way to handle a real-time multi-user scenario. So what about long-lived connections? Are they a feature of specific web servers, or are they available on every web server? Does a long-lived connection require a long-running interpreter process?

Thanks, everyone. These questions have been bothering me a lot, so I hope you can help. A more detailed answer will be greatly appreciated. And please add some links.

Sincerely.




EDIT: I recently read several articles on CGI and FastCGI, which make me realize that the FastCGI approach should be the typical way to handle requests.

The protocol multiplexes a single transport connection between several independent FastCGI requests. This supports applications that are able to process concurrent requests using event-driven or multi-threaded programming techniques.

That is a quote from the FastCGI specification, which mentions that one connection can handle multiple requests and can be implemented with multi-threaded techniques. I am wondering how that one connection can be handled by a single process which generates multiple threads, one for each request. If this is true, I am even more confused about how shared resources should be handled in each thread.

PS: Thanks to Thomas for the advice to split this post into several posts, but I think the questions are related to each other and it is better to group them together.

Thanks to S. Lott for the great answer, but the answers to some of the questions are too brief, or not addressed at all.

Thank you all for your replies, which bring me closer to the truth.

A detailed answer will be greatly appreciated!


1. It depends on the web server (and sometimes on its configuration). A description of the various models:

  • Apache with mpm_prefork (the default on Unix): process per request. To minimize startup time, Apache keeps a pool of idle processes waiting to handle new requests (and you configure its size). When a new request comes in, the master process delegates it to an available worker, otherwise it spawns a new one. If 100 requests came in and you did not have 100 idle workers, some forking would have to be done to handle the load. If the number of idle processes exceeds the MaxSpareServers value, some of them will be reaped after finishing their requests, until there are only so many idle processes. (A toy sketch of this model follows this list.)

  • Apache with mpm_event, mpm_worker, mpm_winnt: thread per request. Similarly, Apache keeps a pool of idle threads in most situations, also configurable. (A small detail, but functionally the same: mpm_worker runs several processes, each of which is multi-threaded.)

  • Nginx / Lighttpd: These are lightweight event-based servers which use select() / epoll() / poll() to multiplex a number of sockets without needing multiple threads or processes. Through very careful coding and the use of non-blocking APIs, they can scale to thousands of simultaneous requests on commodity hardware, given available bandwidth and correctly configured file-descriptor limits. The caveat is that implementing traditional embedded scripting languages inside the server context is almost impossible, as it would negate most of the benefits. Both support FastCGI, however, for external scripting languages.
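
A toy sketch of the prefork model from the first bullet above, in Python (not Apache itself, just the pattern): a master process opens the listening socket, forks a fixed pool of workers, and each long-lived worker blocks in accept() waiting for its next request. It assumes a Unix-like OS (os.fork); the port and worker count are arbitrary.

    import os
    import socket

    NUM_WORKERS = 4  # analogous to the configurable idle-worker pool

    def worker(listener):
        while True:
            conn, _addr = listener.accept()   # kernel hands each connection to one idle worker
            conn.recv(4096)                   # read the request (toy: contents ignored)
            body = f"handled by pid {os.getpid()}\n".encode()
            conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\n" + body)
            conn.close()                      # the worker survives and loops for the next request

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 8080))
    listener.listen(128)

    for _ in range(NUM_WORKERS):
        if os.fork() == 0:    # child: become a long-lived worker
            worker(listener)
            os._exit(0)

    os.wait()  # master: a real server would also respawn dead workers and reap spares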

2. It depends on the language, or in some languages, on which deployment model you use. Some server configurations only allow certain deployment models.

  • Apache mod_php, mod_perl, mod_python: these modules run a separate interpreter inside every Apache worker. Most of them cannot work very well with mpm_worker (due to various thread-safety issues in client code), so they are mostly limited to the forking models. This means that for each Apache process you have a full php / perl / python interpreter. This significantly increases memory usage: if a given Apache worker normally takes about 4 MB of memory on your system, one with PHP may take 15 MB and one with Python may take 20-40 MB for an average application. Some of this will be memory shared between processes, but in general these models are very difficult to scale very large.

  • Apache (supported configurations), Lighttpd, CGI: this is mostly a dying method of hosting. The issue with CGI is that not only do you spawn a new process to handle the request, you do so for every request, not just when you need to handle more load. With the dynamic languages of today having rather large startup times, this not only creates a lot of work for your web server, but also significantly increases page load time. A small perl script may be fine to run as CGI, but a large python, ruby or java application is rather unwieldy. In the case of Java, you might be waiting a second or more just for application startup, only to have to do it all again on the next request.

  • All web servers, FastCGI / SCGI / AJP: this is the "external" model of hosting dynamic languages. There is a whole list of interesting variations, but the gist is that your application listens on a socket, and the web server handles the HTTP request and then sends it via another protocol to that socket, for dynamic pages only (static pages are usually handled directly by the web server). A minimal sketch of this model follows this list.

    This confers many advantages, because you will need fewer dynamic workers than you need the ability to handle connections. If, for every 100 requests, half are for static files such as images, CSS, etc., and if most of the dynamic requests are short, you might get by with 20 dynamic workers handling 100 simultaneous clients. That is, since the normal use of a given web-server keep-alive connection is 80% idle, your dynamic interpreters can be handling requests from other clients. This is much better than the mod_php / python / perl approach, where when your user is loading a CSS file, or not loading anything at all, your interpreter sits there using memory and not doing any work.

  • Apache mod_wsgi: This specifically applies to hosting python, but it takes some of the advantages of web-server-hosted applications (easy configuration) and external hosting (process multiplexing). When you run it in daemon mode, mod_wsgi only delegates requests to your daemon workers when needed, and thus 4 daemons may be able to handle 100 simultaneous users (depending on your site and its workload).

  • Phusion Passenger: Passenger is an Apache hosting system that is mostly for hosting ruby applications, and like mod_wsgi it provides the advantages of both external and web-server-managed hosting.
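
A minimal sketch of the "external" model above, using only Python's standard library: the application is its own long-lived process listening on a private port, and the front web server (Nginx, Apache, etc.) serves static files itself while proxying only the dynamic requests here. The port and the proxy setup are assumptions for illustration; real deployments would use FastCGI / SCGI / AJP bindings instead of plain HTTP.

    from wsgiref.simple_server import make_server

    def application(environ, start_response):
        # A persistent worker: module-level setup above this line runs once,
        # not per request, unlike the CGI model.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"dynamic page rendered by a long-lived worker\n"]

    if __name__ == "__main__":
        # One worker process; run several of these behind the proxy for a pool.
        make_server("127.0.0.1", 8001, application).serve_forever()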

3. Again, I will break the question down by hosting model where applicable.

  • mod_php, mod_python, mod_perl: only the shared C libraries of your application will generally be shared between Apache workers. This is because Apache forks first and then loads your dynamic code (which, due to subtleties, is mostly unable to use shared pages). Interpreters do not communicate with each other within this model. Global variables are generally not shared. In the case of mod_python, you can have globals persist between requests within a process, but not across processes. This can lead to some very weird behaviour (browsers rarely keep the same connection forever, and most open several to a given website), so be very careful how you use globals. Use something like memcached, a database, or files for things like session storage and other cache bits that need to be shared (see the sketch after this list).

  • FastCGI / SCGI / AJP / proxied HTTP: because your application is essentially a server in itself, this depends on the language the server is written in (usually the same language as your code, but not always) and various factors. For example, most Java deployments use a thread per request. Python and its "flup" FastCGI library can run in either prefork or threaded mode, but since Python and its GIL are limiting, you will likely get the best performance from prefork.

  • mod_wsgi / Passenger: mod_wsgi in daemon mode lets you configure how it handles things, but I would recommend you give it a fixed number of processes. You want to keep your python code in memory, spun up and ready to work. This is the best approach for getting predictable, low-latency responses.
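
A small sketch of the "use memcached instead of globals" advice above. It assumes the third-party python-memcached package and a memcached daemon on localhost:11211; neither comes from the answer itself, and any memcached client would do.

    import memcache  # third-party: python-memcached (assumption)

    mc = memcache.Client(["127.0.0.1:11211"])

    def save_session(session_id, data):
        # Visible to every worker process, on every machine, unlike a global.
        mc.set("session:" + session_id, data, time=3600)  # expire after an hour

    def load_session(session_id):
        return mc.get("session:" + session_id)  # None if missing or expired

    save_session("abc123", {"user": "alice"})
    print(load_session("abc123"))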

In most of the models mentioned above, the lifetime of a process / thread is longer than a single request. Most setups follow some variation of the Apache model: keep some spare workers around, spawn more when needed, and reap them when there are too many, based on a few configurable limits. Most of these setups do not destroy a process after a request, though some may clear out the application code (such as in the PHP FastCGI case).

4. If you say "the web server can only handle 100 requests", it depends on whether you mean the web server itself or the dynamic portion of the web server. There is also a difference between actual limits and functional limits.

In the Apache case, for example, you configure the maximum number of workers (connections). If this number of connections were 100 and it was reached, no further connections would be accepted by Apache until someone disconnects. With keep-alive enabled, those 100 connections may stay open for a long time, much longer than a single request, and the other 900 users waiting on requests will probably time out.

If you do have limits high enough, you can accept all those users. Even with the most lightweight Apache, however, the cost is about 2-3 MB per worker, so with Apache alone you are talking about 2-3 GB of memory just to handle the connections, not to mention other possibly limited OS resources such as process IDs, file descriptors and buffers, and this is before even considering your application code.

Lighttpd / Nginx can handle a large number of connections (thousands) in a tiny amount of memory, often just a few megabytes per thousand connections (this depends on factors such as buffers and how the asynchronous IO APIs are set up). If we assume most of your connections are keep-alive and 80% (or more) of them idle, this is very good, as you are not wasting dynamic-worker time or much memory.

In any external hosting model (mod_wsgi / FastCGI / AJP / proxied HTTP), say you only have 10 workers and 1000 users make requests: your web server will queue the requests to your dynamic workers. This is ideal: if your requests return quickly, you can keep handling a much larger user load without needing more workers. Usually the premium is memory or DB connections, and by queueing you can serve many more users with the same resources, rather than denying some users outright.

Be careful, though: say you have one page that builds a report or does a search and takes several seconds, and a lot of users tie up workers with it: someone wanting to load your front page may be queued for a few seconds while all those long-running requests complete. Alternatives are using a separate pool of workers to handle the URLs in the reporting section of your application, or doing the report separately (such as in a background job) and then polling for its completion later (a small sketch of that variant follows below). Lots of options there, but they do require you to put some thought into your application.
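
A toy sketch of the background-job variant just mentioned, under stated assumptions: the job store is an in-process dict (so it dies with the worker; a real setup would use a database table or a job queue), and the handler names are hypothetical.

    import time
    import uuid
    from concurrent.futures import ThreadPoolExecutor

    executor = ThreadPoolExecutor(max_workers=2)   # the separate "report" pool
    jobs = {}                                      # job_id -> Future (toy storage)

    def build_report(params):
        time.sleep(5)                              # stands in for the slow query
        return {"rows": 1234, "params": params}

    def start_report(params):                      # e.g. handler for POST /reports
        job_id = uuid.uuid4().hex
        jobs[job_id] = executor.submit(build_report, params)
        return job_id                              # the client polls with this id

    def poll_report(job_id):                       # e.g. handler for GET /reports/<id>
        future = jobs[job_id]
        return future.result() if future.done() else None   # None = still running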

5. Most people who need to handle many concurrent users with Apache turn keep-alive off, due to its large memory cost. Or they run Apache with keep-alive turned on but with a short keep-alive timeout, say 10 seconds (so you can get your front page and images / CSS in a single page load). If you truly need to scale to 1000 connections or more and want keep-alive, you will want to look at Nginx / Lighttpd and the other lightweight event-based servers.

It is worth noting that if you do want Apache (for configuration convenience, or the need to host certain setups), you can put Nginx in front of Apache as an HTTP proxy. This lets Nginx handle the keep-alive connections (and, preferably, the static files) and Apache handle only the grunt work. Nginx also happens to be better than Apache at writing log files, interestingly. For a production deployment, we have been very happy with Nginx in front of Apache (with mod_wsgi in this instance). The Apache does not do any access logging, nor does it handle static files, which allows us to disable a large number of modules inside Apache to keep its footprint small.

I have mostly answered this already, but no, a long connection need not have any bearing on how long the interpreter runs (as long as you are using an externally hosted application, which by now should be clear is vastly superior). So if you want to use Comet and long keep-alives (which is usually a good thing, if you can handle it), consider Nginx.

Bonus FastCGI question. You mention that FastCGI can multiplex within a single connection. This is indeed supported by the protocol (I believe the concept is known as "channels"), so in theory a single socket can handle lots of connections. However, it is not a required feature of FastCGI implementers, and in fact I do not believe there is a single server that uses this. Most FastCGI responders do not use this feature either, because implementing it is very difficult. Most web servers will make only one request across a given FastCGI socket at a time, and then make the next across that socket. So you often just have one FastCGI socket per process / thread.

Whether your FastCGI application uses forking or threading (and whether you implement it via a "master" process accepting connections and delegating, or just lots of processes each doing their own thing) is up to you, and depends on the capabilities of your programming language and OS. In most cases, whatever the library uses by default should be fine, but be prepared to do some benchmarking and tuning of parameters. A minimal sketch follows.
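
For example, a minimal FastCGI responder using the third-party flup library mentioned earlier; the bind address is an assumption, and the web server must be configured separately to forward dynamic requests to that socket.

    from flup.server.fcgi import WSGIServer  # flup's threaded FastCGI server

    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"served over FastCGI by a persistent worker\n"]

    # bindAddress may also be a path to a unix socket; flup.server.fcgi_fork
    # provides a preforking variant with the same interface.
    WSGIServer(application, bindAddress=("127.0.0.1", 9000)).run()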

As to shared state, I recommend you pretend that any traditional uses of in-process shared state do not exist: even if they may work now, you may have to split your dynamic workers across multiple machines later. For state like shopping carts, etc., the db may be the best option; session login information can be kept in secure cookies; and for temporary state, something akin to memcached is pretty neat. The less you rely on features that share data (the shared-nothing approach), the bigger you can scale in the future.

Postscript: I have written and deployed many dynamic applications across the whole scope of setups above: all the web servers listed, and everything in the PHP / Python / Ruby / Java range. I have extensively tested (using both benchmarking and real-world observation) the methods, and the results are sometimes surprising: less is often more. Once you have moved away from hosting your code in the web server process, you can often get away with a very small number of FastCGI / Mongrel / mod_wsgi / etc. workers. It depends on how much time your application spends in the database, but it is very often the case that more processes than 2 * number of CPUs will gain you nothing.


How does my web server handle such 100 concurrent requests? Does the web server create one process / thread for each request? (if so, process or thread?)

This varies. Apache uses both threads and processes to handle requests. Apache starts several concurrent processes, each of which can run any number of concurrent threads. You have to configure Apache to control how this actually plays out for each request.

How does the backend interpreter work? How does it process the request and generate the correct html? Will the interpreter generate a process / thread for each request? (If so, process or thread?)

It depends on your Apache configuration and your language. For Python, one typical approach is to have daemon processes running in the background, with each Apache process delegating to a daemon process. This is done with the mod_wsgi module. It can be configured in several different ways.
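
For reference, the kind of entry point mod_wsgi hosts is just a module-level WSGI callable named application; mod_wsgi imports the file named by your WSGIScriptAlias directive and calls it for each request. A minimal sketch:

    def application(environ, start_response):
        # mod_wsgi calls this once per request; the module itself stays loaded
        # in the (daemon) process across requests.
        body = b"Hello from a mod_wsgi-hosted application\n"
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]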

If the interpreter generates a process / thread for each request, what about these processes (threads)? Do they share some code space? Will they communicate with each other? How are global variables in the backend code handled? Or are they independent processes (threads)? How long does the process / thread last? Will they be destroyed when the request is handled and the response is returned?

Threads share the same code. By definition.

Processes share the same code, because that is the way Apache works.

They do not, intentionally, communicate with each other. Your code has no way to easily determine what else is going on. This is by design. You cannot tell which process you are running in, and you cannot tell what other threads are running in this process space.

The processes are long-running. They are not (and should not be) created dynamically. You configure Apache to fork several concurrent copies of itself when it starts, to avoid the overhead of process creation.

Thread creation has much less overhead. How Apache handles threads internally does not matter much. You can, however, think of Apache as starting a thread per request.
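
A small sketch illustrating the two points above: module-level state persists across requests within one process, but is not shared between processes. Hosted under a forking server, different PIDs report independent counters. (The counter and port are illustrative; a threaded server would also need a lock around the increment.)

    import os

    request_count = 0  # global: one copy per interpreter, hence per process

    def application(environ, start_response):
        global request_count
        request_count += 1  # per-process only; other workers have their own count
        body = ("pid=%d has served %d requests\n"
                % (os.getpid(), request_count)).encode()
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [body]

    if __name__ == "__main__":
        from wsgiref.simple_server import make_server
        make_server("127.0.0.1", 8002, application).serve_forever()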

Suppose a web server can only support 100 concurrent requests, but now it receives 1000 concurrent requests. How does it deal with this situation? Will it queue them and handle each request when capacity becomes available? Or some other approach?

This is the question of scalability. In short, how will performance degrade as the load increases? The general answer is that the server gets slower. For some load level (say, 100 concurrent requests), there are enough processes available that they all run respectably fast. At some load level (say, 101 concurrent requests), it starts to slow down. At some other load level (who knows how many requests), it gets so slow that you are unhappy with the speed.

There is an internal queue (as part of the way TCP/IP works, generally), but there is no governor that limits the workload to 100 concurrent requests. If you get more requests, more threads are created (not more processes) and things run more slowly.


To begin with, requesting detailed answers to all of your points is a bit much, IMHO.

Anyway, a few short answers to your questions:

#1

It depends on the server architecture. Apache is a multi-process and, optionally, multi-threaded server. There is a master process which listens on the network port and manages a pool of worker processes (where, in the case of the "worker" MPM, each worker process has multiple threads). When a request comes in, it is handed over to one of the idle workers. The master manages the size of the worker pool by launching and terminating workers depending on the load and the configuration settings.

Now, Lighttpd and Nginx are different; they are so-called event-based architectures, where multiple network connections are multiplexed onto one or a few worker processes / threads using OS support for event multiplexing, such as the classic select() / poll() in POSIX, or the more scalable but unfortunately OS-specific mechanisms such as epoll on Linux. The advantage of this is that each additional network connection costs only maybe a few hundred bytes of memory, allowing these servers to keep open tens of thousands of connections, which would generally be prohibitive for a request-per-process / thread architecture such as Apache. However, these event-based servers can still use multiple processes or threads in order to utilize multiple CPU cores, and also in order to execute blocking system calls in parallel, such as normal POSIX file I/O. A toy event-loop sketch follows.
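
A toy illustration of that event-based model using Python's standard selectors module (which wraps epoll / kqueue / select as available): one thread multiplexes all sockets, so an idle connection costs a registry entry and some buffer space rather than a whole thread or process. Real servers handle partial reads / writes; this sketch does not.

    import selectors
    import socket

    sel = selectors.DefaultSelector()

    def on_accept(listener):
        conn, _ = listener.accept()
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ, on_readable)

    def on_readable(conn):
        if conn.recv(4096):  # socket is ready, so this does not block
            conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello\n")  # toy: assume it fits the send buffer
        sel.unregister(conn)
        conn.close()

    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 8080))
    listener.listen(1024)
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, on_accept)

    while True:                      # the single event loop
        for key, _mask in sel.select():
            key.data(key.fileobj)    # dispatch to the registered callback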

For more information, see the somewhat dated C10k page by Dan Kegel.

#2

Again, it depends. For classic CGI, a new process is started for every request. For mod_php or mod_python with Apache, the interpreter is embedded into the Apache processes, and thus there is no need to launch a new process or thread. However, this also means that every Apache process requires quite a lot of memory, and in combination with the issues explained above for #1, scalability is limited.

To avoid this, you can have a separate pool of heavyweight processes running the interpreters, and the frontend web servers proxy to the backends when dynamic content needs to be generated. This is essentially the approach taken by FastCGI and mod_wsgi (although they use custom protocols rather than HTTP, so perhaps technically it is not proxying). This is also typically the approach chosen when using event-based servers, because the code for generating dynamic content is seldom able to run properly in an event-based environment. The same goes for multi-threaded approaches if the dynamic-content code is not thread-safe; you can have, say, an Apache frontend server with the threaded worker MPM proxying to backend Apache servers running PHP code with the single-threaded prefork MPM.

#3

It depends on what level you are asking about. They will share some memory via the OS caching mechanisms, yes. But generally, from the programmer's point of view, they are independent. Note that this independence is not per se a bad thing, as it enables straightforward horizontal scaling to multiple machines. But alas, some amount of communication is often necessary. One simple approach is to communicate via the database, assuming one is needed anyway for other reasons, as it usually is. Another approach is to use some dedicated distributed-memory caching system such as memcached.

#4

It depends. They might be queued, or the server might reply with a suitable error code such as HTTP 503, or the server might just refuse the connection in the first place. Typically, all of the above can occur, depending on how loaded the server is.

#5

The viability of this approach depends on the server architecture (see my answer to #1). For an event-based server, keeping connections open is not a big issue, but for Apache it certainly is, because of the large amount of memory required for every connection. And yes, this certainly requires a long-running interpreter process, but as described above, except for classic CGI this is pretty much a given.


Web servers are a multi-threaded environment; besides application-scoped variables, a user request does not interact with other threads.

So:

  • Yes, a new thread will be created for each user.
  • Yes, HTML will be processed for each request.
  • You will need to use application-scoped variables.
  • If you receive more requests than you can handle, they will be queued. If they get served within the configured timeout period, the user will get his response; otherwise he will get a "server busy"-like error.
  • Comet is not specific to any server / language. You can achieve the same result by polling your server every n seconds, without dealing with other nasty threading issues (a small polling sketch follows this list).
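
A minimal sketch of that plain-polling alternative, standard library only; the URL and interval are placeholders.

    import time
    import urllib.request

    def poll(url, interval=5.0):
        while True:
            with urllib.request.urlopen(url) as resp:
                print(resp.read().decode())   # act on the latest server state
            time.sleep(interval)  # trade update latency for simplicity vs. a held-open connection

    # poll("http://example.com/updates")  # placeholder endpoint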

Because process isolation is something you do not always have control over, or knowledge of, what I have learned so far is that I have to write code that relies on threads and "contexts" to store data that would "classically" be stored as static data. Even this may change in the near future, with the advent of continuations.


These Ruby-on-Rails-specific links are relevant here:



