Process Throttling

By Dennis Faas

Yesterday I wrote about a problem concerning the usage of the mail queue on the infopackets web server. To recap: whenever the Infopackets Gazette newsletter is sent out (to the list of readers), some emails are entered into a temporary holding area (called a "mail queue") if they are unable to reach the recipient.

This is a potential problem for the infopackets web server because the mail queue can "overflow" and cause the server to run out of storage space.

Recall: The Mail Queue Polling solution

I was able to save the web server from potentially running out of space by modifying the "newsletter mailing program". The program now ensures that there is enough storage space on the web server before proceeding to send out or place any additional newsletters in the queue.
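For readers who want to try something similar, here is a minimal sketch in Python of a free-space check. It is not the actual code from the "newsletter mailing program" (which isn't shown here); the function name, the path, and the threshold are all assumptions made for illustration.

```python
import shutil


def has_enough_space(path, min_free_bytes, disk_usage=shutil.disk_usage):
    """Return True if the filesystem holding `path` has enough free space.

    `min_free_bytes` is a hypothetical safety margin. `disk_usage` is
    injectable only so the check can be tried without touching a real disk.
    """
    return disk_usage(path).free >= min_free_bytes


if __name__ == "__main__":
    # Assumed values: check the root filesystem for at least 100 MB free.
    if has_enough_space("/", 100 * 1024 * 1024):
        print("Enough space: keep sending")
    else:
        print("Low on space: pause the mailing")
```

A mailing loop would call this check before each send and hold off whenever it returns False.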

So far, so good.

However, there was another problem: the Sendmail daemon kept eating up all the available resources (processes) on the web server because there were too many emails stuck in the mail queue. Occasionally, this would result in an Internal Server Error for anyone attempting to access the infopackets web site while the newsletter was being sent out.

Wait a sec -- what in the world is a daemon? Don't you mean Demon?

No, not a demon. A daemon! As I discussed yesterday, there are some programs that can run automatically on a computer (or web server). A daemon is a program that runs quietly in the background and processes information on the computer automatically. Polling is something a program like that often does: it checks a condition over and over at regular intervals. The two terms are related, but they are not quite the same thing.

The "newsletter mailing program", which I wrote, contains a mini-program called Sendmail. After each newsletter has been processed by the "newsletter mailing program" and is ready to be sent out to a recipient, Sendmail makes an attempt to mail the newsletter. If the email bounces or is undeliverable, it is placed into the mail queue for later processing. All of this combined can lead to an overloaded and bloated web server.

Solution: limit the number of processes taken up by the "newsletter mailing program"

Well, that idea was easier said than done -- and for the last month or so, I've been thinking of a way to do it. I finally figured out how to count the running processes on the web server and implemented a "quick fix", very similar to the Mail Queue Polling solution I mentioned yesterday.
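One way to get a process count on a Unix-style server is to ask the `ps` command for every process ID and count the lines that come back. The Python sketch below shows that idea. It is an assumption about how such a count could be taken, not a copy of the actual program, and the function names are made up for this example.

```python
import subprocess


def count_pids(ps_output):
    """Count the non-empty lines in `ps` output (one line per process)."""
    return sum(1 for line in ps_output.splitlines() if line.strip())


def count_processes():
    """Return the number of processes currently running on the system."""
    # `-e` selects every process; `-o pid=` prints only the process ID
    # and suppresses the header line, so each output line is one process.
    result = subprocess.run(
        ["ps", "-e", "-o", "pid="],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_pids(result.stdout)


if __name__ == "__main__":
    print("Running processes:", count_processes())
```

Note that the total includes the `ps` process itself, and that it counts every process on the machine. A hosting account with a per-user process limit would more likely want to count only that user's processes.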

Here's a sample of what it looks like when I send out the newsletter using the mail program I wrote. Note that the email addresses have been changed to protect the names of the innocent.

The format of the output below is as follows:

List File Name [ Email# / Total Emails In List ] [ Process count ] = email@address.xxx

list0000 [190 / 14346] [46] = zclxrk@lxfxtxmxfxtnxss.cxm

list0000 [191 / 14346] [47] = zclxrkx@mfgmxtxlfxz.cxm

list0000 [192 / 14346] [48] = zclxxtt@xccxcxmm.nxt

list0000 [193 / 14346] [49] = zclxffx@zxgpxnd.nxt.xx

list0000 [194 / 14346] [49] = zcmxddxc@shxw.cx

list0000 [195 / 14346] [49] = zcmxxrx@zwwxnlxnx.cxm

list0000 [196 / 14346] [48] = zcnflxtx@xxl.cxm

list0000 [197 / 14346] [49] = zcxzxrn1@zxllsxxth.nxt

Process Retry #1...

list0000 [198 / 14346] [51] = zcxnrxy@mxdxlxnns.cxm

Sleep 1..2..3

list0000 [199 / 14346] [39] = zcxnstxntxnx@vxx.nxt

list0000 [200 / 14346] [41] = zcxxk3@cxrxlxnx.rr.cxm

list0000 [201 / 14346] [40] = zcxxk@nxsxnc.nxt

Here you can see that when the Process Count rises above 50, the email program is told to "Sleep 1..2..3". This essentially "puts the brakes on" the mailing until the number of processes decreases (as it eventually does). This is also referred to as process throttling (think of it as a throttle used on a motorcycle -- it moves up and down and changes the acceleration of the engine). Once the number of processes drops back below 50 (safe), the emails begin sending out again.
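The throttling idea described above can be sketched in a few lines of Python. This is an illustration only, not the actual mailing program: the limit of 50 comes from the sample output, while the 3-second pause and the names `wait_for_capacity` and `send_all` are assumptions. The `send` and `count_processes` arguments are stand-ins for whatever really mails a newsletter and counts processes.

```python
import time

MAX_PROCESSES = 50  # limit seen in the sample output
PAUSE_SECONDS = 3   # assumed length of the "Sleep 1..2..3" pause


def wait_for_capacity(count_processes, limit=MAX_PROCESSES,
                      pause=PAUSE_SECONDS, sleep=time.sleep):
    """Sleep until count_processes() is at or below `limit`.

    Returns how many times the loop had to pause.
    """
    retries = 0
    while count_processes() > limit:
        retries += 1
        sleep(pause)  # put the brakes on, then look at the count again
    return retries


def send_all(addresses, send, count_processes, **throttle_options):
    """Send to each address, throttling before every single send."""
    for address in addresses:
        wait_for_capacity(count_processes, **throttle_options)
        send(address)


if __name__ == "__main__":
    # Simulated process counts so the sketch runs without a mail server.
    readings = iter([49, 51, 39, 41])
    send_all(
        ["first@example.com", "second@example.com", "third@example.com"],
        send=lambda address: print("sent to", address),
        count_processes=lambda: next(readings),
        pause=0,
    )
```

The important design choice is that the count is checked again after every pause, so the program never resumes on a guess; it resumes only when the server actually has room.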

Going a step further: Super Computing

With this type of setup, I am now able to split my subscriber database into two, three, or four separate lists and send out the newsletter in parallel (two or more lists at the same time) without worrying about whether the mail server is going to run out of processes. This is also referred to as concurrent processing, or parallel processing (one big job broken into "mini jobs" that all work on the same task at once).
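Here is a rough Python sketch of the splitting step, assuming the subscriber list is already loaded into memory. The helper names `split_list` and `send_in_parallel` are hypothetical, and threads are used purely for illustration; the same effect could come from simply starting the mailing program several times, once per list file.

```python
from concurrent.futures import ThreadPoolExecutor


def split_list(addresses, parts):
    """Split `addresses` into `parts` slices of nearly equal size."""
    size, extra = divmod(len(addresses), parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` slices each take one leftover address.
        end = start + size + (1 if i < extra else 0)
        chunks.append(addresses[start:end])
        start = end
    return chunks


def send_in_parallel(addresses, parts, send_chunk):
    """Run send_chunk on every slice at the same time, one worker per slice.

    Returns the result of each send_chunk call, in slice order.
    """
    chunks = split_list(addresses, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return list(pool.map(send_chunk, chunks))


if __name__ == "__main__":
    subscribers = ["reader%d@example.com" % n for n in range(10)]
    # len() stands in for a real function that mails a whole slice.
    print(send_in_parallel(subscribers, 3, len))
```

Each worker would still run its own throttle check before every send, which is what keeps the combined process count under control.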

Cool, huh?
