
Exactly how precious are those CPU cycles? I mean, really. Can you put a dollar figure on them?

And then contrast that with the dollar figure for consultant / employee / remote hands time to figure out WTF went wrong?

There are numerous systems for managing services: monit is the best known; mon and several proprietary systems also exist. Nagios can tell you whether a service is running or not (though it doesn't handle the start/stop logic).

These are small details and extensions on top of the existing SysV init foundation.

Ubuntu's boot time is already down to 8.6 seconds -- a restore from suspend is barely less than that (and restore from disk is considerably longer), though both restores preserve user state. You know, what applications / files you had open, and what was in them when you left off, positions of windows on your desktop. All that jazz. http://www.jamesward.com/2010/09/08/ubuntu-10-10-boots-in-8-...

The socket management is kind of nifty, but doesn't add a whole lot that xinetd didn't already offer (systemd does allow multi-socket services and d-bus-initiated services). I'm not convinced these couldn't be hacked into xinetd while preserving the simplicity and stability of init.

My desktop state (and its preservation) is worth a lot more than fast boot.

Yes. I've heard of inane gratuitous questions. As I said: if you're forcing average users to reboot with any frequency, you're Doing It Wrong.



No, monit doesn't manage services. Monit tries to follow clues you've given it about what's running, it polls them once in a while, and if something appears to be not running (as measured by the instructions you've given it), it runs the one-liner you've given it that should start the thing up again.
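
To make the polling model concrete, here is a minimal Python sketch of one cycle of that loop. It is not monit's actual implementation; the helper names (pid_from_file, looks_alive, poll_once) and the pid-file convention are assumptions for illustration only.

```python
import os
import subprocess
import sys


def pid_from_file(path):
    """Return the pid recorded in a pid file, or None if missing or garbled."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def looks_alive(pid):
    """Best-effort liveness test: signal 0 checks existence, delivers nothing.

    A stale pid file can name an unrelated process that reused the pid,
    so True only means "something with this pid exists".
    """
    if pid is None:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def poll_once(pid_path, start_command):
    """One polling cycle: if the service seems down, run the restart one-liner."""
    if looks_alive(pid_from_file(pid_path)):
        return "running"
    subprocess.run(start_command, check=False)
    return "restarted"


if __name__ == "__main__":
    # Hypothetical pid file path and a do-nothing stand-in for the start command.
    print(poll_once("/nonexistent/demo.pid", [sys.executable, "-c", "pass"]))
```

Everything here is inference from clues: between polls the service can be dead and nobody knows, and a recycled pid makes a dead service look alive.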

Monit does a thing that approximates managing a process, for certain values of "approximates", "managing", and "process". Supervisory process management is one of Linux's absolute weakest points. I cut my teeth on fault-tolerant HA minicomputers, and it pains me to think that 30 years later, we still don't have a way to say "make sure apache is always running. period."

As a great blog pointed out, there is exactly one process that KNOWS when a service has stopped running, and it doesn't need .pid files or polling or anything else to tell it: process 1.
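
The same principle holds for any parent process, not just process 1. A minimal Python sketch, assuming a hypothetical supervise helper and a throwaway child standing in for a real daemon: the parent blocks in wait() and the kernel reports the exit directly.

```python
import subprocess
import sys


def supervise(argv, max_restarts=3):
    """Run argv, restarting it each time it exits, up to max_restarts times.

    Returns the list of exit codes seen. The parent blocks in wait(), so
    the kernel tells it when the child dies: no pid file, no polling.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):  # first run plus max_restarts restarts
        child = subprocess.Popen(argv)
        exit_codes.append(child.wait())  # blocks until the child exits
    return exit_codes


if __name__ == "__main__":
    # Hypothetical stand-in for a daemon: a child that exits at once with status 3.
    demo = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(demo, max_restarts=2))  # prints [3, 3, 3]
```

A real supervisor would loop forever and back off between restarts; the cap here only keeps the sketch finite.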

I'm not a systemd advocate - I don't know enough about it, and we're using Ubuntu so I'll end up learning upstart anyway - but read this, it's way more eloquent than I can be:

http://dustin.github.com/2010/02/28/running-processes.html


Fair points. And thanks, by the way, for actually advancing the discussion.

Init can and does manage processes. Somewhat crudely, mostly via the 'respawn' directive. One thing it isn't particularly good at is telling if a process is doing something useful (say, serving out web pages successfully), but it will let you know that it's running. There was a semi-popular hack some years back to run sshd out of init (via respawn) to ensure you always had an SSH daemon on your box (Dustin mentions this). The downside is that while it will ensure sshd is running, it doesn't give you much flexibility over the process (you've got to edit inittab and 'init q' to make changes).

What monit and kin can do, above and beyond process-level monitoring, is check that the service attributes of a process are sane. That a webserver, say, kicks out a 200 OK response rather than a 4## or 5## error, and restart the service if this isn't the case. Checking for correct operation can be more useful than simply verifying a process is running (though going too far overboard in defining "correctness" can also cause problems).
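
A service-level check of that kind might look roughly like this in Python, using only the standard library. The function names and the choice to count only 2xx as healthy are my assumptions, not how monit defines it.

```python
import urllib.error
import urllib.request


def is_healthy(status):
    """Count 2xx as healthy; 4xx, 5xx, or no response at all (None) as not."""
    return status is not None and 200 <= status < 300


def probe(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with a 4xx/5xx
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...
```

A monitor would call is_healthy(probe(url)) on a schedule and restart or alert when it returns False. Note that a running process can fail this check, which is exactly the gap between "is running" and "is working".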

For realtime/HA tools, attacking things on the single-system level is probably the wrong way to roll. You want a load balancer in front of multiple hosts with response detection -- is host A still up or not? Whether or not this ties into mitigation (restart) or alerting (notifications to staff) is another matter.

There are also places other than init you can watch things from. /proc contains within it multitudes, including a lot of interesting/useful process state. Daemons can be written with control/monitoring sockets instrumented directly into themselves. Debuggers, strace, ltrace, dtrace, and systemtap all provide resolution inside a running process/thread. Creating something sane, effective, efficient, and sufficient out of all these tools ... interesting problem.
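
As a small example of the /proc angle, here is a Python sketch that pulls the state letter out of /proc/<pid>/stat. The function names are made up for illustration, and process_state only works on Linux.

```python
def parse_proc_stat(line):
    """Parse a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the last ')' rather than on whitespace.
    """
    head, _, tail = line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    return int(pid_text), comm, tail.split()[0]


def process_state(pid):
    """Return the state letter (R, S, D, Z, ...) for pid; Linux only."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())[2]
```

A Z (zombie) or a long-lived D (uninterruptible sleep) is the sort of thing a plain "is the pid there?" check never notices.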


>Ubuntu's boot time is already down to 8.6 seconds

Well, Ubuntu doesn't use sysvinit either.



