guest - flak
Flaakle

rough idling

Computers are kind of fast these days, which means a lot of little inefficiencies vanish from sight. In theory an idle system with nothing to do should be doing nothing, but frequently enough it turns out to be doing nothing. A couple examples.

In the early days of smtpd, the master process would wake up once each second to essentially say “Greetings, friend!” to each of its subprocesses, even if nothing had happened. This was holdover from early in development, but since it wasn’t burning an extravagent amount of CPU, it went unnoticed for some time.

More recently, just before 5.8, sndiod was fixed to use a variable, possibly infinite timeout. Previously, it would sleep a few milliseconds, wake up and scan an empty list of clients, then go back to sleep. But it was already calling poll to watch for changes. It wasn’t possible for anything to have changed during the timeout interval.

Just after 5.8, lots of efficiency fixes were added to tmux. The status bar on the bottom refreshes itself periodically with the names of running commands, the time, etc. However, the server process does this even if no clients are attached to see the changes. And some of the refresh operations are surprisingly expensive.

I (and a few others) happened to notice these watching top after a week of uptime. Even though each process was recorded as using 0% CPU and top reported 100% idle, over the course of a week enough time accumulated to observe the problem. In my case, I was monitoring CPU usage on my ERL when I observed sndiod showing up near the top. 0% usage, but many, many seconds of runtime. On a device that doesn’t even have sound. What was it doing? ktrace was very useful in determining what was happening and confirming that these otherwise idle seeming processes were actually quite busy.

One process left: pflogd idles kind of roughly because it uses the bpf timeout mechanism to accumulate lots of packets in each request, but if there aren’t any packets it still gets spurious wakeups. Maybe somebody will fix that, too.

Sometimes several small inefficiencies combine. Firefox (or perhaps gtk) will call XGetImage whenever the cursor moves over its window. The X server in turn attempts to efficiently size a buffer by repeatedly calling realloc to grow and shrink it. Until libc realloc was fixed this resulted in many repeated mmap/munmap calls. Tons of syscalls, page faults, page zeroing, the whole works. Hard to exactly pinpoint the blame here. It was definitely a bug in realloc, but it would have been less of an issue if various upper layers didn’t assume the operations they were performing were free. Everything has a cost. For more behind the scenes fun, try browser ktrace browsing.

And of course, I’m guilty of creating unnecessarily expensive programs, too. Running dwm, I have a small script to update the status bar with the current time, battery status, etc.

function acstat {
        if [ X`apm -a` = X1 ] ; then echo X ; else echo O ; fi
}
function printstat {
        echo `acstat` `sysctl -n hw.cpuspeed` "batt:" `apm -l` " " `date +"%a %b %d %I:%M"`
}
while : ; do
        xsetroot -name "`printstat`"
        sleep 2
done

In hindsight, it’s obvious that running apm twice in the same script to save a bit of parsing is inefficient. But the -a and -l option are there precisely to allow for easy scripting! And apm runs so fast. It prints its output and exits before you can even blink. I hadn’t considered the full impact of this tiny script until, searching for more rough idlers, I ran ktrace -i -d -p 1 for a minute (then ktrace -C to stop) and looked at the output. Very little happening, except for gobs and gobs of apm starting up, talking to apmd, over and over. Even just running sleep makes hundreds of syscalls per exec.

It’s been a bit more work, but I’ve replaced the script above with a single long running program, cutting the number of syscalls per update down tremendously. Minus some boilerplate, now it look like:

local O_RDONLY = 0

local fd = ffi.C.open("/dev/apm", O_RDONLY)

local APM_IOC_GETPOWER = nx._IOR('A', 3, ffi.sizeof("struct apm_power_info"))
local info = ffi.new("struct apm_power_info[1]")

local disp = x11.OpenDisplay(nil)
local scr = x11.DefaultScreen(disp)
local rwin = x11.RootWindow(disp, scr)

while true do
        nx.ioctl(fd, APM_IOC_GETPOWER, info)
        local ac = info[0].ac_state == 1 and "X" or "O"
        local batt = info[0].battery_life
        local perf = info[0].cpuspeed
        local dt = os.date("%a %b %d %I:%M")
        local msg = string.format("%s %d batt: %d   %s", ac, perf, batt, dt)
        x11.StoreName(disp, rwin, msg)
        x11.Flush(disp)
        nx.msleep(2000)
end

ktrace says only six syscalls per loop. ioctl, gettimeofday, poll, writev, recvmsg, nanosleep.

For another investigation entirely, see Power Wastage On An Idle Laptop.

Posted 2015-09-21 16:48:21 by tedu Updated: 2016-09-17 16:38:39
Tagged: openbsd programming software