2004.05.15
Optimizing ppp and Code Quality
The Problem
While debugging a problem of my ppp connection I noticed that ppp was apparently doing a protocol lookup (with a file open, read, close sequence) for every packet it read. This is an excerpt from the strace log, one of my favourite debugging tools.
open("/etc/protocols", O_RDONLY) = 1
fstat(1, {st_mode=S_IFREG|0644, st_size=5752, ...}) = 0
read(1, "#\n# Internet protocols\n#\n# $Free"..., 8192) = 5752
close(1) = 0
The Cause
Wanting to correct this source of inefficiency I started scanning the FreeBSD ppp source code looking for the culprit. To locate the offending getproto* call I put logging statements near all such calls, and a couple of hours later I located the problem. The FilterCheck routine, called on the arrival of every packet, contained the following sequence, close to its beginning.int
FilterCheck(const unsigned char *packet, u_int32_t family,
const struct filter *filter, unsigned *psecs)
{
char [...] prototxt[16];
[...]
if ((pe = getprotobynumber(cproto)) == NULL)
snprintf(prototxt, sizeof prototxt, "%d", cproto);
else
snprintf(prototxt, sizeof prototxt, "%s", pe->p_name);
if (log_IsKept(LogDEBUG)) {
[...]
log_Printf(LogDEBUG, " Filter: proto = %s, %s\n", prototxt, dbuff);
[...]
if (log_IsKept(LogFILTER)) {
snprintf(dstip, sizeof dstip, "%s", ncpaddr_ntoa(&dstaddr));
log_Printf(LogFILTER, "%sbound rule = %d accept %s "
"src = %s:%d dst = %s:%d\n", filter->name, n, prototxt,
ncpaddr_ntoa(&srcaddr), sport, dstip, dport);
[...]
if (log_IsKept(LogFILTER)) {
snprintf(dstip, sizeof dstip, "%s", ncpaddr_ntoa(&dstaddr));
log_Printf(LogFILTER,
"%sbound rule = %d deny %s src = %s/%d dst = %s/%d\n",
filter->name, n, prototxt,
ncpaddr_ntoa(&srcaddr), sport, dstip, dport);
Solution One
The conditions under which prototxt was used were filter logging and debugging options that are typically not enabled in a production environment. So my first try was to guard the getprotobynum call with a similar conditional: if (log_IsKept(LogDEBUG) || log_IsKept(LogFILTER)) {
if ((pe = getprotobynumber(cproto)) == NULL)
snprintf(prototxt, sizeof prototxt, "%d", cproto);
else
snprintf(prototxt, sizeof prototxt, "%s", pe->p_name);
char *prototxt, protobuff[16];
[...]
if (log_IsKept(LogDEBUG) || log_IsKept(LogFILTER)) {
if ((pe = getprotobynumber(cproto)) == NULL)
snprintf(protobuff, sizeof protobuff, "%d", cproto);
else
snprintf(protobuff, sizeof protobuff, "%s", pe->p_name);
prototxt = protobuff;
} else
prototxt = "[Logging conditionals have changed."
"Fix the conditional where this string appears.]";
A Better Approach
If this had been my private code, I would have probably left it at that. However, I was going to commit this change to the FreeBSD source code base, and it was going to be viewed by talented and technically expert fellow committers whose opinion I respect and value. Was this really the best solution I could come up with? I dismissed my first ideas for a more resilient approach as both inelegant and inefficient. The ideas were to either calculate prototxt every time it was needed (inefficient, since it was used multiple times), or initialize it to NULL and calculate it every time it was needed if it was not already initialized (inelegant). I finally realized that a call to a separate function maintaining a cached version of the result could be used to achieve the same goal./*
* Return a text string representing the cproto protocol number.
*
* The purpose of this routine is calculate this result, for
* the many times it is needed in FilterCheck, only on demand
* (i.e. when the corresponding logging functions are invoked).
*
* This optimization saves, over the previous implementation, which
* calculated prototxt at the beginning of FilterCheck, an
* open/read/close system call sequence per packet.
*
* The caching performed here is just a side effect.
*/
static const char *
prototxt(int cproto)
{
static int oproto = 6;
static char protobuff[16] = "tcp";
struct protoent *pe;
if (cproto == oproto)
return protobuff;
if ((pe = getprotobynumber(cproto)) == NULL)
snprintf(protobuff, sizeof protobuff, "%d", cproto);
else
snprintf(protobuff, sizeof protobuff, "%s", pe->p_name);
oproto = cproto;
return (protobuff);
}
Cache Initialization
My only remaining reservation concerned the initialization of the cache. A random packet could have any value in its protocol field, so I could not use a special value like -1 to initialize protobuff on demand. I felt uncomfortable with using the number and name of a known protocol as initialization values, because they created an additional maintenance dependency on the file where those values were stored (/etc/protocols). Although a change of the TCP protocol number is extremely unlikely, I could well imagine another committer coming up with a scenario where this might be possible (e.g. rebuilding the internet after a worm-caused complete meltdown). Some more thinking drove me to the solution: use a number not likely to be associated with a protocol (-1) and initialize protobuff with its string representation. static int oproto = -1;
static char protobuff[16] = "-1";
Committing
I was now ready to commit. Before committing the change I tested it with filter logging turned on and off and also carefully measured the performance difference in a way that could be repeated by others. Again, I do not think I would have bothered with detailed testing and measuring, if this was my private code. This is the commit message:dds 2004/05/13 02:03:00 PDT FreeBSD src repository Modified files: usr.sbin/ppp ip.c Log: Make getprotobynumber() calls in FilterCheck conditional on the log levels by which they are used. On a typical production setting (no debug or filter logging) this will save an open/read/close system call sequence per packet, approximately halving the system overhead and reducing the overall overhead by 38%. dd bs=1k count=512 if=/usr/share/dict/web2 | ssh ppp-linked-host dd of=/dev/null # time original-ppp -nat -foreground connection Working in foreground mode Using interface: tun0 2.822u 2.404s 2:00.31 4.3% 392+496k 8+18io 3pf+0w # time new-ppp -nat -foreground connection Working in foreground mode Using interface: tun0 2.082u 1.173s 1:26.06 3.7% 379+450k 0+18io 0pf+0w MFC after: 3 weeks Revision Changes Path 1.101 +43 -13 src/usr.sbin/ppp/ip.c