
Re: [JDEV] Minor 'bug'let ?



Jeremie Miller wrote:

> > Secondly, given that the active_fd_set is (re)built each time; it
> > is worthwhile to keep a maxfd; rather than FD_SETSIZE; at least
> > on FreeBSD that makes a speed difference.
>
> I'm no god on this select stuff, so could you explain what you mean?  I'm
> not sure what the FD_SETSIZE really does I guess.

I'll give it a go and send you a diff. All it does is this: select() walks the
(largish) bit array sequentially, from bit 0 to bit n-1, where n is its first
argument, so passing max+1 instead of FD_SETSIZE makes it stop at the highest
descriptor actually in use. On a desktop machine FD_SETSIZE is typically so
low (256 or so) that you do not incur much overhead; but on a server it can be
set (usually at kernel level) quite high. We use values upward of 64000 for
certain applications. As most (BSDish) kernels seem to expect low values, we
find that keeping the value low is worth the expense of tracking a max,
performance-wise.
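
Until I get the diff out, here is a minimal sketch of the idea. It is not
taken from socket.c; the names build_read_set and wait_readable are made up
for illustration, and it assumes the descriptors live in a plain int array.
The point is only that the highest fd is tracked while the set is rebuilt,
and that max+1 is what gets handed to select().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Rebuild the read set from fds[0..n-1] and return the highest
 * descriptor seen, or -1 if the list is empty. (Hypothetical helper.) */
int build_read_set(const int *fds, size_t n, fd_set *set)
{
    int maxfd = -1;
    size_t i;

    FD_ZERO(set);
    for (i = 0; i < n; i++) {
        /* FD_SET is only defined for 0 <= fd < FD_SETSIZE. */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd;
}

/* Wait up to timeout_ms for any descriptor to become readable.
 * Returns whatever select() returns: the number of ready descriptors,
 * 0 on timeout, -1 on error. (Hypothetical helper.) */
int wait_readable(const int *fds, size_t n, fd_set *ready, int timeout_ms)
{
    struct timeval tv;
    int maxfd = build_read_set(fds, n, ready);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* maxfd + 1 rather than FD_SETSIZE: select() only looks at
     * descriptors 0..maxfd. */
    return select(maxfd + 1, ready, NULL, NULL, &tv);
}
```

The extra cost is one comparison per descriptor while the set is being
rebuilt anyway, which is why I think it pays off when FD_SETSIZE is large.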

> > Thirdly setting TCP_NODELAY and allowing for port REUSE might be
> > nice.
> >         if( (setsockopt(c->id,SOL_SOCKET,SO_REUSEADDR,
> >                         (const char *)&one,sizeof(one))) <0)
> This one is already in the socket.c file.

Hmm, must have missed that. Sorry.

> >         if( (setsockopt(c->id,IPPROTO_TCP,TCP_NODELAY,
> >                         (const void *)&one,sizeof(one))) <0)
>
> What does TCP_NODELAY exactly do?  I've read up on some of this stuff, but
> it's not always clear.

Well, basically it is a _hint_ to the kernel to send things right away, i.e.
not to wait for any more data so it can fill an MTU/MSS nicely. The neat way
of using it is with near-atomic writes, for example using an iovec if you have
to assemble a parcel of data from multiple places. Then, as soon as you have
done the write/send, you have hinted to the kernel that it is OK to send it.
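
A rough sketch of what I mean, under a few assumptions: set_nodelay and
send_parcel are names I made up for this mail, the parcel is just a header
and a body held as C strings in two separate buffers, and error handling is
left to the caller. It only shows the pattern of setting the option once and
then handing the kernel the whole parcel in a single writev() call.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Ask the stack not to hold back small segments on this TCP socket.
 * Returns 0 on success, -1 on error with errno set by setsockopt().
 * (Hypothetical helper.) */
int set_nodelay(int fd)
{
    int one = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                      (const void *)&one, sizeof(one));
}

/* Send a header and a body that live in separate buffers with one
 * writev() call, so the kernel sees the parcel as a single write.
 * Returns the number of bytes written, or -1 on error.
 * (Hypothetical helper.) */
ssize_t send_parcel(int fd, const char *hdr, const char *body)
{
    struct iovec iov[2];

    assert(hdr != NULL && body != NULL);

    iov[0].iov_base = (void *)hdr;   /* writev() does not modify it */
    iov[0].iov_len  = strlen(hdr);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    return writev(fd, iov, 2);
}
```

Without the iovec you would either copy both pieces into one buffer first or
do two small writes, and with TCP_NODELAY set the two small writes are likely
to go out as two small segments.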

I believe the origin lies with the first telnet applications, where they
wanted individual keystrokes to go across as soon as possible, rather than
waiting for 1500 of them or so to be concatenated before sending them.

It is a hint, nothing more, and may be ignored, as you might notice with the
delay at the beginning (I tried using the protocol for something called PLOP,
which needs very fast status updates over long-haul/long round-trip-time
links). This is an issue you run into with BSD stacks: you might fall victim
to having two segments at the start of the connection if your packet is
between 101 and 208 bytes, and hence you get a slow start (i.e. a full RTT
timeout extra between the first and second packet). But that is arguably a
kernel bug or an engineering compromise.

Forgive me for nattering.

Dw.