Wednesday, February 18, 2009

reconnect

A few days ago a bug was reported: our XML-RPC client, when running over SSL, would complain that "no data is available" while trying to read the HTTP header from the server. Since it was a P2-critical bug, I started debugging it immediately and spent a lot of time on it.

GDB showed me that BIO_read() returned zero when reading the HTTP header, and its manual page says:
A 0 or -1 return is not necessarily an indication of an error. In
particular when the source/sink is non-blocking or of a certain type it
may merely be an indication that no data is currently available and
that the application should retry the operation later.
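
The decision the manual describes can be sketched as a small pure function. This is only an illustration, not the code from our client: the enum and the function name are made up for this post, and the caller is assumed to pass in the return value of BIO_read() together with the result of BIO_should_retry() taken right after that call.

```c
/* Possible interpretations of one BIO_read() call (names are hypothetical). */
enum read_outcome {
    READ_GOT_DATA,        /* n > 0: n bytes were read */
    READ_TRY_AGAIN,       /* n <= 0, but the BIO asks the caller to retry */
    READ_CLOSED_OR_ERROR  /* n <= 0 and no retry requested: EOF or a real error */
};

/*
 * n            : value returned by BIO_read()
 * should_retry : value of BIO_should_retry(bio) taken right after that call
 */
enum read_outcome classify_bio_read(int n, int should_retry)
{
    if (n > 0)
        return READ_GOT_DATA;
    if (should_retry)
        return READ_TRY_AGAIN;
    return READ_CLOSED_OR_ERROR;
}
```

A zero return with no retry requested is the case that matters here: it is not "no data yet", it is the end of the connection.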

In the end, though, I found that when we switched from HTTPS to HTTP, the client worked just fine. Later it was confirmed that the SSL handling on the server side had a race condition, and the connection had already been shut down by the time the client tried to read.

To make the client more robust, I worked out a patch that reconnects. The result is rather exciting: I made thousands of XML-RPC calls over a persistent connection, and everything worked perfectly.
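
The idea of reconnecting can be sketched roughly as follows. This is not the actual patch: the transport struct, its callbacks, and the fake server are all hypothetical names invented for the example, and it assumes that a call returns 0 when the peer has closed the connection.

```c
#include <stddef.h>

/* Hypothetical transport interface; none of these names come from the real patch. */
struct transport {
    int  (*do_connect)(void *ctx);                 /* 0 on success */
    int  (*do_call)(void *ctx, const char *req);   /* >0 ok, 0 peer closed, <0 error */
    void (*do_close)(void *ctx);
    void *ctx;
};

/*
 * Send one request. If the peer has closed the connection, close our side,
 * reconnect and retry, at most max_reconnects times.
 * Returns the result of the call, or -1 if reconnecting did not help.
 */
int call_with_reconnect(struct transport *t, const char *req, int max_reconnects)
{
    int attempt;

    for (attempt = 0; attempt <= max_reconnects; attempt++) {
        int rc = t->do_call(t->ctx, req);
        if (rc != 0)
            return rc;              /* success (>0) or hard error (<0) */

        t->do_close(t->ctx);        /* rc == 0: the peer closed the connection */
        if (attempt == max_reconnects)
            break;
        if (t->do_connect(t->ctx) != 0)
            return -1;
    }
    return -1;
}

/* Fake server for demonstration: drops the connection after calls_per_conn calls. */
struct fake_server {
    int connected;
    int calls_left;
    int calls_per_conn;
    int connects;       /* how many times do_connect was invoked */
};

int fake_connect(void *ctx)
{
    struct fake_server *s = ctx;
    s->connected = 1;
    s->calls_left = s->calls_per_conn;
    s->connects++;
    return 0;
}

int fake_call(void *ctx, const char *req)
{
    struct fake_server *s = ctx;
    (void)req;
    if (!s->connected || s->calls_left == 0)
        return 0;       /* behaves like a read that returns EOF */
    s->calls_left--;
    return 1;
}

void fake_close(void *ctx)
{
    struct fake_server *s = ctx;
    s->connected = 0;
}
```

One caveat with this approach: a blind retry is only safe if the request is idempotent, or if it is certain that the server never processed the first attempt.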

Finally, I am glad to share that section 6.3 of "UNIX Network Programming, Volume 1" lists the conditions under which a descriptor passed to select() is ready for reading, including:

b. The read half of the connection is closed (i.e., a TCP connection that has received a FIN). A read operation on the socket will not block and will return 0 (i.e., EOF).

In fact, our underlying socket is handled by select(), and I checked that BIO_get_close() really returns 1 when the connection is closed by the server.
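
The condition from the book can be checked at the plain socket level with a sketch like the one below. The function name peer_has_closed is made up for this post, and it looks only at the TCP level, so it says nothing about data buffered inside the SSL layer. It polls with select() and then peeks with recv(), so any pending data stays in the socket buffer.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if the peer has closed the connection (select() reports the
 * descriptor readable and a peek reads 0 bytes), 0 if the connection is
 * idle or has data pending, and -1 on error.
 */
int peer_has_closed(int fd)
{
    fd_set rfds;
    struct timeval tv = {0, 0};     /* zero timeout: poll without blocking */
    char c;
    ssize_t n;
    int ready;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;                   /* not readable: open and idle */

    n = recv(fd, &c, 1, MSG_PEEK);  /* peek so pending data is not consumed */
    if (n == 0)
        return 1;                   /* readable, but read returns 0: EOF */
    if (n < 0)
        return -1;
    return 0;                       /* real data is waiting */
}
```

A client could call something like this before reusing a persistent connection and reconnect when it returns 1.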
