
Spipe doesn't close when (established) connection times out (due to sleeping, network interruption, etc)



I am trying to use spiped for triggering updates on git repositories,
so the clients will pull the changes when they occur on the server. I
own/control the server and clients, and they are running Linux (the
server has an ARM processor). I have posted some of the code for this at
https://gist.github.com/4107319 .

The server starts a spiped daemon that redirects an outward facing
port (encrypted) to an inward facing port (decrypted), and then it
uses socat to spawn a program when a client connects. The program
(tail -f -n 1 /srv/vcvm/ctl/repo-updates) sends the path of a changed
repo to the connected client whenever a repo changes (each repo appends
its path to that file when it is updated).

The client connects to the server notification daemon by using spipe,
and then pipes the output of spipe to the monitor-sync function, which
has an infinite read loop in it. When the server daemon is killed, the
socket is closed and the client side program exits properly (since
spipe closes). However, when the client is disconnected (wifi turned
off, network cable unplugged, client put to sleep, etc.) for a very
long time (e.g. overnight) and the network is then connected again,
the client side daemon no longer receives updates, yet spipe keeps
running as if it were still connected. With short disconnections
(tested up to the order of minutes, though I suspect the limit may be
a few hours) the client receives any updates that occurred during the
disconnection, and the notifications work properly.

The typical use of my netbook will trigger this bug in my code, since
I keep it asleep most of the time, sometimes even for days at a time,
and may not actually shut the netbook down for weeks at a time.

Alternative solutions:
I could use read timeouts in the monitor-sync function; however, that
seems a bit kludgy and would possibly add polling overhead.
I could try changing kernel settings. I think the relevant one would
be net.ipv4.tcp_keepalive_time, but I am unsure whether this will
help.
I have modified the spiped code to drop the connection if the outgoing
connection is disconnected, instead of only when both are disconnected
(see commit
https://github.com/brainwater/spiped/commit/fdee65943ad6c2b4a79b685895fe612709ddc184
), and have had mixed results in testing it. I have done little testing
so far. When I used gdb to put a breakpoint in the
callback_pipestatus function and then disconnected the network cable,
it hit the breakpoint a few hours afterwards. When I ran my modified
code it also seemed to exit properly, though that was over the course
of a few hours that were hazy due to having a few drinks (don't drink
and debug). When I tried it again, it didn't seem to work properly.
Also, I don't have a good idea of what that part of the code is
supposed to do, and am concerned about improper behavior from the
server due to the change I made.
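
On the kernel-settings idea above: net.ipv4.tcp_keepalive_time only
applies to sockets that have SO_KEEPALIVE enabled, so changing the
sysctl helps only if the program owning the socket sets that option.
Whether spipe does so is not established here. A minimal Python sketch
of turning keepalive on per socket on Linux follows; the function name
enable_keepalive and the timing values are illustrative assumptions,
not anything taken from spiped.

```python
import socket

def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Enable TCP keepalive on sock (Linux-specific option names).

    idle_s:     seconds of silence before the first probe is sent
    interval_s: seconds between unanswered probes
    probes:     unanswered probes before the kernel drops the connection
    All three defaults are hypothetical example values.
    """
    # Without SO_KEEPALIVE the kernel never sends probes on this socket,
    # regardless of the net.ipv4.tcp_keepalive_* sysctls.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket overrides of the system-wide sysctl defaults.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE))
    s.close()
```

With values like these, a dead peer would be noticed after roughly
idle_s + interval_s * probes seconds of silence, after which reads on
the socket fail instead of blocking forever.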

There is a good chance I will use the timeouts with a connection
check; however, I will also do more testing with the spiped change
that I made.
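
The timeout-with-connection-check approach could look roughly like the
sketch below. It is in Python rather than shell purely for
illustration, and read_lines_with_timeout, on_line and on_idle are
hypothetical names; the actual connection check is left to the caller.
It waits on a file descriptor (for example the read end of a pipe from
spipe) with select, so the process sleeps in the kernel between
timeouts rather than busy polling.

```python
import os
import select

def read_lines_with_timeout(fd, timeout_s, on_line, on_idle):
    """Read newline-terminated lines from file descriptor fd.

    on_line(text) is called for every complete line.
    on_idle() is called when timeout_s seconds pass with no data; it
    should perform the connection check and return True to keep
    waiting or False to give up.
    Returns "closed" on end of file, "idle" if on_idle() gave up.
    """
    buf = b""
    while True:
        # Block until fd is readable or the timeout expires.
        ready, _, _ = select.select([fd], [], [], timeout_s)
        if not ready:
            if not on_idle():
                return "idle"
            continue
        chunk = os.read(fd, 4096)
        if not chunk:
            # Empty read means the writer closed its end.
            return "closed"
        buf += chunk
        # Deliver every complete line; keep any partial line buffered.
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            on_line(line.decode())
```

When this returns "idle", the caller could kill the stale spipe
process and reconnect; when it returns "closed", spipe has already
exited on its own.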

If you have any suggestions for me, or further description of the
conditions when the spipe and/or spiped daemon shuts down or triggers
disconnections, they would be appreciated.

--Blake Rainwater