I have some test software running on a single computer. As time goes by the software becomes more reliable. However, to debug some of the errors I need to keep the server running, and when it fails I need it powered off and on again so that testing can restart.
This is much like the situation of a router that hangs so that you lose your connection to the internet. That case can be addressed with a time switch set to power off the router for 1 minute every 24 hours, but a server under test might run for several days or weeks before it hangs, so a fixed schedule is a poor fit.
To fix this I used an ESP8266 connected to a 433 MHz transmitter, which controls a remote control power socket.
A short script, run every 10 minutes, tests that the service is up and, if it has failed, cycles the power.
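One way to get the 10 minute schedule is a cron entry. This is only a sketch: it assumes the script is saved as /root/LoftRestart.sh and is executable, which is a hypothetical path chosen to match the log file location, not something fixed by the setup.

```
# Edit root's crontab with: crontab -e
# Run the check every 10 minutes (script path is an assumption).
*/10 * * * * /root/LoftRestart.sh
```

A full power cycle takes about 5 minutes (45 s off plus a 240 s wait), so it should finish before the next 10 minute run starts.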
The first few lines of the script show some commands that can be used for testing, such as calling a web service with wget or an ICMP check with ping.
The test command (wget in this case) is issued and then we test its return code. If there is no service we restart the server.
Used with ping, this code can detect all kinds of failure, such as DNS, WiFi or broadband / internet outages, and it can also be used in some home automation setups.
# Alternative probes, e.g. the web remote control interface to check Kodi is running:
# wget --tries=4 --timeout=7 -O/dev/null http://192.168.0.143/
# ping -c 7 192.168.0.143
# wget --tries=4 --timeout=7 -O/dev/null http://192.168.0.143:9091

# Probe the Transmission web interface; a non-zero return code means no service.
wget --tries=4 --timeout=7 -O/dev/null http://192.168.0.143:9091
if [ $? -ne 0 ]; then
    echo "$(date '+%d/%m/%Y %H:%M:%S') wget to Loft RP Transmission failed" >> /root/LoftRestart.log
    echo "$(date '+%d/%m/%Y %H:%M:%S') Power off Loft RP" >> /root/LoftRestart.log
    # Ask the ESP8266 to switch the socket off. wget reports on stderr, so capture that too.
    wget -O/dev/null http://192.168.0.237/LOFTOFF >> /root/LoftRestart.log 2>&1
    echo "$(date '+%d/%m/%Y %H:%M:%S') Sleep 45s" >> /root/LoftRestart.log
    sleep 45
    echo "$(date '+%d/%m/%Y %H:%M:%S') Power on Loft RP" >> /root/LoftRestart.log
    # Ask the ESP8266 to switch the socket back on.
    wget -O/dev/null http://192.168.0.237/LOFTON >> /root/LoftRestart.log 2>&1
    echo "$(date '+%d/%m/%Y %H:%M:%S') Sleep 240s.." >> /root/LoftRestart.log
    echo "$(date '+%d/%m/%Y %H:%M:%S') Sleep 240s.."
    # Give the server time to boot before probing again.
    sleep 240
    echo "$(date '+%d/%m/%Y %H:%M:%S') Retrying Loft Transmission.." >> /root/LoftRestart.log
    wget --tries=4 --timeout=7 -O/dev/null http://192.168.0.143:9091
    if [ $? -ne 0 ]; then
        echo "$(date '+%d/%m/%Y %H:%M:%S') RP not restarted???" >> /root/LoftRestart.log
    else
        echo "$(date '+%d/%m/%Y %H:%M:%S') Restart worked ok." >> /root/LoftRestart.log
    fi
else
    echo "$(date '+%d/%m/%Y %H:%M:%S')-RP in Loft Transmission Web Service running..." >> /root/LoftRestart.log
fi
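The same idea can be pointed at the network itself rather than a server. The sketch below is not part of my setup; it is a rough illustration of how ping could be used to tell a WiFi, broadband or DNS failure apart by probing outwards one layer at a time. The gateway address, the public IP and the host name are all assumptions and need changing for your own network, and the function names probe and diagnose are made up for this example.

```sh
#!/bin/sh
# All three targets are assumptions; override them from the environment.
GATEWAY="${GATEWAY:-192.168.0.1}"     # assumed address of the local router
PUBLIC_IP="${PUBLIC_IP:-8.8.8.8}"     # a public IP that normally answers ping
DNS_NAME="${DNS_NAME:-example.com}"   # any name that should resolve

# probe HOST: succeeds if HOST answers at least one of 3 pings.
probe() {
    ping -c 3 "$1" > /dev/null 2>&1
}

# diagnose: print the first layer that fails, working outwards.
diagnose() {
    if ! probe "$GATEWAY"; then
        echo "wifi-or-lan"
    elif ! probe "$PUBLIC_IP"; then
        echo "broadband"
    elif ! probe "$DNS_NAME"; then
        # The public IP answered, so a failure by name is most likely DNS,
        # although a host that ignores ping would look the same.
        echo "dns"
    else
        echo "ok"
    fi
}
```

Calling diagnose prints one of wifi-or-lan, broadband, dns or ok. A script could log that result and only cycle the router's power socket when the answer is broadband, in the same way the script above cycles the server.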