Fix for Ethernet Connection Drop on Raspberry Pi

Submitted by Sam Hobbs on


The Problem

Today an engineer from BT was fiddling with the junction box outside my house, and my modem dropped connection to my router. At the time, the router did not automatically force reconnect (my fault, I hadn’t configured it to do so). When I noticed what had happened, I reconnected to the modem. So far so good. A couple of hours later, I noticed that two of my three Pi (all of which are connected with ethernet cables) had not reconnected to the router. The one that did reconnect is running Raspbmc (XBMC port to Raspberry Pi); the two that did not are running Apache with some bits on top (a mail server, owncloud, and wordpress for this website!). This is a pain because not only did it take the services offline, but I was unable to SSH to the Pi to correct the problem. Removing and reconnecting the ethernet cables did not work, so in the end I had to pull the power and reboot.

Existing Partial Solutions

I found a thread on the RasPi forums about reconnecting WiFi with a BASH script after a drop. My guess is that Raspbmc includes something similar, but for ethernet, which is why it reconnects and the other two Pi do not:

#!/bin/bash

while true ; do
   if ifconfig wlan0 | grep -q "inet addr:" ; then
      sleep 60
   else
      echo "Network connection down! Attempting reconnection."
      ifup --force wlan0
      sleep 10
   fi
done

The directions on the thread indicated that this script could be run in the background (i.e. sudo network-manager.sh &), and added to the end of /etc/rc.local so that it runs when the system is first booted. Clearly, this solution needed modification to work for ethernet instead of WiFi, but I was also concerned that running a script all the time in the background could be an unnecessary drain on the Pi’s resources. One more thing that I wanted the script to do was log what had happened while the Pi was disconnected, and when it reconnected. These factors led me to create a new script for my particular problem.

My Solution: New BASH Script

This section presents my script and explains how to use it.

Saving a copy of the script

If you don’t already have a subfolder inside your user’s home folder for scripts, create one now:

mkdir ~/bin

Now copy this script into your favourite text editor, and save it as network-monitor.sh inside that folder.

#!/bin/bash

LOGFILE=/home/admin/network-monitor.log

if ifconfig eth0 | grep -q "inet addr:" ;
then
        echo "$(date "+%m %d %Y %T") : Ethernet OK" >> $LOGFILE
else
        echo "$(date "+%m %d %Y %T") : Ethernet connection down! Attempting reconnection." >> $LOGFILE
        ifup --force eth0
        OUT=$? #save exit status of last command to decide what to do next
        if [ $OUT -eq 0 ] ; then
                STATE=$(ifconfig eth0 | grep "inet addr:")
                echo "$(date "+%m %d %Y %T") : Network connection reset. Current state is" $STATE >> $LOGFILE
        else
                echo "$(date "+%m %d %Y %T") : Failed to reset ethernet connection" >> $LOGFILE
        fi
fi

i.e.

nano ~/bin/network-monitor.sh

…then copy & paste, save and exit (Ctrl + X, press Y when prompted to save, then Enter). Make sure the long lines don’t get truncated when you copy and paste them over, or the script won’t work! Note that line 3 of the script points to a file that will be used as a log. My username is “admin”; if yours is “pi” then change that line to LOGFILE=/home/pi/network-monitor.log, or replace “pi” with any other username.
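The script repeats the same date call on every log line. As a rough sketch, a small helper could keep the timestamp format in one place. The name log_msg and the default log path are assumptions for illustration and are not part of the script above.

```bash
#!/bin/bash
# Sketch only: log_msg is a hypothetical helper, not part of the original script.
# The default path is an assumption; set LOGFILE before calling to override it.
LOGFILE="${LOGFILE:-$HOME/network-monitor.log}"

# Append one line to $LOGFILE in the "MM DD YYYY HH:MM:SS : message" format
# that the script above uses.
log_msg() {
    printf '%s : %s\n' "$(date '+%m %d %Y %T')" "$*" >> "$LOGFILE"
}
```

With this in place, a line such as log_msg "Ethernet OK" would replace each echo ... >> $LOGFILE line.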

Adding ~/bin to your $PATH variable, and making the script executable

Your $PATH variable is a list of places where the shell looks for executables. We need to add the newly created ~/bin to it so that you don't have to use the full path to the script when you run it. To do this, open ~/.bashrc and add this line to the end:

PATH=$PATH:~/bin

The shell normally reads ~/.bashrc when you log in, but we can tell it to parse the file now using this command so that the changes take effect:

source ~/.bashrc

Now we need to make the script executable, e.g.:

chmod +x ~/bin/network-monitor.sh

Initial test

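You should now be able to run the script using sudo (i.e. sudo network-monitor.sh). If sudo reports that the command cannot be found, the likely reason is that sudo is often configured to use its own restricted PATH, which does not include ~/bin; in that case give the full path instead (e.g. sudo ~/bin/network-monitor.sh, or sudo bash ~/bin/network-monitor.sh, which some readers in the comments needed). Check the log file with this command: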

less ~/network-monitor.log

You should see a date and time stamp from when you ran the script, followed by “Ethernet OK”. If this is what you see, all is well. Press q to quit “less” and return to the command prompt when you are done.

Automating with cron

Cron is a tool that can run scripts at regular intervals, and is well suited to this kind of thing. Luckily, it’s really easy to get started with. Open the cron configuration file with root privileges:

sudo nano /etc/crontab

Now schedule the script to be run every 5 minutes (or any other interval that you would prefer), by adding this last line to the end of the file (remember to change “admin” if you have a different username):

# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# m h dom mon dow user  command
17 *    * * *   root    cd / && run-parts --report /etc/cron.hourly
25 6    * * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.d$
47 6    * * 7   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.w$
52 6    1 * *   root    test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.m$
#

*/5 * * * * root bash /home/admin/bin/network-monitor.sh

If you want the script to run every 10 minutes (or any other multiple of 1 minute), change the first part of the line, e.g. to “*/10”. It’s important to set the script to run as root because the ifup command requires superuser privileges. Make a cup of tea and come back. Check the log file again; you should see more entries with time stamps that are 5 minutes (or whatever you specified) apart.

Full Test

Now is the time to test your script to see if it will reconnect you when the ethernet connection goes down. I’d recommend that you have physical access to the Pi when you test this for the first time, just in case you have to pull the power and reboot (very unlikely if everything has worked so far). Make sure that your Cron interval isn’t set too high or you’ll have to wait ages (5 minutes is about right for testing). Connect to the Pi via SSH, and issue this command (after you issue it, your SSH session will become unresponsive, but that’s kind of the point!):

sudo ifdown eth0

Wait for at least the amount of time you specified between cron jobs, and then log in again. You may actually still be logged in – if you didn’t try to get the Pi to do anything while the ethernet was down, your SSH session may not have registered the connection drop. Check the log file. You should see something like this:

11 10 2013 16:50:01 : Ethernet OK
11 10 2013 16:55:02 : Ethernet OK
11 10 2013 17:00:01 : Ethernet OK
11 10 2013 17:05:01 : Ethernet connection down! Attempting reconnection.
11 10 2013 17:05:01 : Network connection reset. Current state is inet addr:192.168.1.103 Bcast:192.168.1.255 Mask:255.255.255.0
11 10 2013 17:10:01 : Ethernet OK

Success! Your Pi can now successfully reconnect to the router. If the router itself was down for a period of time, then you would see something like this in the logs, with the script attempting to reconnect and failing until the router was back up:

11 10 2013 16:50:01 : Ethernet OK
11 10 2013 16:55:02 : Ethernet OK
11 10 2013 17:00:01 : Ethernet OK
11 10 2013 17:05:01 : Ethernet connection down! Attempting reconnection.
11 10 2013 17:05:01 : Failed to reset ethernet connection
11 10 2013 17:10:01 : Ethernet connection down! Attempting reconnection.
11 10 2013 17:10:01 : Failed to reset ethernet connection
11 10 2013 17:15:01 : Ethernet connection down! Attempting reconnection.
11 10 2013 17:15:01 : Failed to reset ethernet connection
11 10 2013 17:20:01 : Ethernet connection down! Attempting reconnection.
11 10 2013 17:20:01 : Network connection reset. Current state is inet addr:192.168.1.103 Bcast:192.168.1.255 Mask:255.255.255.0
11 10 2013 17:25:01 : Ethernet OK
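When the log gets long, a quick count of drops and failed attempts can be easier to read than the raw file. The following is a minimal sketch that simply counts the two messages the script writes; summarise_log is a hypothetical helper name.

```bash
#!/bin/bash
# Sketch only: summarise_log is a hypothetical helper name.
# It counts lines by matching the exact messages the monitor script writes.
summarise_log() {
    local logfile=$1
    local down failed
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    down=$(grep -c 'Ethernet connection down!' "$logfile" || true)
    failed=$(grep -c 'Failed to reset ethernet connection' "$logfile" || true)
    echo "drops detected: $down, failed reconnections: $failed"
}
```

For example, summarise_log ~/network-monitor.log prints a single summary line for the whole file.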

Future Improvements

I may do some further work on the script to combine all “Ethernet OK” lines into a single line containing two time stamps, between which the connection was fine, e.g.:

11 10 2013 16:35:02 to 11 10 2013 17:00:01 : Ethernet OK
11 10 2013 17:05:01 : Ethernet connection down! Attempting reconnection.
11 10 2013 17:05:01 : Network connection reset. Current state is inet addr:192.168.1.103 Bcast:192.168.1.255 Mask:255.255.255.0

For now, I’m just glad that it works! If you think you can improve the script, please let me know, I’d love to hear your suggestions!
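One possible way to get the combined format without touching the monitor script is to post-process the log when reading it. This is only a sketch of that idea, not the improved script itself: collapse_ok is a hypothetical name, and it assumes the time stamp is always the first four space-separated fields, as in the logs shown above.

```bash
#!/bin/bash
# Sketch only: collapse_ok is a hypothetical helper that rewrites the log on
# output; it does not change what the monitor script writes.
# Usage: collapse_ok ~/network-monitor.log   (or pipe the log in on stdin)
collapse_ok() {
    awk '
        # Print the pending run of "Ethernet OK" lines, if any, as one line.
        function emit_run() {
            if (first == "") return
            if (first == last) print first " : Ethernet OK"
            else print first " to " last " : Ethernet OK"
            first = ""
            last = ""
        }
        / : Ethernet OK$/ {
            # The time stamp is assumed to be the first four fields.
            stamp = $1 " " $2 " " $3 " " $4
            if (first == "") first = stamp
            last = stamp
            next
        }
        { emit_run(); print }
        END { emit_run() }
    ' "$@"
}
```

A run containing a single “Ethernet OK” line is printed unchanged; longer runs become one line with the first and last time stamps.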

Comments

Hi Sam!

When I try to "sudo /home/andreas/bin/network-monitor.sh", it says that the command can't be found.
Even tried cd inside bin and "sudo network-monitor.sh" but it's still the same.. Any clues?

Thanks

I have to run it like this below. A little strange that the shebang ("crunchbang") doesn't work:

*/5 * * * * root bash /home/admin/bin/network-monitor.sh
sudo bash /home/admin/bin/network-monitor.sh

07 16 2014 13:58:01 : Ethernet OK
07 16 2014 13:59:01 : Ethernet OK
07 16 2014 14:00:01 : Ethernet OK
07 16 2014 14:01:01 : Ethernet OK
07 16 2014 14:02:01 : Ethernet OK
(END)

OK, it's payback time. Since I used the script above as a starting point, I felt it only fair that I show what I came up with. I needed a way to check if the network was up very frequently from Python. So this meant that I really needed to suppress the multiple "Ethernet OK" messages. I use the existence of a file, called tc, as a 'true/false' bit that determines if the "Ethernet OK" message should be suppressed. Here is the command in Python that calls the bash shell script. I call it like once every second.

subprocess.check_output(['nm-john.sh'])

Here is the shell script file: nm-john.sh

#!/bin/bash
LOGFILE=/home/pi/msg-log/log1 ## This is where the messages are stored

if ifconfig eth0 | grep -q "inet addr:"
then
    if [ -e tc ] ## Check for the existence of a file called 'tc'; if it is not there, leave the script
    then
        rm tc ## If the file is there, remove it to prevent more "up" messages
        echo "$(date "+%m %d %Y %T") : Ethernet OK" >> $LOGFILE
    fi
else
    touch tc ## Generate the file tc to prevent multiple "Ethernet OK" messages
    echo "$(date "+%m %d %Y %T") : Ethernet connection down! Attempting reconnection." >> $LOGFILE
    ifup --force eth0
    OUT=$? # Save exit status of last command to decide what to do next
    if [ $OUT -eq 0 ] # The test brackets are required; without them bash tries to run $OUT as a command
    then
        STATE=$(ifconfig eth0 | grep "inet addr:")
        echo "$(date "+%m %d %Y %T") : Network connection reset. Current state is" $STATE >> $LOGFILE
    else
        echo "$(date "+%m %d %Y %T") : Failed to reset ethernet connection" >> $LOGFILE
    fi
fi

I hope this will pay it forward to someone else down the line needing to keep their Ethernet up. - Teratech

Hello,

I tried your script and copied it into the .sh file as mentioned. When I tried to execute the script, it gave an error

"syntax error near unexpected token 'fi'",

where this 'fi' is the last "fi" of the script closing the first if statement. Is there a problem with the script? I have made sure the lines are not truncated and everything is as it is from the example. I am not familiar with scripting so not sure what is causing this error.

Thanks.

Hi Jason, I actually haven't run this script in a long time, but I don't think there was a problem with it. Are you sure you copied it exactly? Sam
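The thread doesn't establish what caused this error. As a general way to narrow down problems like it, the sketch below checks a copied script for two common culprits: Windows-style line endings picked up during copy and paste (an assumption, not a diagnosis of Jason's case) and plain syntax errors, using bash -n, which parses a script without running it. check_script is a hypothetical helper name.

```bash
#!/bin/bash
# Sketch only: check_script is a hypothetical helper name.
# Usage: check_script ~/bin/network-monitor.sh
check_script() {
    local script=$1
    # Carriage returns usually mean the file was saved with Windows line endings.
    if grep -q $'\r' "$script"; then
        echo "carriage returns found"
        return 1
    fi
    # bash -n parses the script without running any of its commands.
    if ! bash -n "$script" 2>/dev/null; then
        echo "syntax error"
        return 1
    fi
    echo "ok"
}
```

Running bash -n on the file directly (without the redirect) also prints the line number of the first syntax error.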

For me the script always showed the connection as down. It was a translation problem. I manually checked ifconfig and I noticed I had to change inet addr to inet adr in the script (only one d in french).

Steeve

That's a useful comment, thanks :) i hadn't thought about the commands / output being different in different languages! Sam
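A possible way to make the check independent of the system language is to ask for untranslated output and match the address itself. The sketch below assumes that setting LC_ALL=C makes ifconfig print its English output, and it also accepts the "inet 1.2.3.4" layout used by newer ifconfig versions; has_ipv4 is a hypothetical helper name.

```bash
#!/bin/bash
# Sketch only: has_ipv4 is a hypothetical helper name.
# Reads ifconfig-style text on stdin and succeeds if it contains an IPv4
# address in the English layout ("inet addr:1.2.3.4" or "inet 1.2.3.4").
has_ipv4() {
    grep -Eq 'inet (addr:)?[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+'
}

# Intended use (assumes LC_ALL=C gives untranslated ifconfig output):
#   if LC_ALL=C ifconfig eth0 | has_ipv4; then echo "Ethernet OK"; fi
```

The pattern deliberately does not match IPv6 lines, which begin with "inet6".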

madhatter

Fri, 04/03/2015 - 12:44

Hi,

I have followed the tutorial and when I run it with the SUDO command it outputs to the log but otherwise there is no entry and when I run PS AXG it is not showing in the running programmes list.

Using it on PI2 which loses network at least every couple of days.

hi Sam,

Future Improvements
I may do some further work on the script to combine all “Ethernet OK” lines into a single line containing two time stamps, between which the connection was fine

have you changed your script for this feature?
if yes, please tell me the code! ;)

best wishes,
Mulch

Hello!

It seems this script has become quite relevant lately, since the Raspberry Pi 2 has been having some network drop issues for some people. I had a lot of trouble trying to make it work correctly (the cron instance always came out with "connection down"), but after I managed to set up an MTA to see the script's output I found the problem: "ifconfig: command not found". Weird, but after some googling I learned that cron executes in a minimal environment, so PATH may not include directories such as /sbin. At least, that's the case in Raspbian. So after using the full paths '/sbin/ifconfig' and '/sbin/ifup' it worked alright.

I figured this might be useful to someone else.
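A small sketch of the same idea: instead of hard-coding /sbin/ifconfig, the script could look the tools up in a fixed list of system directories so that it does not depend on cron's PATH. The name find_tool and the directory list are assumptions for illustration.

```bash
#!/bin/bash
# Sketch only: find_tool is a hypothetical helper; the directory list is an
# assumption based on where Debian-style systems usually keep these tools.
find_tool() {
    local name=$1 dir
    for dir in /sbin /usr/sbin /bin /usr/bin; do
        if [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Intended use near the top of the monitor script:
#   IFCONFIG=$(find_tool ifconfig) || exit 1
#   IFUP=$(find_tool ifup) || exit 1
```

A simpler alternative is to set PATH explicitly at the top of the script, at the cost of trusting whatever happens to be in those directories.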

Hi Sam,

This worked perfectly "out of the box"!!

Like Andreas, I can't issue sudo network-monitor.sh without bash after sudo. (works well for my user pi without bash, only with sudo I need bash).

I'm glad I've found your page! Every time the circuit breaker fires (so, both router and RPi2 go down and then up at the same time), my RPi2 fails to get connectivity, forever! This will fix it! :)

Many thanks!!

Cheers,

felgy

Hi, guys.
I've RPi 2 and run Openelec on it.
The RPi 2 is permanently powered by (pay attention) 1A iPhone PSU and in use from time to time.
Faced same issue: Dropped network after some time when the system is idle.

After a couple of days of fight, inet surfing, checking voltage and replacing of power cable,
I found a simple, from my point of view, "solution for dummies".
I installed Transmission.
From this time still connected.
I suppose, Transmission makes something similar to proposed by author and holds an Ethernet alive.

Hi Sam
I used your script. tested it and all worked fine from the instructions. In actual use though I found one issue. Where I live the power goes out every so often. When it comes back up the Pi is quicker to start than the router and your script seems to get stuck. I tried putting in a 3min sleep period, but that didn't seem to work.
What I'm looking for is additional script that will try a reconnection and then go back to line one of the script. I stink at programming so I realize that what I'm asking is probably very noob.

Hi Kevin, Can you expand on what made you think it was "stuck"? The script should just check every 5 minutes (each time it is run by Cron) and write a failure line to the log each time it can't connect. Writing a sleep command into the script after checking the connection wouldn't do anything (apart from make the script take longer to finish) unless it was inside a loop, because the next time a check happens is when cron calls it again anyway, not when the script exits. Sam

After writing to you I rethought about the problem.
I'm using the Pi as a VPN gateway. Because the Pi boots faster than the router can start up and make its connection to the internet, there is a chance that the VPN daemon fails on start up. Therefore when I try to connect remotely, after say a power loss, I can't.
I'll first check to see if I can ssh onto the Pi from within the network. If that's successful I'll just add a line to your code to restart the VPN service after it re-establishes connection... If I can't ssh in, then...?
The idea for the sleep command was to try to make sure that when it was trying to establish a connection I had given enough time for the router to finish its start up. I have it checking every 10 minutes. Knowing it'll most likely fail on a first attempt I cut the wait period down by 7 minutes.
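As Sam notes above, a sleep only helps if it sits inside a loop. A minimal sketch of such a loop follows: it retries a command a fixed number of times with a pause between attempts. The name retry and the example values are assumptions for illustration.

```bash
#!/bin/bash
# Sketch only: retry is a hypothetical helper name.
# Usage: retry ATTEMPTS DELAY_SECONDS command [args...]
retry() {
    local attempts=$1 delay=$2
    shift 2
    local i
    for (( i = 1; i <= attempts; i++ )); do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final attempt.
        if (( i < attempts )); then
            sleep "$delay"
        fi
    done
    return 1
}

# Intended use inside the monitor script (values are illustrative):
#   retry 3 60 ifup --force eth0
```

The total time spent retrying should stay below the cron interval, otherwise a second copy of the script may start while the first is still waiting.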

It was indeed the case that the VPN server also failed at start up after a power loss. I'm assuming for the moment that it's an issue of the Pi starting up quicker than the router.
To solve the problem I added an "If" statement right after your "Then". The statement pings the VPN on 10.8.0.1. If null then it returns "Ethernet and VPN OK" to the Logfile. If something other than null then it writes to the Logfile "Ethernet OK - VPN Down", sends a VPN restart command and sends "Restarting VPN" to the Logfile.
In the case of the Ethernet being down I added the VPN restart command and an appropriate message to the Logfile after the Logfile message for restarting the Ethernet.
In cron, on a prior tweaking adventure, I added a cron job to run every hour to clear the Logfile. I did this figuring that if something was wrong I'd only need to see the immediate slice of time anyway, and since the objective is "Set it and Forget it" I now don't need to worry that Logfiles are getting too large.
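A gentler variant of the hourly clean-up is to keep the most recent lines rather than emptying the file. The sketch below shows one way to do that, assuming a hypothetical helper called trim_log; the number of lines to keep is up to you.

```bash
#!/bin/bash
# Sketch only: trim_log is a hypothetical helper name.
# Usage: trim_log LOGFILE LINES_TO_KEEP
trim_log() {
    local logfile=$1 keep=$2 tmp
    # Nothing to do if the log does not exist yet.
    [ -f "$logfile" ] || return 0
    tmp=$(mktemp) || return 1
    # Write the tail to a temporary file first, then copy it back in place
    # so the log keeps its original owner and permissions.
    tail -n "$keep" "$logfile" > "$tmp" && cat "$tmp" > "$logfile"
    rm -f "$tmp"
}
```

Called from the hourly cron job as, say, trim_log /home/admin/network-monitor.log 100, it would leave the latest 100 entries available for troubleshooting.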
