I need help with two scripts I'm trying to make as one. There are two different ways to detect if there are issues with a bad NFS mount. One is if there is an issue, doing a df
will hang and the other is the df
works but there is are other issues with the mount which a find (mount name) -type -d
will catch.
I'm trying to combine the scripts to catch both issues to where it runs the find type -d
and if there is an issue, return an error. If the second NFS issue occurs and the find hangs, kill the find command after 2 seconds; run the second part of the script and if the NFS issue is occurring, then return an error. If neither type of NFS issue is occurring then return an OK.
MOUNTS="egrep -v '(^#)' /etc/fstab | grep nfs | awk '{print $2}'"
MOUNT_EXCLUDE=()
if [[ -z "${NFSdir}" ]] ; then
echo "Please define a mount point to be checked"
exit 3
fi
if [[ ! -d "${NFSdir}" ]] ; then
echo "NFS CRITICAL: mount point ${NFSdir} status: stale"
exit 2
fi
cat > "/tmp/.nfs" << EOF
#!/bin/sh
cd \$1 || { exit 2; }
exit 0;
EOF
chmod +x /tmp/.nfs
for i in ${NFSdir}; do
CHECK="ps -ef | grep "/tmp/.nfs $i" | grep -v grep | wc -l"
if [ $CHECK -gt 0 ]; then
echo "NFS CRITICAL : Stale NFS mount point $i"
exit $STATE_CRITICAL;
else
echo "NFS OK : NFS mount point $i status: healthy"
exit $STATE_OK;
fi开发者_如何学JAVA
done
The MOUNTS and MOUNT_EXCLUDE lines are immaterial to this script as shown.
You've not clearly identified where ${NFSdir}
is being set.
The first part of the script assumes ${NFSdir}
contains a single directory value; the second part (the loop) assumes it may contain several values. Maybe this doesn't matter since the loop unconditionally exits the script on the first iteration, but it isn't the clear, clean way to write it.
You create the script /tmp/.nfs
but:
- You don't execute it.
- You don't delete it.
- You don't allow for multiple concurrent executions of this script by making a per-process file name (such as
/tmp/.nfs.$$
). - It is not clear why you hide the script in the
/tmp
directory with the.
prefix to the name. It probably isn't a good idea.
Use:
tmpcmd=${TMPDIR:-/tmp}/nfs.$$
trap "rm -f $tmpcmd; exit 1" 0 1 2 3 13 15
...rest of script - modified to use the generated script...
rm -f $tmpcmd
trap 0
This gives you the maximum chance of cleaning up the temporary script.
There is no df
left in the script, whereas the question implies there should be one. You should also look into the timeout
command (though commands hung because NFS is not responding are generally very difficult to kill).
精彩评论