Differences

This shows you the differences between two versions of the page.

howto:mutex [2012/08/14 07:29]
ormaaj [Choose the locking method]
howto:mutex [2015/08/08 15:22] (current)
bill_thomson [Lock your script (against parallel run)]
Line 1: Line 1:
-====== Lock your script (against parallel run) ======
+====== Lock your script (against parallel execution) ======

 {{keywords>bash shell scripting mutex locking run-control}}
Line 5: Line 5:
 ===== Why lock? =====

-Sometimes there's the need to ensure that a script is only executed one time. Imagine some cronjob to do something very important, which will fail or corrupt data if it accidently runs twice. In these cases, a form of ''MUTEX'' (**mutual exclusion**) is needed.
+Sometimes there's a need to ensure only one copy of a script runs, i.e. to prevent two or more copies running simultaneously. Imagine a cronjob doing something very important, which will fail or corrupt data if two copies of the called program were to run at the same time. To prevent this, a form of ''MUTEX'' (**mutual exclusion**) lock is needed.

-The basic procedure is simple: The script checks out if a specific condition (locking) is present at startup, if yes, it's locked - it doesn't start.
+The basic procedure is simple: The script checks if a specific condition (locking) is present at startup; if yes, it's locked - the script doesn't start.

-This article describes the locking with common UNIX(r) tools. There are various other special locking tools outside, of course. But they're not standardized, or better: You can't be sure that they're present where you want to run your scripts. **Of course, a tool designed for exactly this purpose does the job much better than all general code in here.**
+This article describes locking with common UNIX(r) tools. There are other special locking tools available, but they're not standardized, or worse yet, you can't be sure they're present when you want to run your scripts. **A tool designed specifically for this purpose does the job much better than general-purpose code.**
  
 ==== Other, special locking tools ====

-As told above, a special tool for locking is the 100% solution. You don't have race conditions, you don't need to work around specific limits, and all those issues.
+As mentioned above, a special tool for locking is the preferred solution. Race conditions are avoided, as is the need to work around specific limits.

   * ''flock'': http://www.kernel.org/pub/software/utils/script/flock/
Line 21: Line 21:
  
 The best way to set a global lock condition is the UNIX(r) filesystem. Variables aren't enough, as each process has its own private variable space, but the filesystem is global to all processes (yes, I know about chroots, namespaces, ... special case).
-You can "set" several things in filesystem that can be used as locking indicator:
+You can "set" several things in the filesystem that can be used as a locking indicator:

   * create files
Line 28: Line 28:
  
  
-To create a file or set a file timestamp, usually the command touch is used. That implies the following problem:
-A locking mechanism would check the existance of the lockfile, if it doesn't exist, it would create one (lock) and continue. These are **two steps**! That means, it's **not one atomic operation**. There's a small amount of time between checking and creating, where another instance of the same script could perform locking (because when it checked, the lockfile wasn't there)! In that case you would have 2 instances of the script running, both think they succesfully locked, and both think they can operate without collisions.
-Setting the timestamp would be similar: One step to check the timespamp, a second step to set the timestamp.
+To create a file or set a file timestamp, usually the command touch is used. This implies the following problem:
+A locking mechanism checks for the existence of the lockfile; if no lockfile exists, it creates one and continues. Those are **two separate steps**! That means it's **not an atomic operation**. There's a small amount of time between checking and creating, where another instance of the same script could perform locking (because when it checked, the lockfile wasn't there)! In that case you would have 2 instances of the script running, both thinking they have successfully locked and can operate without colliding.
+Setting the timestamp is similar: One step to check the timestamp, a second step to set the timestamp.
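
The race described above can be reproduced deterministically in a single script. The following is only a minimal sketch and not part of the original article: the lock path inside a ''mktemp -d'' directory and all variable names are made up for illustration, and the two "instances" are simulated by performing both checks before either ''touch''.

```shell
#!/bin/bash
# Sketch of the racy check-then-create pattern. Do NOT use this for real locking.
# Assumption: the lock path and variable names are hypothetical.
workdir="$(mktemp -d)"
lockfile="${workdir}/mylock"

# Step 1: instance A and instance B both check before either one creates the file.
a_sees_lock=no; [ -e "$lockfile" ] && a_sees_lock=yes
b_sees_lock=no; [ -e "$lockfile" ] && b_sees_lock=yes

# Step 2: each instance that saw no lock creates one.
# touch succeeds even if the file already exists, so nobody notices the collision.
acquired=0
if [ "$a_sees_lock" = no ]; then touch "$lockfile" && acquired=$((acquired + 1)); fi
if [ "$b_sees_lock" = no ]; then touch "$lockfile" && acquired=$((acquired + 1)); fi

echo "instances that believe they hold the lock: ${acquired}"
rm -rf "$workdir"
```

Both simulated instances pass the check, so the counter ends at 2: each one believes it holds the lock.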
  
 <WRAP center round tip 60%>
Line 36: Line 36:
 </WRAP>
  
-A simple way to get that is to create a **lock directory** - the mkdir command. It will:
+A simple way to get that is to create a **lock directory** - with the mkdir command. It will:

-    * create a given directory only if it did not exist before, and set a successful exit code
+    * create a given directory only if it does not exist, and set a successful exit code
-    * it will set an unsuccessful exit code if an error occurs - for example, if the directory specified already exists
  
  
-With mkdir it seems, we have our two steps in one simple operation. A (very!) simple locking code might look like this now:
+With mkdir, it seems we have our two steps in one simple operation. A (very!) simple locking code might look like this:
 <code bash>
 if mkdir /var/lock/mylock; then
Line 53: Line 53:
 In case ''mkdir'' reports an error, the script will exit at this point - **the MUTEX did its job!**
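
To see the ''mkdir'' behaviour in isolation, here is a small self-contained sketch. It is an illustration under assumptions, not the article's script: the helper ''try_lock'' and the lock path under a ''mktemp -d'' directory are hypothetical names chosen so the example can run without write access to ''/var/lock''.

```shell
#!/bin/bash
# Sketch: mkdir as a combined check-and-create step.
# Assumption: try_lock and the temporary lock path are hypothetical.
workdir="$(mktemp -d)"
lockdir="${workdir}/mylock"

try_lock() {
    # mkdir creates the directory only if it does not exist yet and
    # returns a non-zero exit code otherwise.
    mkdir "$lockdir" 2>/dev/null
}

if try_lock; then first=locked; else first=refused; fi
if try_lock; then second=locked; else second=refused; fi
echo "first attempt: ${first}, second attempt: ${second}"

# Releasing the lock means removing the directory again.
rmdir "$lockdir"
if try_lock; then third=locked; else third=refused; fi
echo "after release: ${third}"

rm -rf "$workdir"
```

The second attempt is refused because the directory already exists; after ''rmdir'' the lock can be taken again.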
  
-//In case the directory is removed after setting a successful lock while the script is still running, the lock is lost.Doing chmod -w for parent directory containing the lock directory can be done but it is also not atomic.Maybe a while loop checking continously for the existence of the lock in background and sending a signal such as USR1 if the directory is found non-existent can be done.The signal would need to be trapped.I am sure there would be a better solution than this suggestion// --- //[[sunny_delhi18@yahoo.com|sn18]] 2009/12/19 08:24//
+//If the directory is removed after setting a successful lock, while the script is still running, the lock is lost. Doing chmod -w for the parent directory containing the lock directory can be done, but it is not atomic. Maybe a while loop checking continuously for the existence of the lock in the background and sending a signal such as USR1, if the directory is not found, can be done. The signal would need to be trapped. I am sure there is a better solution than this suggestion// --- //[[sunny_delhi18@yahoo.com|sn18]] 2009/12/19 08:24//

-**Note:** On my way through the Internet I found some people wondering if the ''mkdir'' way will work "on all filesystems". Well, let's say it should. The syscall under ''mkdir'' is guarenteed to work atomic in all cases, at least on Unices. A problem can be a shared filesystem on NFS or a real cluster filesystem. There it depends on the mount options and the implementation. However, I successfully use this simple way on top of an Oracle OCFS2 filesystem in a 4-node cluster environment. So let's just say "it's expected to work under normal conditions".
+**Note:** While perusing the Internet, I found some people asking if the ''mkdir'' method works "on all filesystems". Well, let's say it should. The syscall under ''mkdir'' is guaranteed to work atomically in all cases, at least on Unices. Two possible problem cases are NFS filesystems and filesystems on cluster servers. In those two scenarios, the behaviour depends on the mount options and implementation. However, I successfully use this simple method on an Oracle OCFS2 filesystem in a 4-node cluster environment. So let's just say "it should work under normal conditions".

-Another atomic method is setting the ''noclobber'' shell option (''set -C''), which will cause redirection to fail if the file the redirection points to already exists (using diverse ''open()'' methods). This is also a very nice way, and I use this more simple locking method successfully in production, too. Need to write a code example here.
+Another atomic method is setting the ''noclobber'' shell option (''set -C''). That will cause redirection to fail if the file the redirection points to already exists (using diverse ''open()'' methods). Need to write a code example here.
  
 <code bash>
Line 72: Line 72:
 </code>
  
-Another explanation of this basic locking pattern using ''set -C'' can be found [[http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu_chap02.html#tag_23_02_07 | here]].
+Another explanation of this basic pattern using ''set -C'' can be found [[http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu_chap02.html#tag_23_02_07 | here]].
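
As a rough illustration of the ''noclobber'' approach, the sketch below takes a lock by redirecting into a file while ''set -C'' is active. It is an assumption-laden example rather than the article's own code: the helper ''try_lock'' and the lock path under a ''mktemp -d'' directory are hypothetical.

```shell
#!/bin/bash
# Sketch: lock file creation with the noclobber option (set -C).
# Assumption: try_lock and the temporary lock path are hypothetical.
workdir="$(mktemp -d)"
lockfile="${workdir}/mylock"

try_lock() {
    # Run in a subshell so noclobber does not leak into the rest of the script.
    # With set -C, the redirection fails if the file already exists.
    ( set -C; echo "$$" > "$lockfile" ) 2>/dev/null
}

if try_lock; then first=locked; else first=refused; fi
if try_lock; then second=locked; else second=refused; fi
echo "first attempt: ${first}, second attempt: ${second}"

# Releasing the lock means removing the file again.
rm -f "$lockfile"
if try_lock; then third=locked; else third=refused; fi
echo "after release: ${third}"

rm -rf "$workdir"
```

Writing the process ID into the file is optional here; it merely gives the lock file some content that identifies its owner.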
 ===== An example =====

-This code was taken from a script that controls PISG to create statistical pages from my IRC logfiles. It doesn't matter for you, I just note that to tell you that this code works and is used.
-There are some additional things compared to the very simple example above:
+This code was taken from a production-grade script that controls PISG to create statistical pages from my IRC logfiles.
+There are some differences compared to the very simple example above:

   * the locking stores the process ID of the locked instance
   * if a lock fails, the script tries to find out if the locked instance still is active (unreliable!)
-  * traps, to automatically remove the lock when the script terminates or is killed, are created
+  * traps are created to automatically remove the lock when the script terminates, or is killed


-I don't show various details - like determinating the signal by which the script was killed - here, I just show the most relevant code:
+Details on how the script is killed aren't given; only code relevant to the locking process is shown:
 <code bash>
 #!/bin/bash
Line 91: Line 91:
 PIDFILE="${LOCKDIR}/PID"

-# exit codes and text for them - additional features nobody needs :-)
+# exit codes and text
 ENO_SUCCESS=0; ETXT[0]="ENO_SUCCESS"
 ENO_GENERAL=1; ETXT[1]="ENO_GENERAL"
Line 112: Line 112:
           rm -rf "${LOCKDIR}"' 0
     echo "$$" >"${PIDFILE}"
-    # the following handler will exit the script on receiving these signals
+    # the following handler will exit the script upon receiving these signals
     # the trap on "0" (EXIT) from above will be triggered by this trap's "exit" command!
     trap 'echo "[statsgen] Killed by a signal." >&2
Line 120: Line 120:
 else

-    # lock failed, now check if the other PID is alive
+    # lock failed, check if the other PID is alive
     OTHERPID="$(cat "${PIDFILE}")"

-    # if cat wasn't able to read the file anymore, another instance probably is
+    # if cat isn't able to read the file, another instance is probably
     # about to remove the lock -- exit, we're *still* locked
-    #  Thanks to Grzegorz Wierzowiecki for pointing this race condition out on
+    #  Thanks to Grzegorz Wierzowiecki for pointing out this race condition on
     #  http://wiki.grzegorz.wierzowiecki.pl/code:mutex-in-bash
     if [ $? != 0 ]; then
  • howto/mutex.1344929356.txt
  • Last modified: 2012/08/14 07:29
  • by ormaaj