
I have been trying to parallelize the following script, specifically the three iterations of the for loop, using GNU Parallel, but haven't been able to. The 4 commands inside the loop run in series, and each iteration takes around 10 minutes. It would be greatly appreciated if anyone could help me run this in parallel, using any tool available on Unix-based systems, to save time.

#!/bin/bash

kar='KAR5'
runList='run2 run3 run4'
mkdir normFunc
for run in $runList
do 
fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12 
fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12 
fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear

rm -f *.mat
done

3 Answers

up vote 5 down vote accepted

Why don't you just fork (aka. background) them?

foo () {
    local run=$1
    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12 
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12 
    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
}

for run in $runList; do foo "$run" & done

In case that's not clear, the significant part is here:

for run in $runList; do foo "$run" & done
                                   ^

The trailing & causes each call to the function to be executed in a forked subshell in the background, so the three runs proceed in parallel.

    
That worked like a charm. Thank you. Such a simple implementation (Makes me feel so stupid now!). –  Ravnoor S Gill Dec 5 '13 at 21:24
    
In case I had 8 files to run in parallel but only 4 cores, could that be integrated in such a setting or would that require a Job Scheduler? –  Ravnoor S Gill Dec 5 '13 at 21:27
    
It doesn't really matter in this context; it's normal for the system to have more active processes than cores. If you have many short tasks, ideally you would feed a queue serviced by a number of worker threads smaller than the number of cores. I don't know how often that is really done with shell scripting (in which case they wouldn't be threads, they'd be independent processes), but with relatively few long tasks it would be pointless. The OS scheduler will take care of them. –  goldilocks Dec 5 '13 at 21:50
for stuff in things
do
( something
  with
  stuff ) &
done
wait # for all the something with stuff

Whether it actually works depends on your commands; I'm not familiar with them. The rm *.mat looks a bit prone to conflicts if it runs in parallel...
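
A minimal sketch of how the cleanup conflict could be avoided, assuming each run only ever writes files that start with its own run name (as the .mat files in the question do). Here touch and sleep are stand-ins for the fsl5.0-* commands, and the scratch directory, the pids list and the failed counter are illustrative names, not part of the original script:

```shell
#!/bin/bash
# Sketch only: touch and sleep stand in for the fsl5.0-* calls.
workdir=$(mktemp -d)    # scratch directory so the demo touches nothing real
failed=0
pids=""

for run in run2 run3 run4; do
    (
        # The real commands would write these three matrices.
        touch "$workdir/$run.norm1.mat" "$workdir/$run.norm2.mat" "$workdir/$run.norm.mat"
        sleep 1
        # Remove only this run's matrices instead of the global rm -f *.mat.
        rm -f "$workdir/$run".*.mat
    ) &
    pids="$pids $!"    # remember the PID of the job just started
done

# Wait on each PID in turn so that a failing run is counted, not ignored.
for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
done

remaining=$(ls "$workdir" | wc -l)
echo "failed runs: $failed, leftover files: $remaining"
```

Waiting on individual PIDs rather than using a bare wait makes each job's exit status available, which is useful when a run takes ten minutes and a silent failure would be costly.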

    
This runs perfectly as well. You are right: I would have to change rm *.mat to something like rm "$run".*.mat so that one process does not interfere with another. Thank you. –  Ravnoor S Gill Dec 5 '13 at 21:38
    
@RavnoorSGill Welcome to Stack Exchange! If this answer solved your problem, please mark it as accepted by ticking the check mark next to it. –  Gilles Dec 5 '13 at 23:54
    
+1 for wait, which I forgot. –  goldilocks Dec 6 '13 at 12:13

It seems the fsl jobs within a run depend on each other, so the 4 jobs cannot all be run in parallel. The runs, however, can be run in parallel.

Make a bash function running a single run and run that function in parallel:

#!/bin/bash

myfunc() {
    local run=$1
    local kar='KAR5'
    mkdir -p normFunc    # -p: no error if another parallel job already created it
    fsl5.0-flirt -in $kar"deformed.nii.gz" -ref normtemp.nii.gz -omat $run".norm1.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12 
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref $kar"deformed.nii.gz" -omat $run".norm2.mat" -bins 256 -cost corratio -searchrx -90 90 -searchry -90 90 -searchrz -90 90 -dof 12 
    fsl5.0-convert_xfm -concat $run".norm1.mat" -omat $run".norm.mat" $run".norm2.mat"
    fsl5.0-flirt -in $run".poststats.nii.gz" -ref normtemp.nii.gz -out $PWD/normFunc/$run".norm.nii.gz" -applyxfm -init $run".norm.mat" -interp trilinear
}

export -f myfunc
parallel myfunc ::: run2 run3 run4

To learn more, watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1 and spend an hour walking through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html Your command line will love you for it.
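
On the follow-up about 8 inputs and only 4 cores: GNU Parallel by default runs roughly one job per CPU core, and its -j option sets the limit explicitly (for example, parallel -j 4 myfunc ::: followed by the run names). Where GNU Parallel is not installed, the -P option of GNU and BSD xargs gives a similar cap. The sketch below uses xargs rather than parallel, with a stand-in sleep in place of the real work; the eight run names and the scratch directory are hypothetical:

```shell
#!/bin/bash
# Sketch only: sleep stands in for the real per-run work.
workdir=$(mktemp -d)    # scratch directory for the demo's log files

# Eight inputs, but never more than four jobs running at the same time.
# Inside sh -c, $1 is the run name and $2 is the scratch directory.
printf '%s\n' run1 run2 run3 run4 run5 run6 run7 run8 |
    xargs -P 4 -I{} sh -c 'sleep 1; echo "done $1" > "$2/$1.log"' _ {} "$workdir"

# xargs returns only after every job has finished, so all logs exist here.
completed=$(ls "$workdir" | wc -l)
echo "completed runs: $completed"
```

As soon as one of the four running jobs finishes, the next input is started, so the cores stay busy without oversubscribing the machine.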
