
I created the script below, which takes the path of a single directory and replaces a search string in all the files within that directory. I would like to enhance it so that it can search and replace the string in multiple directories listed in an external input file.

External input file content:

/var/start/system1/dir1
/var/start/system2/dir2
/var/start/system3/dir3
/var/start/system4/dir4

Script with one directory:

filepath="/var/start/system/dir1"
searchstring="test"
replacestring="test01"

i=0; 

for file in $(grep -l -R $searchstring $filepath)
do
  cp $file $file.bak
  sed -e "s/$searchstring/$replacestring/ig" $file > tempfile.tmp
  mv tempfile.tmp $file

  let i++;

  echo "Modified: " $file
done
    
Your code makes the implicit assumption that none of the paths being searched contain spaces. Is this something that solutions can rely on? –  ToxicFrog May 28 at 3:16
    
The paths have no spaces. Example below: /var/start/system1/dir1 /var/start/system1/dir2 /var/start/system1/dir3 –  user68775 May 29 at 1:54

3 Answers

With GNU tools

< dir.list xargs -rd '\n' grep -rlZ -- "$searchstring" |
  xargs -r0 sed -i -e "s/$searchstring/$replacestring/ig" --

(Don't forget to quote your variables; leaving a variable unquoted invokes the split+glob operator.)
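
A small sketch of what "split+glob" means in practice, assuming a POSIX-style shell such as sh or bash with the default IFS (zsh does not split unquoted variables by default). The variable name path is only for illustration:

```shell
path='a file.txt'

# Unquoted: the value is split on whitespace, giving two words.
set -- $path
unquoted_count=$#

# Quoted: the value stays one word.
set -- "$path"
quoted_count=$#

echo "unquoted: $unquoted_count word(s), quoted: $quoted_count word(s)"
```

The same splitting happens to $file, $filepath and $searchstring in the script from the question whenever they are left unquoted.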

    
I tried the command below and it doesn't seem to work. Am I missing something here? </home/sap1/scrp/directory2 xargs -rd '\n' grep -rlZ -- "$AP44" | xargs -r0 sed -ie "s/$AP44/$Ap45/ig" –  user68775 May 29 at 1:51
    
@user68775, sorry, it should have been -i -e. -i takes an optional argument and here -ie was taken as e being the argument to -i. –  Stéphane Chazelas May 29 at 5:45
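
To see the difference described in that comment, here is a sketch assuming GNU sed. The scratch directory and the file names one and two are made up for the demonstration:

```shell
tmp=$(mktemp -d)
printf 'test\n' > "$tmp/one"
printf 'test\n' > "$tmp/two"

# -ie: "e" is read as the backup suffix for -i, so a copy named "onee" appears.
sed -ie 's/test/test01/' "$tmp/one"

# -i -e: in-place edit with no backup suffix; -e introduces the script.
sed -i -e 's/test/test01/' "$tmp/two"

ls "$tmp"
```

Both files end up edited, but only the -ie form leaves a stray backup behind.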

First of all, the tmpfile dance can be avoided by using sed -i with GNU sed or sed -i '' with FreeBSD's (in-place replacement).

grep -R can take multiple paths on the command line, so if you are confident that none of the paths contain spaces, you can replace $(grep -l -Re "$searchstring" "$filepath") with $(grep -l -R "$searchstring" $(cat path_list)).

This will fail if any of the paths contain spaces, tabs, or any globbing character, but so will the original.

A much more robust approach uses find and just applies sed to all of the files, trusting it not to modify files with no matches (here assuming a bash shell):

# Read newline-separated path list into array from file 'path_list'
IFS=$'\n' read -d '' -r -a paths < path_list

# Run sed on everything
find "${paths[@]}" -type f \
  -exec sed -i -r -e "s/$searchstring/$replacestring/ig" '{}' ';'

But this doesn't give you any feedback on which files it's modifying.

A lengthier version that does give you the feedback:

# Read newline-separated path list into array from file 'path_list'
IFS=$'\n' read -d '' -r -a paths < path_list

grep -l -R "$searchstring" "${paths[@]}" | while IFS= read -r file; do
  sed -r -i -e "s/$searchstring/$replacestring/ig" "$file"
  echo "Modified: $file"
done
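
For shells without arrays, the same idea can be sketched with two read loops. This is only a sketch: it builds a throwaway directory tree under mktemp (all names are hypothetical), assumes a grep that supports -r, and uses sed -i.bak, which both GNU and FreeBSD sed accept and which keeps a backup as the original script did. The i flag is left out for portability, and file names containing newlines would still break it.

```shell
work=$(mktemp -d)
mkdir -p "$work/system1/dir 1" "$work/system2/dir2"
printf 'a test line\n'  > "$work/system1/dir 1/a.txt"
printf 'nothing here\n' > "$work/system2/dir2/b.txt"
printf '%s\n' "$work/system1/dir 1" "$work/system2/dir2" > "$work/path_list"

searchstring="test"
replacestring="test01"

# Outer loop: one directory per line of the list.
# Inner loop: one matching file per line of grep's output.
while IFS= read -r dir; do
  # grep exits non-zero when a directory has no match; that is not an error here.
  grep -rl -- "$searchstring" "$dir" || :
done < "$work/path_list" |
while IFS= read -r file; do
  sed -i.bak -e "s/$searchstring/$replacestring/g" "$file"
  echo "Modified: $file"
done
```

Reading the list line by line means a directory name with spaces stays intact, which the $(cat path_list) form cannot guarantee.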
    
You're not avoiding tmp files with -i - GNU sed creates tmp files with -i when you use it. –  mikeserv May 28 at 4:32
    
What you're avoiding is manually wrangling tmp files yourself. –  ToxicFrog May 29 at 0:25
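
One way to observe this, as a sketch assuming GNU sed and an ls that supports -i: the file's inode number changes after sed -i, because sed wrote a new file and renamed it over the original. The scratch file comes from mktemp and is purely illustrative.

```shell
f=$(mktemp)
printf 'test\n' > "$f"

# Record the inode number before and after the in-place edit.
before=$(ls -i "$f" | awk '{print $1}')
sed -i -e 's/test/test01/' "$f"
after=$(ls -i "$f" | awk '{print $1}')

echo "inode before: $before, after: $after"
```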
    
I created a "path_list" file with the list of directories (shown below) in the same directory as the script. When I execute the script, it halts after the IFS step and does not proceed further. Am I missing something here? "path_list" file content: /home/sap1/scrp/directory1 /home/sap/scrp1/directory2 /home/sap/scrp2/directory3 Script used: path_list=/home/oracon/scrp cd ${path_list} IFS=$'\n' read -d '' -r -a paths path_list grep -l -R "$AP44" "${paths[@]}" | while IFS= read -r file; do sed -r -i -e "s/$AP44/$AP45/ig" "$file" echo "Modified: $file" done –  user68775 May 29 at 1:46
    
@Toxicfrog - agreed - though I probably don't avoid them, shell depending. I just realized I misspelled your username. Sorry. –  mikeserv May 29 at 2:08
    
@ToxicFrog - I tried the script below and it does nothing. Am I missing something here? path_list file content: /home/sap1/scrp/directory1 /home/sap/scrp1/directory2 /home/sap/scrp2/directory3 Script used: path_list=/home/oracon/scrp cd ${path_list} IFS=$'\n' read -d '' -r -a paths path_list grep -l -R "$AP44" "${paths[@]}" | while IFS= read -r file; do sed -r -i -e "s/$AP44/$AP45/ig" "$file" echo "Modified: $file" done –  user68775 Jun 1 at 16:18

This is the most portable way I can think to do this, though it still relies on the mostly portable /dev/fd/0 for .dot. Without it though, you could use a single file. In any case, it mostly relies on this shell function I wrote the other day:

_sed_cesc_qt() { 
    sed -n ':n;\|^'"$1"'|!{H;$!{n;bn}};{$l;x;l}' |
    sed -n '\|^'"$1"'|{:n;\|[$]$|!{
            N;s|.\n||;bn};s|||
            \|\([^\\]\)\\\([0-9]\)|{
            s||\1\\0\2|g;}'"
            s|'"'|&"&"&|g;'"s|.*|'&'|p}"

}

First I'll show it working, then I'll explain how. So, I'll create a test file base:

printf 'f=%d
    echo "$f" >./"$f"
    echo "$f" >./"$f\n$f"
    echo "$f" >./"$f\n$f\n$f"
' $(seq 10) | . /dev/fd/0

That creates a bunch of files, each named for the number 1-10 that it contains:

ls -qm 
1, 1?1, 1?1?1, 10, 10?10, 10?10?10, 2, 2?2, 2?2?2, 3, 3?3, 3?3?3, 4, 4?4, 4?4?4, 5, 5?5, 5?5?5, 6, 6?6, 6?6?6, 7,
7?7, 7?7?7, 8, 8?8, 8?8?8, 9, 9?9, 9?9?9

That's a comma-delimited list of the files in my test directory, each ? representing a newline.

cat ./1*

1
1
1
10
10
10

Each file contains only a single number.

Now I'll do the grep replace:

find ././ \! -type d -exec \
        grep -l '[02468]$' \{\} + |
_sed_cesc_qt '\./\./' | 
sed 's|.|\\&|g' |
xargs printf 'f=%b
        sed "/[02468]\\$/s//CHANGED/" <<-SED >"$f"
        $(cat <"$f")
        SED\n' | 
. /dev/fd/0

Now when I...

cat ./1*

1
1
1
1CHANGED
1CHANGED
1CHANGED

All of the [2468] files are similarly CHANGED. It works recursively as well. Ok, so now I'll explain how.

First, I guess, the function:

  1. start at :next label
  2. \|address| argument $1 - a marker
  3. if current line is !not a match {
    • append it to Hold buffer
    • if current line is !not $last line {
    • overwrite current line with next line
    • branch back to :next label
    • }}
  4. else if current line is $last line look at pattern space
  5. else exchange contents of hold and pattern buffers and...
  6. look unequivocally at pattern space

That's the first sed statement - and it's pretty much the meat and potatoes of it. We never print the pattern space at all - we only look at it. This is how POSIX defines the l function:

[2addr] l (The letter ell.) Write the pattern space to standard output in a visually unambiguous form. The characters listed in the Base Definitions volume of IEEE Std 1003.1-2001, Table 5-1, Escape Sequences and Associated Actions ( '\\', '\a', '\b', '\f', '\r', '\t', '\v' ) shall be written as the corresponding escape sequence; the '\n' in that table is not applicable. Non-printable characters not in that table shall be written as one three-digit octal number (with a preceding \backslash) for each byte in the character (most significant byte first). Long lines shall be folded, with the point of folding indicated by writing a \backslash followed by a \newline; the length at which folding occurs is unspecified, but should be appropriate for the output device. The end of each line shall be marked with a '$'.

So if I do:

printf '\e%s10\n10\n10' '\' | sed -n 'N;N;l'

I get:

\033\\10\n10\n10$

That's almost perfectly escaped for printf. It needs only an extra zero for the octal and to remove the trailing $ - so the next sed statement cleans it up.
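
The round trip can be sketched like this for a name with an embedded newline and no non-printable bytes, so the octal fix-up is not needed. The variable names are illustrative, and the sketch assumes the line is short enough that sed does not fold it:

```shell
name=$(printf 'a\nb')

# Join both lines in the pattern space and print them unambiguously.
escaped=$(printf '%s\n' "$name" | sed -n 'N;l')

# Drop the trailing "$" that l appends, then let printf %b expand the \n.
stripped=${escaped%\$}
restored=$(printf '%b' "$stripped")

printf '%s\n' "$escaped"
```

Here escaped holds the single line a\nb$, and restored is byte-for-byte the original two-line name.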

I'm not going to do the same level of detail, but basically the next sed statement:

  1. If line begins with $1 marker...
  2. Pulls in the Next line until the current line ends in $
  3. If it had to do the above, it removes the trailing \backslash and \newline character.
  4. Then it removes the trailing $
  5. finds any \backslashes followed by a number that are not preceded by another \backslash and inserts a zero
  6. Searches out any 'single quotes and double-quotes them
  7. Finally it surrounds the entire string with 'single-quotes

So now, when I do:

printf %s\\n ././1* |
_sed_cesc_qt '\./\./'

I get:

'././1'
'././1\n1'
'././1\n1\n1'
'././10'
'././10\n10'
'././10\n10\n10'

The rest is kind of easy. It depends on the fact that the ././ string will resolve, but it will only occur in find/grep's output at the head of every path name - so it becomes my $1 marker.

I -exec grep from find and specify -l for it to output filenames for those files that contain the regex.

I call the function and get its output.

I then \backslash escape every character in its output for xargs.

And with printf I write a script to the |pipe file - which I .dot source as /dev/fd/0. I define the f variable as its current argument - my pathname - and cat that $f argument to a <<heredocument, which is fed to sed, and sed writes back over the source file.

This may involve temporary files - that depends on your shell. bash and zsh will write out a temporary file for every heredocument - but they clean them up, too. dash, on the other hand, will just write the heredocument to an anonymous |pipe.

The important thing about it, though, is that the file has to be fully read before it's written over. That is just how heredocuments and command substitution work.
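
A stripped-down sketch of that idiom on a single scratch file (the file comes from mktemp and is purely illustrative). The heredocument redirection is listed first, as in the pipeline above, so the command substitution has read the whole file before the output redirection truncates it:

```shell
f=$(mktemp)
printf '10\n3\n' > "$f"

# $(cat <"$f") is expanded into the heredocument before >"$f" truncates the file.
sed 's/[02468]$/CHANGED/' <<EOF >"$f"
$(cat <"$f")
EOF

cat "$f"
```

Note that command substitution strips trailing newlines and the heredocument adds exactly one back, so a file ending in several blank lines would not survive unchanged.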

    
Welcome to 4k 8-) –  slm Jun 4 at 0:27
