Using sed to process passwd file

Question

I am trying to learn sed, but I'm having a lot of trouble. What I am trying to do is process my passwd file using a bash script with sed commands to do the following: For every user with group ID 20000, replace their GID in the file with 2000x, where x is the order of the first letter in the user's username (ie: a is 1, b is 2, etc.) Also, for every user whose default shell is bash, change their group to bash, and for those with shell tcsh, change their group to tcshgroup. I have done the above work in awk (much easier to work with I've found), but I don't even know where to start with sed. Any help is much appreciated.

Here is part of the passwd file:

speech-dispatcher:x:108:29:Speech Dispatcher,,,:/var/run/speech-dispatcher:/bin/sh
colord:x:109:117:colord colour management daemon,,,:/var/lib/colord:/bin/false
lightdm:x:110:118:Light Display Manager:/var/lib/lightdm:/bin/false
avahi:x:111:120:Avahi mDNS daemon,,,:/var/run/avahi-daemon:/bin/false
hplip:x:112:7:HPLIP system user,,,:/var/run/hplip:/bin/false
pulse:x:113:121:PulseAudio daemon,,,:/var/run/pulse:/bin/false
saned:x:114:20000::/home/saned:/bin/tcsh
mmccormick:x:1000:20000:owner,,,:/home/mmccormick:/bin/bash

Ideally I'd pick out field 4 of each line to get the group ID and field 7 for the shell, but again, I'm not aware of a way to do that in sed. Thanks in advance.

@sputnick I'm not asking somebody to do this for me, but a nudge in the right direction would help. I did fine figuring out how to work with awk, considering how terrible my teacher is, but with sed I'm just totally lost. — Matt, Mar 25 '13 at 22:06
@Matt awk is a much better tool for the job, so it makes sense you didn't struggle with that as much. — jordanm, Mar 25 '13 at 22:38
Unfortunately we have to fully answer the question. While it may be homework, and asking for the answer is considered cheating, if we only provide a 'nudge', someone else in the future might really need the full answer. — Patrick, Mar 25 '13 at 23:11

Gilles · Answer 1 · 2013-03-25 22:58:57Z

Awk is really the natural tool here: /etc/passwd consists of colon-separated fields with the same layout on every line, which is exactly what awk is built to parse.
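For comparison, here is a minimal sketch of what field access looks like in awk. The task (swapping /bin/bash for /bin/zsh in field 7) and the function name bash_to_zsh_awk are only illustrative assumptions, not part of the question.

```shell
#!/bin/sh
# Sketch only: -F: splits on colons, OFS=: rejoins the fields with colons
# whenever a field is assigned.
bash_to_zsh_awk() {
    awk -F: -v OFS=: '$7 == "/bin/bash" { $7 = "/bin/zsh" } { print }'
}

printf '%s\n' 'mmccormick:x:1000:20000:owner,,,:/home/mmccormick:/bin/bash' |
    bash_to_zsh_awk
```

The field number does the work that the chain of [^:]* groups has to do in sed.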

If you want to work with sed, the basic idea is to capture each field in a parenthesised group and use a backreference to refer to the content of each field. For example, here's how to change users' shell to zsh whenever it was bash before.

sed 's~^\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):/bin/bash$~\1:\2:\3:\4:\5:\6:/bin/zsh~'

I used ~ as the delimiter; it's usual to use /, but you can use another character, and it's more convenient to use a different character when / appears in the pattern. ~ or # or ! are common choices; for your sanity, don't pick a character that has a special meaning in regexes.

The regex contains 6 times \([^:]*\):, which matches a field (a sequence of characters other than :) and a field delimiter. I put each field in a separate group for convenience. Since the first 6 fields are not going to change, I could have put all of them in a single group, and even the beginning of the last field which doesn't change.

sed 's~^\([^:]*:[^:]*:[^:]*:[^:]*:[^:]*:[^:]*:/bin/\)bash$~\1zsh~'

Furthermore, since the number of fields is fixed and the shell is the last field, we don't need to count the fields at all. So we could write this program in a simpler but less clear way:

sed 's~^\(.*:/bin/\)bash$~\1zsh~'

Or we could drop the ^ and replace only the last part of the line.

sed 's~:/bin/bash$~:/bin/zsh~'

Beware that this kind of ad hoc simplification can make the regex shorter while making the intent less evident.
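To see the trade-off, here is a small sketch that runs the strictest and the loosest variant on two lines. The first line comes from the question; the second is a made-up, malformed entry with only six fields. The function names per_field and tail_only are assumptions for illustration.

```shell
#!/bin/sh
# Strict variant: exactly six fields must precede /bin/bash.
per_field() {
    sed 's~^\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):/bin/bash$~\1:\2:\3:\4:\5:\6:/bin/zsh~'
}

# Loose variant: only looks at the end of the line.
tail_only() {
    sed 's~:/bin/bash$~:/bin/zsh~'
}

good='mmccormick:x:1000:20000:owner,,,:/home/mmccormick:/bin/bash'
bad='short:x:1:2:/home/short:/bin/bash'   # hypothetical line, one field short

printf '%s\n' "$good" "$bad" | per_field
printf '%s\n' "$good" "$bad" | tail_only
```

Both variants rewrite the well-formed line, but only the loose one touches the malformed line, because the strict regex doubles as a check on the field count.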

When you need to act on lines that match certain conditions, there are two basic approaches. One is to match the whole line and break it into parts using grouping, as we did above. Another approach is to restrict the s command to lines that match a certain pattern. The second approach can be more readable when the condition isn't directly related to the pattern replacement. Here's an example based on this principle: for every user whose default shell is bash, change their group to 2981.

sed '/:\/bin\/bash$/ s~^\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\)$~\1:\2:\3:2981:\5:\6:\7~'

If you have several replacements to make, you can use multiple commands: pass each one as an argument to the -e option. (Most sed implementations also allow a single argument with newline-separated commands; some of them also allow semicolons as command separators.) Note that the commands are applied in turn to every line, so the second command sees the result of the first.

sed -e '/:\/bin\/bash$/ s~^\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\)$~\1:\2:\3:2981:\5:\6:\7~' \
    -e '/:\/bin\/tcsh$/ s~^\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\):\([^:]*\)$~\1:\2:\3:1989:\5:\6:\7~'
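The point about ordering can be checked with a deliberately contrived sketch: two -e commands where the second one matches what the first one produces. The variable names are assumptions for illustration.

```shell
#!/bin/sh
line='mmccormick:x:1000:20000:owner,,,:/home/mmccormick:/bin/bash'

# bash -> tcsh runs first, so the second command sees tcsh and turns it into zsh.
chained=$(printf '%s\n' "$line" |
    sed -e 's~:/bin/bash$~:/bin/tcsh~' -e 's~:/bin/tcsh$~:/bin/zsh~')

# Same commands in the opposite order: the tcsh rule sees bash and does nothing.
reversed=$(printf '%s\n' "$line" |
    sed -e 's~:/bin/tcsh$~:/bin/zsh~' -e 's~:/bin/bash$~:/bin/tcsh~')

printf '%s\n%s\n' "$chained" "$reversed"
```

In the GID example the order does not matter, because neither command changes the shell field that the other command's address tests.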

Sukminder · Answer 2 · 2013-03-26 07:44:13Z

@Gilles has a lot of good to say. Here are some other notes:

sed is a stream editor: (s)tream (ed)itor. Read the introduction section on Wikipedia. The important part is concepts such as the pattern space.

For the most part I assume you know regexps and won't go into much detail on that.

This became a bit long, but OK.

The "easy" part is replacing the GID for users based on their shell; that is covered in the first section. The more interesting part is translating the first letter of the account/user name to a number and appending it to the GID. That is section two, "sed - lookup tables", below, which finishes with Listing 6: a more or less functional procedure for turning the letter into digits in the GID.

A lot of this might seem "why oh WHY?" – but it is a good training in concepts (in lack of a better word).

Section 1: Swapping GID by shell

You could add a function to get the GID for a named group, here using sed instead of cut, IFS or other "easier" ways:

#!/bin/bash

get_gnr()
{
    # -n    Do not print unless I say so.
    # s///  Substitute lines beginning with argv 1:
    # p     Print if there was a substitution.
    # $1    Arg 1 to bash function.
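    # \([^:]*\) captures only the GID field; the trailing .* swallows the
    # member list so it is not printed after the GID.
    sed -n 's/^'"$1"':[^:]*:\([^:]*\):.*/\1/p' /etc/group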
}

# Assign whatever get_gnr() prints to gnr_bash and gnr_tcsh
gnr_bash=$(get_gnr "bash")
gnr_tcsh=$(get_gnr "tcsh")

printf "Group %5s = %d\n" "bash" "$gnr_bash"
printf "Group %5s = %d\n" "tcsh" "$gnr_tcsh"

You should have more error checking. E.g. test that you actually have a group named bash.
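A minimal sketch of such a check, assuming a variant of the function that takes the group file as an optional second argument so it can be tried on a throwaway file. The name get_gnr_checked, that extra argument, and the sample GIDs are assumptions; the group name is also used as a regex as-is.

```shell
#!/bin/sh
# get_gnr_checked NAME [FILE]: print the GID of group NAME, or fail with a
# message on stderr. FILE defaults to /etc/group.
get_gnr_checked() {
    gnr_file=${2:-/etc/group}
    gnr=$(sed -n 's/^'"$1"':[^:]*:\([^:]*\):.*/\1/p' "$gnr_file")
    if [ -z "$gnr" ]; then
        printf 'No group named %s in %s\n' "$1" "$gnr_file" >&2
        return 1
    fi
    printf '%s\n' "$gnr"
}

# Demo on a temporary file instead of the real /etc/group.
demo_group=$(mktemp)
printf 'bash:x:2981:mmccormick\ntcsh:x:1989:\n' > "$demo_group"
get_gnr_checked bash "$demo_group"
get_gnr_checked nosuchgroup "$demo_group" || echo "lookup failed, as it should"
rm -f "$demo_group"
```

A calling script could then stop early with something like gnr_bash=$(get_gnr_checked bash) || exit 1.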

Then you would probably keep the GID whose last digit is to be replaced by the number of the user name's first letter in a variable. It is unclear from your task description, however, whether this should be done before or after the bash/tcsh switch of groups.

Anyhow. One thing you can utilize if you wrap the sed in a bash script is bash variables: close the single-quoted sed script temporarily, insert the variable, and reopen the quote. Further, you can group sed commands, much as in awk, using e.g.:

/pattern/ { exec if match}
/pattern/ ! { exec if no match }
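As a tiny runnable illustration of both forms, the following sketch tags each line depending on whether it ends in /bin/tcsh. The function name tag_by_shell and the tag texts are made up for the example.

```shell
#!/bin/sh
tag_by_shell() {
    sed '
/:\/bin\/tcsh$/{
    s/^/tcsh:  /
}
/:\/bin\/tcsh$/!{
    s/^/other: /
}
'
}

printf '%s\n' \
    'saned:x:114:20000::/home/saned:/bin/tcsh' \
    'hplip:x:112:7:HPLIP system user,,,:/var/run/hplip:/bin/false' |
    tag_by_shell
```

The tcsh line is still a tcsh line after the first block has run, so the negated block skips it.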

Here is a sample showing what I mean. In this concrete example though, it becomes a bit redundant. I have also added some extra output, which can be nice while writing to clearly and quickly see what is done:

gid_tr_to_uname=121

sed '
/:\/bin\/bash$/ {
    # Add an arrow only to visualize that line has changed
    s/^/--> /p
    # Substitute group
    s/\(^[^:]*:[^:]*:[^:]*:\)\([^:]*\)/\1'$gnr_bash'/
}
/:\/bin\/tcsh$/ {
    # Add an arrow only to visualize that line has changed
    s/^/--> /p
    # Substitute group
    s/\(^[^:]*:[^:]*:[^:]*:\)\([^:]*\)/\1'$gnr_tcsh'/
}
/^[^:]*:[^:]*:[^:]*:'$gid_tr_to_uname':/ {
    # Insert line to visualize change [ old/new ]
    i\
tr group alpha name [
    p
    s/a\([^:]*:[^:]*:[^:]*:[^:]*\)\([0-9]\)\(:[^:]*:[^:]*:.*\)/a\11\3/
    s/b\([^:]*:[^:]*:[^:]*:[^:]*\)\([0-9]\)\(:[^:]*:[^:]*:.*\)/b\12\3/
    s/c\([^:]*:[^:]*:[^:]*:[^:]*\)\([0-9]\)\(:[^:]*:[^:]*:.*\)/c\13\3/
    s/d\([^:]*:[^:]*:[^:]*:[^:]*\)\([0-9]\)\(:[^:]*:[^:]*:.*\)/d\14\3/
    s/e\([^:]*:[^:]*:[^:]*:[^:]*\)\([0-9]\)\(:[^:]*:[^:]*:.*\)/e\15\3/
    s/f\([^:]*:[^:]*:[^:]*:[^:]*\)\([0-9]\)\(:[^:]*:[^:]*:.*\)/f\16\3/
    s/g\([^:]*:[^:]*:[^:]*:[^:]*\)\([0-9]\)\(:[^:]*:[^:]*:.*\)/g\17\3/
    s/h\([^:]*:[^:]*:[^:]*:[^:]*\)\([0-9]\)\(:[^:]*:[^:]*:.*\)/h\18\3/
    # ....
    # Append line to visualize end
    a\
]
}
' "$in_file"

The alpha thing doesn't look too nice, hence section two below.

If you can use bash alongside sed, you can simplify the alpha translation by piping the result to a bash loop where IFS (like FS, the field separator, in awk) is set to a colon:

#      capture group 1             capture group 2
#   s (everything before gid) gid (everything after gid) trigger / \1 new gnr \2
sed \
-e 's/\(^[^:]*:[^:]*:[^:]*:\)[^:]*\(.*:\/bin\/bash$\)/\1'$gnr_bash'\2/' \
-e 's/\(^[^:]*:[^:]*:[^:]*:\)[^:]*\(.*:\/bin\/tcsh$\)/\1'$gnr_tcsh'\2/' \
"$1" |
while IFS=: read account password uid gid gecos directory shell; do
    case "$gid" in
    "$gid_tr_to_uname")
        gid=$(translate "$account" "$gid")
    esac
    printf "%s:%s:%d:%d:%s:%s:%s\n"\
        "$account" "$password" "$uid" "$gid" "$gecos" "$directory" "$shell"
done

And some translate function as in:

ascii_a=$(printf "%d" "'a")
ascii_A=$(printf "%d" "'A")
translate()
{
    local first_letter="${1:0:1}"    # First character in arg 1
    local -i gid_lhs="${2:0: -1}"    # Everything but last digit in arg 2
                                     # Get ascii 10 base value / digit
    local -i ascii_val=$(printf "%d" "'$first_letter")
    local -i alphanr                 # a=1 b=2, A=27 etc
    if (( $ascii_val >= ascii_a )); then
        (( alphanr = ascii_val - ascii_a + 1 ))
    else
        (( alphanr = ascii_val - ascii_A + 27 ))
    fi
    # If you want to debug:
    # printf "[[[%s = %d => %d || %d ]]]"\
    #        "$first_letter" "$ascii_val" "$alphanr" "$gid_lhs"
    printf "%d%d" "$gid_lhs" "$alphanr"
}

But then one could also easily add a case switch for shell as well and sed goes completely out of the picture.
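A rough sketch of that sed-free variant, written for plain sh. The GIDs 2981 and 1989 are borrowed from the other answer and are assumptions, as are the function name rewrite_passwd and the choice to let the shell rule win over the letter rule when both apply; the task description leaves that order open.

```shell
#!/bin/sh
gnr_bash=2981     # assumed GID of the "bash" group
gnr_tcsh=1989     # assumed GID of the "tcshgroup" group
alphabet=abcdefghijklmnopqrstuvwxyz

rewrite_passwd() {
    while IFS=: read -r account password uid gid gecos directory shell; do
        first=${account%"${account#?}"}       # first character of the name
        before=${alphabet%%"$first"*}         # letters that come before it
        # Letter rule: only for GID 20000 and a first character in a-z.
        if [ "$gid" = 20000 ] && [ -n "$first" ] && [ "$before" != "$alphabet" ]; then
            gid=2000$(( ${#before} + 1 ))
        fi
        # Shell rule: applied last, so it overrides the letter rule.
        case "$shell" in
        /bin/bash) gid=$gnr_bash ;;
        /bin/tcsh) gid=$gnr_tcsh ;;
        esac
        printf '%s:%s:%s:%s:%s:%s:%s\n' \
            "$account" "$password" "$uid" "$gid" "$gecos" "$directory" "$shell"
    done
}

# "alice" is a made-up user; the saned line is from the question.
printf '%s\n' \
    'alice:x:1001:20000::/home/alice:/bin/sh' \
    'saned:x:114:20000::/home/saned:/bin/tcsh' |
    rewrite_passwd
```

Counting the letters in front of the first character avoids the ASCII arithmetic, at the price of handling only lowercase a to z.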

In sed you also have tr-like functionality with the y command:

sed '/0x[0-9a-zA-Z]*/ y/abcdef/ABCDEF/' file

But y maps characters one to one, so both lists must have the same length; you can't use it for a -> 1, ... p -> 16 etc.
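A quick sketch of what y does on a matching line; the function name upper_hex and the sample inputs are assumptions for illustration.

```shell
#!/bin/sh
# On lines containing "0x", map a-f to A-F, one character to one character.
upper_hex() {
    sed '/0x[0-9a-fA-F]*/ y/abcdef/ABCDEF/'
}

printf '%s\n' 'value 0x1f3a' 'plain cafe' | upper_hex
```

Note that y acts on the whole pattern space, not only on the part that matched the address, so the a and e of "value" are upper-cased as well, while the line without 0x is left alone.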

Section 2: sed - lookup tables

So far, the only way I can think of to append a number for the first letter of the account to the GID is a lookup table.

To simplify I'm taking this in stages:

Listing 1

#!/bin/bash

listing1()
{
sed '
    # Pad line with lookup table
    s/$/0zero1one2two3three4four5five6six7seven8eight9nine/

    # Match something (here 1) and match it again in lookup-table
    # and grab the letters following 1 (in lookup-table) to match
    # group 2. Finally replace \1 with \2
    s/\(.\).*\1\([^0-9]*\).*/\2/
    #   |   | |     |     |   |
    #   |   | |     |     |   +----- Replace all with \2 which is "one".
    #   |   | |     |     +--------- Rest of line "2two3three4fo...".
    #   |   | |     +--------------- Match the word "one" and add it to
    #   |   | |                      group \2
    #   |   | +--------------------- Match group \1 => "1"
    #   |   |                        here in lookup-table : "1one"
    #   |   +----------------------- Match greedy => "23450zero"
    #   +--------------------------- Match one chr, that would be "1" from
    #                                input "12345\n", and add it to group \1

' < <(printf "12345\n" )
}

printf "Listing 1:\n"
listing1

Result:

Listing 1:
one

The idea is to pad our line with a lookup-table and replace first match in input with corresponding pair in table.

We can expand this by repeating the substitution:

Listing 2

listing2()
{
sed '
    s/$/.0zero1one2two3three4four5five6six7seven8eight9nine/
    s/\([0-9]\)\(.*\)\1\([^0-9]*\)\(.*\)/\3\2\4/
    s/\([0-9]\)\(.*\)\1\([^0-9]*\)\(.*\)/\3\2\4/
    s/\([0-9]\)\(.*\)\1\([^0-9]*\)\(.*\)/\3\2\4/
    s/\([0-9]\)\(.*\)\1\([^0-9]*\)\(.*\)/\3\2\4/
    s/\([0-9]\)\(.*\)\1\([^0-9]*\)\(.*\)/\3\2\4/
    s/\([0-9]\)\(.*\)\1\([^0-9]*\)\(.*\)/\3\2\4/
' < <(printf "12345\n" )
}

Result:

Listing 2:
onetwothreefourfive.0zero6six7seven8eight9nine

But this doesn't look much better than what we started with in section one.

Labels / Branches

Here is where labels come in. In sed one can define labels and jump (branch) to them using two commands:

:my_label
     s/foo/bar/
     b my_label

The b simply says: jump to my_label. In this example that would mean an infinite loop. Thus it is mostly used as in:

:my_label
/\./ {          # If . exists in line
     s/#/+/     # substitute # with +
     s/\./P/    # substitute . with P
     b my_label # goto my_label
}

Not the best example but hopefully you get the idea.
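To watch that loop run, here is the same script wrapped in a function and fed one made-up line. The name demo_branch and the input are assumptions; the comments are left out of the sed script so that nothing follows the label name on the b line.

```shell
#!/bin/sh
demo_branch() {
    sed '
:my_label
/\./ {
    s/#/+/
    s/\./P/
    b my_label
}
'
}

printf 'a.b.c#d#\n' | demo_branch
```

Each pass replaces one # and one dot; the loop ends once no dot is left, giving aPbPc+d+.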

The second way is using test, or t. This says: if a substitution has changed the line, go to the label.

:my_label
     s/foo/bar/   # Substitute foo with bar
     t my_label   # If there was a change aka; a substitution was done
                  # then goto my_label.
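A small sketch of a t loop that terminates by itself: it turns leading spaces into underscores, one per pass. The function name pad_indent and the input are made up for the example.

```shell
#!/bin/sh
pad_indent() {
    sed '
:pad
s/^\(_*\) /\1_/
t pad
'
}

printf '   indented text\n' | pad_indent
```

Every successful substitution sends control back to :pad; when the character after the underscores is no longer a space, the s command fails, t does not jump, and the line is printed as ___indented text.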

By this we can simplify our previous listing as follows. Here with added comma to make it more pleasant to read:

Listing 3

listing3()
{
sed '
    s/$/.0zero1one2two3three4four5five6six7seven8eight9nine/
:loop
    s/\([0-9]\)\(.*\)\1\([^0-9]*\)\(.*\)/\3,\2\4/
    t loop    # If a substitution was made, go to loop

    s/,\..*// # Remove trailing comma and our lookup table rest.

' < <(printf "123458\n" )
}

Result:

Listing 3:
one,two,three,four,five,eight

We want alpha to digit, though. Also, using a dot as separator is somewhat risky, as our input can contain a . itself, so we change it to ASCII 0x7f, or DEL.

It also works with e.g. 0x00

Listing 4

listing4()
{
sed '
    p # Print original line to visualize

    # Our new lookup-table:
    s/$/\x7fa1b2c3d4e5f6g7h8i9j10k11l12m13n14o15p16q17r18s19t20u21v22w23x24y25z26/
:loop
    s/\([a-z]\)\(.*\)\1\([^a-z]*\)\(.*\)/\3,\2\4/
    t loop

    s/,\x7f.*//
' < <(printf "abcdefghijklmnopqrstuvwxyz\n" )
}

Result:

Listing 4:
abcdefghijklmnopqrstuvwxyz
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26

If we had more than one of each we would change it to something like this:

Listing 5

listing5()
{
sed '
    i\
input:
    p
    s/$/\x7fa1b2c3d4e5f6/
:loop
    s/\([a-z]\)\(.*\)\1\([^a-z]*\)\(.*\)/\3,\2\1\3\4/g
    t loop
    i\
output:
    s/,\x7f.*//
' < <(printf "aabcdefac\n" )
}

Result:

Listing 5:
input:
aabcdefac
output:
1,1,2,3,4,5,6,1,3

Now we are finally ready to implement it in the task. Here by example:

Listing 6

listing6()
{
sed '
    i\
input:
    p
    s/$/\x7fa1b2c3d4e5f6g7h8i9j10k11l12m13n14o15p16q17r18s19t20u21v22w23x24y25z26/
    s/^\(.\)\([^:]*\)\(:[^:]*\)\(:[^:]*\)\(:[^:]*\)\([0-9]\)\(:.*\)\x7f.*\1\([^a-z]*\).*/\1\2\3\4\5\8\7/
    #  1alpha 2rest    3pwd      4uid      5gid    6last-digit 7rest           8number
    i\
output:
    s/,\x7f.*//
' < <(printf "master:power:110:118:Light Display Manager:/var/lib/lightdm:/bin/false\n" )
}

Output:

Listing 6:
input:
master:power:110:118:Light Display Manager:/var/lib/lightdm:/bin/false
output:
master:power:110:1113:Light Display Manager:/var/lib/lightdm:/bin/false
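Applied to the question's actual rule (GID 20000 becomes 2000x), the same technique might look like the sketch below. It assumes GNU sed for the \x7f escape, as the listings above do, and it only handles user names starting with a lowercase letter; the function name gid20000_to_alpha is an assumption.

```shell
#!/bin/sh
gid20000_to_alpha() {
    sed '
/^[a-z][^:]*:[^:]*:[^:]*:20000:/ {
    # Append DEL plus the lookup table to the line.
    s/$/\x7fa1b2c3d4e5f6g7h8i9j10k11l12m13n14o15p16q17r18s19t20u21v22w23x24y25z26/
    # \1 first letter, \2 up to "2000", \3 rest of the line, \4 number from the table.
    s/^\(.\)\([^:]*:[^:]*:[^:]*:2000\)0\(:.*\)\x7f.*\1\([0-9]*\).*/\1\2\4\3/
}
'
}

printf '%s\n' \
    'saned:x:114:20000::/home/saned:/bin/tcsh' \
    'mmccormick:x:1000:20000:owner,,,:/home/mmccormick:/bin/bash' \
    'pulse:x:113:121:PulseAudio daemon,,,:/var/run/pulse:/bin/false' |
    gid20000_to_alpha
```

With s = 19 and m = 13 the first two GIDs become 200019 and 200013; the pulse line has a different GID and passes through untouched.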

That's it.

You should read Bruce Barnett's sed introduction.

Other refs:

For some more hard-core things look at e.g.:

Greg Ubben's sed dc. With a short explanation.
Sed Tetris, with an accompanying bash wrapper.

Good luck.

OK. Would be nice if the one voting down had decency enough to leave a comment on why. — Sukminder, Mar 26 '13 at 7:47

