I have a string like

"aaa,aaa,aaa,bbb,bbb,ccc,bbb,ccc"

I want to remove the duplicate words from the string, so that the output looks like

"aaa,bbb,ccc"

I tried this code (source):

$ echo "zebra ant spider spider ant zebra ant" | xargs -n1 | sort -u | xargs

It works fine with that value, but when I pass the value of my own variable, the output still contains all the duplicate words.

How can I remove the duplicate values?

UPDATE

My actual task is to combine all the values that belong to the same user into a single string. I have data like this:

   user name    | colour
    AAA         | red
    AAA         | black
    BBB         | red
    BBB         | blue
    AAA         | blue
    AAA         | red
    CCC         | red
    CCC         | red
    AAA         | green
    AAA         | red
    AAA         | black
    BBB         | red
    BBB         | blue
    AAA         | blue
    AAA         | red
    CCC         | red
    CCC         | red
    AAA         | green

In my code I fetch all the distinct users and then concatenate the colour strings, which works. For that I use this code inside the loop that reads the records:

    if [ "$c" == "" ]; then  # $c is defined globally
        c="$colour1"
    else
        c="$c,$colour1" 
    fi

When I print $c, I get this output (for user AAA):

"red,black,blue,red,green,red,black,blue,red,green,"

I want to remove the duplicate colours, so the desired output is

"red,black,blue,green"

To get this output I used the code above:

 echo "zebra ant spider spider ant zebra ant" | xargs -n1 | sort -u | xargs

but the output still contains the duplicate values:

"red,black,blue,red,green,red,black,blue,red,green,"

Thanks

Please clarify what is wrong with what you are using. I don't understand what you mean by "when I give my variable value". What value do you give? Where does it fail? – terdon Mar 23 at 12:57
    
echo 'aaa aaa aaa bbb bbb ccc bbb ccc' | xargs -n1 | sort -u | xargs gives aaa bbb ccc, so you need to show the exact code you tried and the output you got, with the string in a variable: s='aaa aaa aaa bbb bbb ccc bbb ccc'; echo "$s" | xargs -n1 | sort -u | xargs – Sundeep Mar 23 at 13:01
    
The string value is generated dynamically. It prints the same value back (still containing the duplicates). – Urvashi Mar 23 at 13:02
yeah, show the code that failed, otherwise how would we know what could've gone wrong? – Sundeep Mar 23 at 13:02
    
Does the order matter? – Jacob Vlijm Mar 23 at 14:06

One more awk, just for fun:

$ a="aaa bbb aaa bbb ccc aaa ddd bbb ccc"
$ echo "$a" | awk '{for (i=1;i<=NF;i++) if (!a[$i]++) printf("%s%s",$i,FS)}{printf("\n")}'
aaa bbb ccc ddd 

By the way, even your solution works fine with variables:

$ b="zebra ant spider spider ant zebra ant" 
$ echo "$b" | xargs -n1 | sort -u | xargs
ant spider zebra
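
A likely reason the pipeline seems to fail on the asker's variable is that the value is comma-separated, while xargs -n1 splits only on whitespace, so the whole string is treated as one word. A minimal sketch for that case, assuming the value looks like the $c shown in the question (trailing comma included); the function name dedupe_csv is made up for illustration:

```shell
# dedupe_csv: hypothetical helper, not part of the original answer.
# Prints the comma-separated items of $1 once each, in first-seen order.
dedupe_csv() {
    printf '%s\n' "$1" |
        tr ',' '\n' |              # one item per line
        awk 'NF && !seen[$0]++' |  # skip empty items, keep first occurrence only
        paste -sd, -               # join the lines back together with commas
}

c="red,black,blue,red,green,red,black,blue,red,green,"
dedupe_csv "$c"
```

With the sample value this should print red,black,blue,green. Unlike sort -u, the awk filter keeps the items in their original order.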
    
This works for me. Thanks, @George Vasiliou – Urvashi Mar 24 at 5:59
$ echo "zebra ant spider spider ant zebra ant"  | awk -v RS="[ \n]+" '!n[$0]++' 
zebra
ant
spider
Very clever!!!! – George Vasiliou Mar 24 at 0:54
    
@GeorgeVasiliou, thank you [or to tell the truth, very lazy :-) ] – JJoao Mar 24 at 8:44

With tr, sort and uniq

echo "zebra ant spider spider ant zebra ant" | tr ' ' '\n' | sort | uniq

or

echo "zebra ant spider spider ant zebra ant" | tr ' ' '\n' | sort | uniq | xargs 

to get one line

    
You need to add | xargs to join the output to one line again – Philippos Mar 23 at 12:59
    
thanks, updated my answer – Michael D. Mar 23 at 13:02
Or use sort -u. Or even awk '!u[$0]++'. – Benoît Mar 23 at 18:42
@Benoît Wow, I did not know about sort -u. I've been using sort | uniq all this time. The wasted keystrokes... – gardenhead Mar 24 at 1:25

With GNU sed:

sed ':s;s/\(\<\S*\>\)\(.*\)\<\1\>/\1\2/g;ts'

You may add ;s/  */ /g (two spaces before the asterisk) to remove duplicate spaces.

It works like this: if a word appears a second time in the line, remove the later occurrence and start over, until no duplicates are left.
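
As a usage sketch, the command above can be wrapped in a small function and fed the sample string. The function name dedupe_words_sed is hypothetical, and the final s/ $// is an addition here that drops the trailing space the removals leave behind. This assumes GNU sed, since \<, \> and \S are GNU extensions:

```shell
# dedupe_words_sed: hypothetical wrapper around the GNU sed loop above.
dedupe_words_sed() {
    # :s ... ts  repeat the substitution until it no longer matches
    # s/  */ /g  collapse the runs of spaces left by removed words
    # s/ $//     drop a trailing space (added for this sketch)
    sed ':s;s/\(\<\S*\>\)\(.*\)\<\1\>/\1\2/g;ts;s/  */ /g;s/ $//'
}

echo "zebra ant spider spider ant zebra ant" | dedupe_words_sed
```

For the sample input this should leave zebra ant spider, keeping the first occurrence of each word.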

    
What are \< and \>? – someonewithpc Mar 23 at 20:19
    
@someonewithpc They match no character, but the beginning and end of a word to prevent substrings from being matched. – Philippos Mar 23 at 21:29
    
Nice, but is that portable? Also, aren't words separated by whitespace? Seems redundant to match not whitespace followed by the end of a word. – someonewithpc Mar 23 at 21:34
@someonewithpc No, it's not standard, that's why I wrote gnu sed. The nice part is that you don't have to handle first and last string separately – Philippos Mar 23 at 21:44
perl -lane '$,=$";print grep { ! $h{$_}++ } @F'

Obligatory awk solution:

$ echo "ant zebra ant spider spider ant zebra ant" | 
   awk -vRS=" " -vORS=" " '!a[$1] {a[$1]++} END{ for (x in a) print x;  } ' ; echo
zebra ant spider 

(The final echo is there for the newline)

    
Plus one for the awk! I was also building an awk solution just for fun. There is a slight possibility that the words are printed in random order in the END section, because of the arbitrary order in which awk iterates over array keys. – George Vasiliou Mar 23 at 14:14
    
Yes, they will be printed in an essentially random order. The sort solution doesn't keep the original order either, though. – ilkkachu Mar 23 at 14:17
    
Yes, good point! Even sort prints in different order than input. – George Vasiliou Mar 23 at 14:18
    
@ilkkachu Actually we don't need to wait for the input to end. We can make decision to print or not to print with a slight modification to your code: awk -vRS=" " -vORS=" " '!a[$1]++ {print $1}' ; echo This preserves the order. – Rakesh Sharma Mar 23 at 14:31

Python

#!/usr/bin/env python3
# get_unique_words.py

import sys

# Keep each word the first time it appears, preserving input order
l = []
for w in sys.argv[1].split():
    if w not in l:
        l.append(w)
print(' '.join(l))

Make executable, then call from Bash:

$ ./get_unique_words.py "aaa aaa aaa bbb bbb ccc bbb ccc"
aaa bbb ccc
$ ./get_unique_words.py "zebra ant spider spider ant zebra ant"
zebra ant spider

Or you could implement it as a Bash function, but the syntax is messy.

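# The string is passed as an argument instead of being pasted into the
# Python source, so quotes inside it cannot break the script.
get_unique_words(){ python3 -c "
import sys
l = []
for w in sys.argv[1].split():
  if w not in l:
     l.append(w)
print(' '.join(l))" "$1"
}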
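a="aaa aaa aaa bbb bbb ccc bbb ccc"
# Unquoted $a is split into words; print one word per line
for item in $a
do
   echo "$item"
done | sort -u | (while read -r i; do ans="$ans $i"; done ; echo $ans)
# sort -u drops the duplicates; the subshell joins the lines into one string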
    
Please add an explanation on how your code works and why you did this and that. – xhienne Mar 24 at 1:37
