Merging multiple lines based on column 1

Question

I have a file like the one below:

abc, 12345

def, text and nos

ghi, something else

jkl, words and numbers


abc, 56345

def, text and nos

ghi, something else

jkl, words and numbers


abc, 15475

def, text and nos

ghi, something else

jkl, words and numbers


abc, 123345

def, text and nos

ghi, something else

jkl, words and numbers

I want to convert it to:

abc, 12345, 56345, 15475, 123345

def, text and nos, text and nos, text and nos, text and nos

ghi, something else, something else, something else, something else

jkl, words and numbers, words and numbers, words and numbers, words and numbers

Any help?

Do you actually have the extra blank lines in your input file? If not, please edit and remove them, you should show the file exactly as it is. — terdon♦, Apr 11 at 14:23

Gnouc · Answer 1 · 2014-04-12 05:49:19Z

If you don't mind the order of output:

$ awk -F',' 'NF>1{a[$1] = a[$1]","$2}END{for(i in a){print i""a[i]}}' file 
jkl, words and numbers, words and numbers, words and numbers, words and numbers
abc, 12345, 56345, 15475, 123345
ghi, something else, something else, something else, something else
def, text and nos, text and nos, text and nos, text and nos

Explanation

NF>1 means we only process lines with more than one field, which skips the blank lines.
We store the values in the associative array a: the key is the first field and the value is the second field (which, for this input, is the rest of the line). If the key already has a value, the new value is appended to it.
In the END block, we loop through the associative array a and print each key with its corresponding value.
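
The three steps above can be sketched in a longer, commented form. This is only an illustration under a few assumptions: the function name merge_by_key and the array name merged are made up for this example, and the trailing sort is added purely so the output order is predictable, since awk's for (key in array) loop visits keys in an unspecified order.

```shell
# merge_by_key is a hypothetical helper name, not part of the answer above.
merge_by_key() {
    awk -F',' '
        # Step 1: skip blank lines (they have fewer than two fields).
        NF > 1 {
            # Step 2: append the second field to whatever is stored for this key.
            merged[$1] = merged[$1] "," $2
        }
        END {
            # Step 3: print each key followed by its accumulated values.
            # The "for (key in merged)" loop visits keys in an unspecified order.
            for (key in merged) print key merged[key]
        }
    ' "$@" | sort   # sort only to make the output order predictable
}
```

It can be called as merge_by_key file, or fed from a pipe. Because of the sort, keys come out alphabetically, not in the order they appear in the file.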

Or use perl, which prints the keys in sorted order (the same as the input order for this file):

$ perl -F',' -anle 'next if /^$/;$h{$F[0]} = $h{$F[0]}.",".$F[1];
    END{print $_,$h{$_},"\n" for sort keys %h}' file
abc, 12345, 56345, 15475, 123345

def, text and nos, text and nos, text and nos, text and nos

ghi, something else, something else, something else, something else

jkl, words and numbers, words and numbers, words and numbers, words and numbers

edited Apr 12 at 5:49

answered Apr 11 at 4:01

Gnouc

Your perl solution from my question unix.stackexchange.com/questions/124181/… should also work, right? – Ramesh Apr 11 at 4:08

No. The OP wants to concatenate strings based on column 1, regardless of whether they are duplicated. Your question doesn't want duplicates. – Gnouc Apr 11 at 4:16

Oh, OK. At first glance, it seemed quite similar to my question. :) – Ramesh Apr 11 at 4:19

Neat, +1! That doesn't keep the order though, it only recreates it in this particular example where the fields are in alphabetical order. – terdon♦ Apr 11 at 14:29

Just for laughs, I'd written almost exactly the same approach before reading your answer: perl -F, -lane 'next unless /./;push @{$k{$F[0]}}, ",@F[1..$#F]"; END{print "$_@{$k{$_}}" foreach keys(%k)}' file :) Great minds think alike! – terdon♦ Apr 11 at 14:43

score 1 · Answer 2 · 2014-04-12 08:25:15Z

Oh, that's an easy one. Here's a simple version that keeps the order of the keys as they appear in the file:

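$ awk -F, '
    # Process only non-empty lines.
    /.+/{
        # Record each key the first time it appears, to preserve input order.
        if (!($1 in Val)) { Key[++i] = $1; }
        Val[$1] = Val[$1] "," $2;
    }
    END{
        for (j = 1; j <= i; j++) {
            # Val[...] already starts with a comma, so no separator is needed here.
            printf("%s%s\n%s", Key[j], Val[Key[j]], (j == i) ? "" : "\n");
        }
    }' file.txt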

Output should look like this:

abc, 12345, 56345, 15475, 123345

def, text and nos, text and nos, text and nos, text and nos

ghi, something else, something else, something else, something else

jkl, words and numbers, words and numbers, words and numbers, words and numbers

If you don't mind having an extra blank line at the end, just replace the printf line with printf("%s%s\n\n", Key[j], Val[Key[j]]);
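
One limitation of reading only $2 is that a value which itself contains a comma would be cut off at that comma. A possible variant, sketched here as an assumption and not as part of the answer above, splits each line at its first comma with index() and substr(), keeps everything after it, and still prints keys in first-seen order. The names merge_keep_order, order, and val are hypothetical, and this version prints one merged line per key with no blank lines in between.

```shell
# merge_keep_order is a hypothetical helper name used only for this sketch.
merge_keep_order() {
    awk '
        {
            # Find the first comma; lines without one (including blank lines) are skipped.
            pos = index($0, ",")
            if (pos == 0) next
            key  = substr($0, 1, pos - 1)
            rest = substr($0, pos)          # starts with the comma itself
            # Remember each key the first time it is seen.
            if (!(key in val)) order[++n] = key
            val[key] = val[key] rest
        }
        END {
            # Walk the keys in the order they first appeared.
            for (j = 1; j <= n; j++) print order[j] val[order[j]]
        }
    ' "$@"
}
```

Usage would be merge_keep_order file.txt. Since no field separator is involved, a line such as "abc, 1, a" contributes ", 1, a" in full to the abc entry.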
