How to remove duplicate columns from a dataframe using python pandas

Question

I made some changes to my data by grouping on two columns.

I then generated a file using Python, and it ended up with 2 duplicate columns. How can I remove duplicate columns from a DataFrame?

Do they have the same column name? – waitingkuo Jun 5 '13 at 11:38

Andy Hayden · Answer 1 · 2013-06-05 12:11:46Z


It's probably easiest to use a groupby (assuming they have duplicate names too):

In [11]: df
Out[11]:
   A  B  B
0  a  4  4
1  b  4  4
2  c  4  4

In [12]: df.T.groupby(level=0).first().T
Out[12]:
   A  B
0  a  4
1  b  4
2  c  4
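For anyone who wants to paste and run this, here is a minimal self-contained sketch of the duplicate-name case. It rebuilds the frame above by hand and shows two options: the transpose/groupby from the session, and a boolean mask built from `df.columns.duplicated()`. The mask is a different technique from the one in this answer; it keeps the first column for each name and avoids the transpose. The variable names are only illustrative.

```python
import pandas as pd

# Rebuild the example frame; note the repeated column name "B".
df = pd.DataFrame(
    [["a", 4, 4], ["b", 4, 4], ["c", 4, 4]],
    columns=["A", "B", "B"],
)

# Option 1 (from the answer): group the transposed rows by label
# and keep the first row for each label, then transpose back.
deduped_by_groupby = df.T.groupby(level=0).first().T

# Option 2 (alternative, not from the answer): columns.duplicated()
# marks every repeat of a name after its first occurrence, so
# negating it keeps only the first column for each name.
deduped_by_mask = df.loc[:, ~df.columns.duplicated()]

print(deduped_by_groupby)
print(deduped_by_mask)
```

Both options only look at the column names, so a repeated name whose values differ is still dropped.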

If they have different names you can drop_duplicates on the transpose:

In [21]: df
Out[21]:
   A  B  C
0  a  4  4
1  b  4  4
2  c  4  4

In [22]: df.T.drop_duplicates().T
Out[22]:
   A  B
0  a  4
1  b  4
2  c  4
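A runnable sketch of the different-names case, assuming a reasonably recent pandas. One thing to be aware of: transposing a frame with mixed column types gives object dtype throughout, so the sketch calls `infer_objects()` afterwards to recover numeric dtypes. That extra step is my addition, not part of the session above.

```python
import pandas as pd

# Rebuild the example frame: "B" and "C" hold identical values
# but have different names.
df = pd.DataFrame(
    [["a", 4, 4], ["b", 4, 4], ["c", 4, 4]],
    columns=["A", "B", "C"],
)

# Transpose so columns become rows, drop rows with identical values
# (the first occurrence is kept), then transpose back.
deduped = df.T.drop_duplicates().T

# The round trip through the transpose leaves object dtype everywhere;
# infer_objects() tries to restore more specific dtypes.
deduped = deduped.infer_objects()

print(deduped)
print(deduped.dtypes)
```

This compares values rather than names, so it can be slow on wide or very long frames because every column is compared in full.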

Usually read_csv will ensure they have different names...
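To illustrate the renaming, here is a small sketch that reads a CSV with a repeated header from an in-memory string. It assumes a pandas version where duplicate headers are mangled by default; the CSV text is made up for the example.

```python
import io

import pandas as pd

# Made-up CSV text with the header "B" appearing twice.
csv_text = "A,B,B\na,4,4\nb,4,4\nc,4,4\n"

# With default settings, read_csv renames the second "B" so that
# the resulting column labels are unique.
df = pd.read_csv(io.StringIO(csv_text))

print(list(df.columns))
```

After a read like this the duplicated data sits under different names, so the drop_duplicates-on-the-transpose approach is the one that applies.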

edited Jun 5 '13 at 12:11

answered Jun 5 '13 at 12:05


FYI @Andy, there is a new option in 0.11.1 that controls this: mangle_dupe_cols. The default is to mangle (i.e. produce unique cols); in 0.12 this will change to leave dups in place – Jeff Jun 5 '13 at 12:19

@Jeff ah! Thanks for the update :) good feature! – Andy Hayden Jun 5 '13 at 12:21


