Newest 'pandas python' Questions

2

votes

1answer

15 views

How to change the x-axis when plotting groups from a pandas groupby combined in one plot

I am processing a chatlog and my data consists of timestamps, usernames and messages. My goal is to plot the number of messages per month for several users, so that I can compare when users were ...

asked 1 hour ago

ndldd
133

1

vote

2answers

25 views

Rename pandas columns with datetime objects

I have a dataframe with unhelpful column names that I'd like to turn into datetimes. The current column names are Index([Market Median, Market Median, Market Median, Market Median, Market Median, ...

python pandas

asked 3 hours ago

rauparaha
153

0

votes

2answers

23 views

Extracting rows for a Pandas dataframe in Python

I have imported a simple query log into a pandas dataframe in Python (see image), and would like to know what the most efficient way is to extract all of the rows that contain any given keyword that ...

python data data.frame pandas

asked 4 hours ago

user7289
6231824

0

votes

0answers

32 views

Python 2.7 - statsmodels - formatting and writing summary output

I'm doing logistic regression using pandas 0.11.0(data handling) and statsmodels 0.4.3 to do the actual regression, on Mac OSX Lion. I'm going to be running ~2,900 different logistic regression ...

python python-2.7 pandas statsmodels

asked 10 hours ago

FortyLashes
515

1

vote

1answer

24 views

Python Pandas - Removing Rows From A DataFrame Based on a Previously Obtained Subset

I'm running Python 2.7 with the Pandas 0.11.0 library installed. I've been looking around and haven't found an answer to this question, so I'm hoping somebody more experienced than I has a solution. ...

python pandas

asked 11 hours ago

FortyLashes
515

1

vote

1answer

33 views

Repeated measures transform in Pandas

Let's say I have data set from a repeated measures study, which looks like this: control dose_high dose_low gender participant 0 4 6 4 m 1 1 3 ...

python pandas

asked 13 hours ago

guyrt
47049

0

votes

1answer

50 views

Reference previous row when iterating through dataframe

Is there a simple way to reference the previous row when iterating through a dataframe? In the following dataframe I would like column B to change to 1 when A > 1 and remain at 1 until A < -1, ...

python pandas

asked 20 hours ago

user2350235
1

1

vote

2answers

34 views

getting specific median from data

I have a DataFrame with columns time, latitude, and longitude. It looks like this: >>> df.head() time latitude longitude 0 2011-12-16 08:09:07 42.386391 -71.013544 1 ...

python pandas max

asked 21 hours ago

Ryan Saxe
655113

1

vote

1answer

48 views

Groupby Clause in Pandas

I am trying to find the GroupBy clause (in a pandas DataFrame) that can do the following things. InPlace Transformation. Add All the Money If possible then to get the Original Dataframe with Columns "A" ...

python group-by pandas

asked yesterday

Bhupendrasinh Thakre
9018

2

votes

1answer

45 views

Assign to selection in pandas

I have a pandas dataframe and I want to create a new column, that is computed differently for different groups of rows. Here is a quick example: import pandas as pd data = {'foo': list('aaade'), ...

python pandas

asked yesterday

uuazed
898

0

votes

1answer

25 views

how to get the average of dataframe column values

A B DATE 2013-05-01 473077 71333 2013-05-02 35131 62441 2013-05-03 727 27381 2013-05-04 481 1206 2013-05-05 ...

python pandas dataframes

asked yesterday

zaphod
40429

0

votes

1answer

26 views

Using boolean masks in Pandas

This is probably a trivial query but I can't work it out. Essentially, I want to be able to filter out noisy tweets from a dataframe below <class 'pandas.core.frame.DataFrame'> Int64Index: ...

python boolean pandas mask

asked yesterday

elksie5000
144111

1

vote

1answer

40 views

Pandas group by operations on a data frame

I have a pandas data frame like the one below. UsrId JobNos 1 4 1 56 2 23 2 55 2 41 2 5 3 78 1 25 3 1 I group by the data frame ...

python pandas

asked yesterday

Anirudh Nair
495

0

votes

0answers

38 views

pandas dataframe.drop(col,axis=1) does not drop column from column.levels in multiindex dataframe

I have a multiindex dataframe from which I am dropping columns using df.drop(col,axis=1). Then, I am looking through column.levels[0] and doing some operations on all the columns. However, when I try ...

python pandas

asked yesterday

Amol Desai
224

1

vote

1answer

29 views

Pandas: Create new dataframe that averages duplicates from another dataframe

Say I have a dataframe my_df with column duplicates, e.g. foo bar foo hello 0 1 1 5 1 1 2 5 2 1 3 5 I would like to create another dataframe that averages the duplicates: foo bar ...

python pandas

asked yesterday

user815423426
4,53011368

0

votes

1answer

29 views

MySQL `Load Data Infile Local` fails for .csv unless I open and save the file first. How can I avoid this step?

I generate .csv files using a python script by writing a pandas DataFrame to_csv, using utf8 encoding. consEx.to_csv(os.path.join(base_dir, "Database/Tables/Consumption ...

python mysql file-upload csv pandas

asked yesterday

Stefan Jansen
176

2

votes

2answers

40 views

Specifying date format when converting with pandas.to_datetime

I have data in a csv file with dates stored as strings in a standard UK format - %d/%m/%Y - meaning they look like: 12/01/2012 30/01/2012 The examples above represent 12 January 2012 and 30 January ...

python datetime pandas

asked 2 days ago

cms_mgr
342111
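For the date-format question above, a minimal sketch of one common approach: pass an explicit `format` string to `pandas.to_datetime`. The sample values are the two dates quoted in the excerpt; the variable names `raw` and `parsed` are assumptions made for illustration.

```python
import pandas as pd

# The two sample dates quoted in the question, in UK day/month/year order.
raw = pd.Series(["12/01/2012", "30/01/2012"])

# An explicit format string removes the day/month ambiguity.
parsed = pd.to_datetime(raw, format="%d/%m/%Y")
print(parsed)
```

Stating the format explicitly means "12/01/2012" is read as 12 January rather than being guessed as 1 December.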

1

vote

1answer

44 views

Pandas Panel Slicing - Improving Performance

All, I'm currently using a panel in pandas to hold my data source. My program is a simple backtesting engine. It is only for personal amusement, however, I'm getting stuck in optimizing it. The ...

python pandas

asked 2 days ago

Eduardo Sahione
294

1

vote

3answers

45 views

Horizontal stacked bar chart in Matplotlib

I'm trying to create a horizontal stacked bar chart using matplotlib but I can't see how to make the bars actually stack rather than all start on the y-axis. Here's my testing code. fig = ...

python matplotlib pandas

asked 2 days ago

Jamie Bull
81415

3

votes

0answers

48 views

Merge on single level of MultiIndex

Is there any way to merge on a single level of a MultiIndex without resetting the index? I have a "static" table of time-invariant values, indexed by an ObjectID, and I have a "dynamic" table of ...

python pandas

asked May 20 at 13:45

Johann Hibschman
31316

0

votes

2answers

54 views

pandas convert strings to float for multiple columns in dataframe

I'm new to pandas and trying to figure out how to convert multiple columns which are formatted as strings to float64's. Currently I'm doing the below, but it seems like apply() or applymap() should ...

python pandas

asked May 20 at 6:23

user1507844
657
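For the string-to-float question above, a hedged sketch of one possible approach using `DataFrame.astype` with a column-to-dtype mapping. The frame and its column names (`a`, `b`, `label`) are hypothetical; the asker's own code is truncated in the excerpt and is not reproduced here.

```python
import pandas as pd

# Hypothetical frame: two numeric columns stored as strings, one text column.
df = pd.DataFrame({
    "a": ["1.5", "2.0"],
    "b": ["3", "4.25"],
    "label": ["x", "y"],
})

# Convert only the listed columns; other columns keep their dtype.
converted = df.astype({"a": "float64", "b": "float64"})
print(converted.dtypes)
```

`astype` returns a new frame, so the original `df` is left unchanged.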

1

vote

1answer

57 views

HDF5 taking more space than CSV?

Consider the following example: Prepare the data: import string import random import pandas as pd matrix = np.random.random((100, 3000)) my_cols = [random.choice(string.ascii_uppercase) for x in ...

python pandas hdf5 pytables

asked May 19 at 21:57

user815423426
4,53011368

0

votes

0answers

20 views

Unable to save DataFrame to HDF5 (“object header message is too large”)

I have a DataFrame in Pandas: In [7]: my_df Out[7]: <class 'pandas.core.frame.DataFrame'> Int64Index: 34 entries, 0 to 0 Columns: 2661 entries, airplane to zoo dtypes: float64(2659), object(2) ...

python pandas hdf5 pytables

asked May 19 at 21:09

user815423426
4,53011368

-2

votes

2answers

96 views

Printing all values to a .txt file in Python

I wrote a small script that pulls some unnecessary columns from a text file that I'm working with. I'm not sure how to get it to print to a text file without loss of data. import pandas as pd from ...

python pandas

asked May 19 at 20:48

Matt
235

1

vote

1answer

39 views

Iteratively writing to HDF5 Stores in Pandas

Pandas has the following examples for how to store Series, DataFrames and Panels in HDF5 files: Prepare some data: In [1142]: store = HDFStore('store.h5') In [1143]: index = date_range('1/1/2000', ...

python io pandas hdf5 pytables

asked May 19 at 17:14

user815423426
4,53011368

2

votes

1answer

135 views

Pandas: reshaping data

I have a pandas Series which presently looks like this: 14 [Yellow, Pizza, Restaurants] ... 160920 [Automotive, Auto Parts & Supplies] 160921 [Lighting Fixtures & ...

python pandas category vectorization

asked May 19 at 17:03

N. McA.
443111

1

vote

1answer

34 views

Sublists in pandas

I've got a Pandas DataFrame, one of the columns of which looks like this: 0 {u'funny': 2, u'useful': 0, u'cool': 0} 1 {u'funny': 370, u'useful': 487, u'cool': 296} 2 ...

python pandas

asked May 19 at 13:04

N. McA.
443111

1

vote

1answer

31 views

pandas read_csv end of section flag

Is there a smart/easy way to tell read_csv in pandas not to load data after a certain "end of section" flag? Or for it to stop if it gets to an empty row? data = pd.read_csv(path, **params) eos_line ...

python pandas

asked May 18 at 22:18

user1507844
657

1

vote

1answer

44 views

Convert pandas timezone-aware DateTimeIndex to naive timestamp, but in certain timezone

You can use the function tz_localize to make a Timestamp or DateTimeIndex timezone aware, but how can you do the opposite: how can you convert a timezone aware Timestamp to a naive one, while ...

python pandas

asked May 18 at 20:51

joris
2,1681520

0

votes

2answers

51 views

HDF5 and SQLite. Concurrency, compression & I/O performance [closed]

I have read in different places that SQlite does not play nicely with NFS, in particular when you want multiple processes from different machines trying to write to the database. I need a storage ...

python sqlite pandas hdf5

asked May 18 at 19:46

user815423426
4,53011368

0

votes

1answer

47 views

Pandas: Period object to abstract from time

I have the following DataFrame: df = pd.DataFrame({ 'Trader': 'Carl Mark Carl Joe Mark Carl Max Max'.split(), 'Share': list('ABAABAAA'), 'Quantity': [5,2,5,10,1,5,2,1] }, index=[ ...

python pandas period

asked May 18 at 15:47

Andy
314

1

vote

1answer

30 views

Do non-unique indexes provide any performance advantage in pandas?

From the pandas documentation, I've gathered that unique-valued indices make certain operations efficient, and that non-unique indices are occasionally tolerated. From the outside, it doesn't look ...

python performance index pandas binary-search

asked May 18 at 15:44

ChrisB
80219

1

vote

1answer

29 views

Pandas: What are the cases when count returned by DataFrame describe is a floating point

When describing my Pandas dataframe, I get the following result: Mains_1_Power Mains_2_Power count 17.000000 17.000000 mean 57.063528 200.428607 std 67.605151 ...

python data.frame pandas

asked May 18 at 7:51

Nipun Batra
422417

0

votes

1answer

30 views

Combine sparsely populated columns of the same data in pandas

I have the following dataframe and I would like to combine columns 2,3,4,5 into just one column. | 0 | 1 | 2 | 3 | 4 | 5 | +-----+-----+-----+-----+-----+-----+ | 90 | 90 | A | | ...

python table pandas

asked May 18 at 5:10

kentwait
304

0

votes

1answer

75 views

Concatenating and sorting thousands of CSV files

I have thousands of csv files on disk, each approximately 10MB in size (~10K columns). Most of these columns hold real (float) values. I would like to create a dataframe by ...

python pandas

asked May 18 at 0:48

user815423426
4,53011368

1

vote

1answer

43 views

pandas: how to select by partial label in index

Having a series like this: ds = Series({'wikipedia':10,'wikimedia':22,'wikitravel':33,'google':40}) google 40 wikimedia 22 wikipedia 10 wikitravel 33 dtype: int64 I would like to ...

python pandas

asked May 17 at 20:30

ronszon
1006
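For the partial-label question above, a small sketch under the assumption that "partial label" means a shared prefix such as `wiki` (the excerpt is cut off before the asker states the exact goal). It builds a boolean mask from the index's string methods.

```python
import pandas as pd

# The series from the question.
ds = pd.Series({"wikipedia": 10, "wikimedia": 22, "wikitravel": 33, "google": 40})

# Assumption: select every entry whose index label starts with "wiki".
wiki = ds[ds.index.str.startswith("wiki")]
print(wiki)
```

`Index.str.contains` would serve the same purpose when the fragment can appear anywhere in the label.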

1

vote

2answers

45 views

Deleting all columns except a few python-pandas

Say I have a data table 1 2 3 4 5 6 .. n A x x x x x x .. x B x x x x x x .. x C x x x x x x .. x And I want to slim it down so that I only have, say, columns 3 ...

python pandas

asked May 17 at 19:00

Matt
235

2

votes

3answers

52 views

Getting data from .csv file python (panda)

I am working on a python project where I have a .csv file like this. freq,ae,cl,ota 825,1,2,3 835,4,5,6 850,10,11,12 880,22,23,24 910,46,47,48 960,94,95,96 1575,190,191,192 1710,382,383,384 ...

python csv numpy pandas

asked May 17 at 17:20

Laplace
588

-2

votes

0answers

51 views

Why accessing larger index in pandas series takes longer? [closed]

I have a function that I would like to apply element-wise to several series. def my_fun(s1, s2, p1, p2, p3, angle_cutoff, s_cutoff): a1 = xy2angle(p1, s1) a2 = xy2angle(p2, s2) if ...

python performance numpy index pandas

asked May 17 at 10:11

Roman
4,7732590149

4

votes

2answers

69 views

Is there universal if function in numpy?

I have three series. I need to do the following operation element-wise: Compare values from the first and second series. If first is larger take the arcsine of the element from the third series. ...

python numpy scipy pandas series

asked May 17 at 10:02

Roman
4,7732590149

0

votes

1answer

27 views

How to define a function in pandas that takes series as an argument?

I have a data frame and I want to create a new column whose values are defined by values located in other columns (in the same row). It is very simple if I use simple operations (+, -, * and even ...

python pandas series dataframes

asked May 17 at 8:50

Roman
4,7732590149

1

vote

2answers

62 views

What is the most idiomatic way to index an object with a boolean array in pandas?

I am particularly talking about Pandas version 0.11 as I am busy replacing my uses of .ix with either .loc or .iloc. I like the fact that differentiating between .loc and .iloc communicates whether I ...

python pandas

asked May 17 at 7:32

snth
533317

1

vote

1answer

40 views

Stop pandas plot from doing new x-axis layout

I have a problem of an automatic x-axis rescaling happening when I do the following: plot column 1 plot column 1 where column 2 is notnull, but with different style. The second plot keeps ...

python matplotlib pandas

asked May 17 at 0:55

K.-Michael Aye
643218

1

vote

1answer

37 views

Python 3.3 pandas, pip-3.3

So, I'm trying to install pandas for Python 3.3 and have been having a really hard time- between Python 2.7 and Python 3.3 and other factors. Some pertinent information: I am running Mac OSX Lion ...

python pandas pip

asked May 16 at 23:58

FortyLashes
515

0

votes

1answer

61 views

Reading Files in HDFS (Hadoop filesystem) directories into a Pandas dataframe

I am generating some delimited files from hive queries into multiple HDFS directories. As the next step, I would like to read the files into a single pandas dataframe in order to apply standard ...

python hadoop pandas hdfs

asked May 16 at 21:47

SetJmp
3,27662554

1

vote

1answer

29 views

Appending to an empty data frame in Pandas?

Is it possible to append to an empty data frame that doesn't contain any indices or columns? I have tried to do this, but keep getting an empty dataframe at the end. e.g. df = pd.DataFrame() data = ...

python pandas

asked May 16 at 20:52

ericmjl
626

0

votes

2answers

54 views

Having issues reading a .csv file python-pandas

I'm trying to read this .txt file in pandas and this is my result. I thought (naively) that I was getting a hang of this stuff last night, but I'm wrong apparently. If I simply run rebull = ...

python pandas

asked May 16 at 20:00

Matt
235

2

votes

1answer

57 views

Pandas Convert 'NA' to NaN

I just picked up Pandas to do some data analysis work in my biology research. Turns out one of the proteins I'm analyzing is called 'NA'. I have a matrix with pairwise 'HA, M1, M2, NA, NP...' on ...

python pandas bioinformatics

asked May 16 at 19:48

ericmjl
626
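For the 'NA' question above, a sketch of one way to keep the literal string "NA" when reading text data: `read_csv` accepts `keep_default_na=False`, which disables the built-in list of missing-value markers. The CSV content and column names below are invented for illustration.

```python
import io
import pandas as pd

# Hypothetical data in which "NA" is a protein name, not a missing value.
csv_text = "protein,score\nHA,1.0\nNA,2.0\nNP,3.0\n"

# keep_default_na=False stops pandas from treating "NA" as NaN.
df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)
print(df)
```

If some markers should still count as missing, they can be listed explicitly through the `na_values` parameter.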

0

votes

2answers

54 views

rename index of a pandas dataframe

I have a pandas dataframe whose indices look like: df.index ['a_1', 'b_2', 'c_3', ... ] I want to rename these indices to: ['a', 'b', 'c', ... ] How do I do this without specifying a dictionary ...

python pandas

asked May 16 at 15:43

user1486457
4
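For the index-renaming question above, a minimal sketch that passes a function instead of a dictionary to `DataFrame.rename`. It assumes every label has the `name_number` shape shown in the excerpt; the `value` column is hypothetical.

```python
import pandas as pd

# Index labels shaped like the ones in the question.
df = pd.DataFrame({"value": [1, 2, 3]}, index=["a_1", "b_2", "c_3"])

# rename accepts a callable that is applied to each index label.
renamed = df.rename(index=lambda label: label.split("_")[0])
print(list(renamed.index))
```

Because the callable is applied label by label, no mapping of old to new names has to be written out.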

-2

votes

1answer

45 views

Pivoting duplicate columns into rows

This is the input file I have from reading a csv file: Sample Info D3S1358 1 D3S1358 2 TH01 1 TH01 2 D21S11 1 D21S11 2 D21S11 3 TEST_646 17 ...

python pandas

asked May 16 at 15:41

Danial Tz
456
