1
vote
1answer
22 views

pandas multiple plots not working as hists

With a DataFrame df in pandas, I am trying to plot histograms on the same page, filtering by 3 different variables; the intended outcome is histograms of the values for each of the three types. The ...
0
votes
1answer
42 views

Python Pandas sum up values from different columns

I'm trying to take values stored in a list in one column and multiply them by values stored in a list in another column. For example, to print all the cores for each user, I do this. print ...
0
votes
1answer
42 views

Global variables for many classes vs many equivalent class attributes?

Firstly, I realize that there are already many questions about efficiency out there, so I apologize if this is a duplicate, but I'm here because I couldn't find what I was looking for. I'm going to ...
1
vote
1answer
39 views

Pandas good approach to get top n records within each group

Suppose I have pandas DataFrame like this: >>> df = pd.DataFrame({'id':[1,1,1,2,2,2,2,3,4],'value':[1,2,3,1,2,3,4,1,1]}) >>> df id value 0 1 1 1 1 2 2 1 3 3 ...
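One possible reading of "top n" here is the n largest values per id. The following is only a sketch under that assumption, using the DataFrame from the excerpt; the helper name top_n_per_group is hypothetical and not from the original question.

```python
import pandas as pd

# DataFrame copied from the question excerpt.
df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 2, 2, 3, 4],
                   'value': [1, 2, 3, 1, 2, 3, 4, 1, 1]})


def top_n_per_group(frame, n):
    """Return the n rows with the largest 'value' within each 'id' (hypothetical helper)."""
    # Sort so that, inside each id, the largest values come first.
    ranked = frame.sort_values(['id', 'value'], ascending=[True, False])
    # head(n) keeps the first n rows of every group, in the sorted order.
    return ranked.groupby('id').head(n)


top2 = top_n_per_group(df, 2)
print(top2)
```

Groups with fewer than n rows (ids 3 and 4 above) simply return all of their rows.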
1
vote
1answer
32 views

Pandas dataframe get first row of each group

I have a pandas DataFrame like following. df = pd.DataFrame({'id' : [1,1,1,2,2,3,3,3,3,4,4,5,6,6,6,7,7], 'value' : ["first","second","second","first", ...
1
vote
2answers
27 views

Pandas: drop_duplicates with condition

Is there any way to use drop_duplicates together with conditions? For example, let's take the following Dataframe: import pandas as pd df = pd.DataFrame({ 'Customer_Name': ['Carl', 'Carl', 'Mark', ...
1
vote
1answer
33 views

mapping of one column with another two

I have a Pandas Dataframe with three columns that follows this structure: Employee email Manager Smith [email protected] Johnson Doe [email protected] ...
2
votes
1answer
42 views

How to calculate a new field in python using a linear relationship

I am new to Python, working with Python 2.7.5. After I read a CSV file in Python using the code below: df = csv.DictReader(open("C:\\Users\\user\\Documents\\file.csv")). I want to calculate a new ...
1
vote
1answer
27 views

Convert one row of a pandas dataframe into multiple rows

I want to turn this: age id val 0 99 1 0.3 1 99 2 0.5 2 99 3 0.1 Into this: age id val 0 25 1 0.3 1 50 1 0.3 2 75 1 0.3 3 25 2 0.5 4 50 2 0.5 5 ...
3
votes
2answers
42 views

Pandas OR statement ending in series contains

I have a DataFrame df that has columns type and subtype and about 100k rows, I'm trying to classify what kind of data df contains by checking type / subtype combinations. While df can contain many ...
1
vote
1answer
23 views

Pandas CSV file with occasional extra columns in the middle

I'm processing lots (thousands) of ~100k line csv files that are produced by someone else. 9 times out of 10 the files have 8 columns and all is right with the world. The 10th time or so ~10 lines ...
1
vote
2answers
36 views

Conditional merge for CSV files using python (pandas)

I am trying to merge >=2 files with the same schema. The files will contain duplicate entries but rows won't be identical, for example: file1: store_id,address,phone 9191,9827 Park st,999999999 ...
0
votes
1answer
18 views

How can I select which multi-index axis splits the data in a groupby object across different subplots?

I'm working with a pandas.groupby object to which I have applied a function as such: x = data.groupby(['congruent', 'contrast']).apply(lambda s: s.mean())[['cresp1', 'cresp2']] Output of print x: ...
0
votes
2answers
18 views

Pandas performing a SQL subtraction between two dataframes

I have two dataframes. First there is DF1: ID Other value 1 a 2 b 3 c and then there is DF2, which is a subset of DF1: ID Other value 1 a I want ...
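The "subtraction" described in the title can be read as an anti-join: keep the rows of DF1 whose ID does not occur in DF2. The sketch below assumes that reading and assumes ID alone identifies a row; the variable names df1, df2 and remaining are illustrative.

```python
import pandas as pd

# Frames rebuilt from the excerpt: df2 is a subset of df1.
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Other value': ['a', 'b', 'c']})
df2 = pd.DataFrame({'ID': [1], 'Other value': ['a']})

# Boolean mask: True where the ID of a df1 row also appears in df2.
in_df2 = df1['ID'].isin(df2['ID'])

# Negate the mask to keep only the rows that are NOT in df2.
remaining = df1[~in_df2]
print(remaining)
```

If rows must match on every column rather than on ID alone, a left merge with indicator=True followed by a filter on the `_merge` column is another option.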
2
votes
1answer
35 views

PYODBC to Pandas - DataFrame not working - Shape of passed values is (x,y), indices imply (w,z)

I used pyodbc with python before but now I have installed it on a new machine ( win 8 64 bit, Python 2.7 64 bit, PythonXY with Spyder). Before I used to (at the bottom you can find more real ...
1
vote
1answer
21 views

Difference between log(dataframe) in IPython and in execution

I have a pandas data frame as an attribute in Python 2.7, called probs. If I try to execute log(self.prob['AAA']) (where AAA is a valid name for one of the columns in the data frame), I get the ...
0
votes
1answer
50 views

removing NaN values in python pandas

Data is of income of adults from census data, rows look like: 31, Private, 84154, Some-college, 10, Married-civ-spouse, Sales, Husband, White, Male, 0, 0, 38, NaN, >50K 48, Self-emp-not-inc, ...
1
vote
1answer
49 views

averaging every five minutes data as one datapoint in pandas dataframe

I have a Dataframe in Pandas like this 1. 2013-10-09 09:00:05 2. 2013-10-09 09:01:00 3. 2013-10-09 09:02:00 4. ............ 5. ............ 6. ............ 7. 2013-10-10 09:15:05 8. 2013-10-10 ...
1
vote
2answers
62 views

Suppress or remove columns named 'index' from Pandas dataframe

I am trying to create a dataframe from three parent (or source) dataframes (each created from a .csv file), but when writing the resulting dataframe to a file or printing on screen, columns named ...
0
votes
1answer
43 views

iterate sort within groupby

I would like to sort this Series within each level of col_0 import pandas as pd a = 'a b b a a a a b b'.split() b = 'b a b b b a a b b'.split() aS = pd.Series(a) bS = pd.Series(b) ctab = ...
1
vote
2answers
41 views

Pandas reindexing data frame issue

Say I have the following data frame, A B 0 1986-87 232131 1 1987-88 564564 2 1988-89 123125 ... And so on. I'm trying to reindex, with ...
1
vote
1answer
23 views

Pandas read_table() thousands=',' not working

I'm trying to read in some population data as an exercise to learn pandas: >>> countries = pd.read_table('country_data.txt', thousands=',', ...
2
votes
1answer
35 views

pandas - reading multiple JSON records into dataframe

I'd like to know if there is a memory efficient way of reading multi record JSON file ( each line is a JSON dict) into a pandas dataframe. Below is a 2 line example with working solution, I need it ...
1
vote
1answer
40 views

Filter by hour in Pandas

How can I filter a DataFrame indexed by datetime so that I get only the entries within certain hours of every day? I am looking for something equivalent to the following R code for an xts object ...
5
votes
2answers
72 views

Insert a link inside a pandas table

I'd like to insert a link (to a web page) inside a pandas table, so when it is displayed in ipython notebook, I could press the link. I tried the following: In [1]: import pandas as pd In [2]: df = ...
1
vote
2answers
37 views

Python Pandas max value of selected columns

data = {'name' : ['bill', 'joe', 'steve'], 'test1' : [85, 75, 85], 'test2' : [35, 45, 83], 'test3' : [51, 61, 45]} frame = pd.DataFrame(data) I would like to add a new column that shows ...
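Going by the title, the new column is presumably the row-wise maximum over some of the test columns. A minimal sketch under that assumption follows; the choice of columns and the column name best_score are illustrative, not taken from the question.

```python
import pandas as pd

# Data copied from the question excerpt.
data = {'name': ['bill', 'joe', 'steve'],
        'test1': [85, 75, 85],
        'test2': [35, 45, 83],
        'test3': [51, 61, 45]}
frame = pd.DataFrame(data)

# Assumed selection: all three test columns.
score_columns = ['test1', 'test2', 'test3']

# axis=1 takes the maximum across the selected columns for each row.
frame['best_score'] = frame[score_columns].max(axis=1)
print(frame)
```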
1
vote
1answer
35 views

python pandas text block to data frame mixed types

I am a python and pandas newbie. I have a text block that has data arranged in columns. The data in the first six columns are integers and the rest are floating point. I tried to create two DataFrames ...
1
vote
1answer
48 views

Append string to the start of each value in a said column of a pandas dataframe (elegantly)

I would like to append a string to the start of each value in a said column of a pandas dataframe (elegantly). I already figured out how to kind-of do this and I am currently using: df.ix[(df['col'] ...
0
votes
2answers
62 views

Vectorizing a Pandas dataframe for Scikit-Learn

Say I have a dataframe in Pandas like the following: > my_dataframe col1 col2 A foo B bar C something A foo A bar B foo where rows represent instances, and ...
2
votes
1answer
46 views

Interpolating a series with float index

I have the following data frame density A2 B2 0 20 1 0.525 1 30 1 0.577 2 40 1 0.789 3 50 1 1.000 4 75 1 1.000 5 100 1 1.000 I'm trying ...
1
vote
1answer
27 views

Calculating rolling_std on 4 columns in python pandas to calculate a Bollinger Band?

I'm just getting into Pandas, trying to do what I would do in excel easily just with a large data set. I have a selection of futures price data that I have input into Pandas using: df = ...
1
vote
1answer
25 views

Modifying number of ticks on Pandas hourly time axis

If I have the following example Python code using a Pandas dataframe: import pandas as pd from datetime import datetime ts = pd.DataFrame(randn(1000), index=pd.date_range('1/1/2000 00:00:00', ...
1
vote
2answers
46 views

join three pandas data frames into one?

Here are my pandas DataFrames: pandas1 = pandas.DataFrame([1,2,3,4,5,6,7,8,9]) pandas2 = pandas.DataFrame([10,20,30,40,50,60,70,80,90]) pandas3 = ...
2
votes
2answers
40 views

Trouble with using iloc in pandas dataframe with hierarchical index

I'm getting this ValueError whenever I try to give a list to iloc on a dataframe with a hierarchical index. I'm not sure if I'm doing something wrong or if this is a bug. I haven't had any issues ...
1
vote
1answer
36 views

Python Pandas isin return index

I have a pandas DataFrame df with a list of unique ids id, and a DataFrame with master list of all known ids master_df.id. I'm trying to figure out the best way to perform an isin that also returns to ...
1
vote
2answers
23 views

pandas timeseries identification values based on date index

I have a pandas 30min interval timeseries. A small sample looks like: 2009-12-02 20:00:00 0.6 2009-12-02 20:30:00 0.7 2009-12-03 01:00:00 0.7 2009-12-03 02:30:00 0.7 2009-12-03 11:30:00 ...
1
vote
1answer
34 views

python pandas: How to simplify the result of groupby('column_name').count()

Quick one: imagine we have a df which contains Walmart's global sales contacts, say, 20 columns. What I want to do is very simple: figure out how many rows there are for each country. Naively, I will ...
3
votes
1answer
25 views

Combine date column and time column into datetime column

I have a Pandas dataframe like this; (obtained by parsing an excel file) | | COMPANY NAME | MEETING DATE | MEETING TIME| ...
0
votes
2answers
41 views

python datetime: How to get next period (using aliases such as 'D', 'M') [duplicate]

Is there any way to get from a date to the next period? I.e. I am looking for a function next that takes now = datetime.datetime(2013, 11, 15, 0, 0) to next(now, 'D') = datetime.datetime(2013, ...
2
votes
2answers
69 views

Print different precision by column with pandas.DataFrame.to_csv()?

Question Is it possible to specify a float precision specifically for each column to be printed by the Python pandas package method pandas.DataFrame.to_csv? Background If I have a pandas dataframe ...
0
votes
1answer
31 views

Calculate Daily Returns with Pandas DataFrame

Here is my Pandas data frame: prices = pandas.DataFrame([1035.23, 1032.47, 1011.78, 1010.59, 1016.03, 1007.95, 1022.75, 1021.52, 1026.11, 1027.04, 1030.58, 1030.42, ...
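Assuming "daily return" means the simple percentage change from one row to the next, pct_change is one way to compute it. The sketch uses only the first five prices visible in the excerpt (the rest of the list is truncated there), and the name daily_returns is illustrative.

```python
import pandas as pd

# Only the first five prices from the excerpt; the full list is cut off in the listing.
prices = pd.DataFrame([1035.23, 1032.47, 1011.78, 1010.59, 1016.03])

# pct_change computes (p_t / p_{t-1}) - 1 for each row.
# The first row has no previous price, so its return is NaN.
daily_returns = prices.pct_change()
print(daily_returns)
```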
1
vote
1answer
42 views

How to select from different columns conditionally in pandas

I have a pandas DataFrame shaped Nx5, like ['','','A','',''] ['','C','','',''] ['','A','','',''] ['','','','T',''] . . . I want to convert it to Nx1 shape getting non-empty values ['A'] ['C'] ...
1
vote
2answers
24 views

Losing date index from dataframe in Pandas

I am trying to convert a resampled (hourly) pandas dataframe, indexed by daterun, into tuples. Here is the dataframe: ratetype p_rate v_rate daterun ...
1
vote
2answers
37 views

Use ternary operator in apply function in pandas dataframe, without grouping columns

How can I use ternary operator in the lambda function within apply function of pandas dataframe? First of all, this code is from R/plyr, which is exactly what I want to get: ddply(mtcars, .(cyl), ...
1
vote
1answer
21 views

Insert values into pandas dataframe based on MultiIndex

I have a MultiIndex pandas DataFrame as follows: df = pandas.DataFrame({"index": ["a", "a", "a", "b", "b", "b"], "id": [1,2,3,4,5,6], "name": ["jim", "jim", "jim", "bob", "bob", "bob"], ...
1
vote
3answers
64 views

Run an OLS regression with Pandas Data Frame

I have a pandas data frame and I would like to be able to predict the values of column A from the values in columns B and C. Here is a toy example: import pandas as pd df = pd.DataFrame({"A": ...
0
votes
2answers
32 views

Pandas: How to access the value of the index

I have a dataframe and would like to use the values in the index to create another column. For instance: df=pd.DataFrame({'idx1':range(0,5), 'idx2':range(10000,10005), 'value':np.random.randn(5)}) ...
1
vote
0answers
41 views

calling apply() on an empty pandas DataFrame

I'm having a problem with the apply() method of the pandas DataFrame. My issue is that apply() can return either a Series or a DataFrame, depending on the return type of the input function; however, ...
0
votes
1answer
26 views

add offset to the datetime64 column in a data frame

This is really a quick one: I am migrating from q to pandas, and I am trying to add 1 nanosecond to each item in the Date column of the data frame 'spy' >>> spy <class ...
1
vote
0answers
30 views

automatically updating columns in pandas?

In my mind, pandas is providing me with a virtual spreadsheet, like Excel. One thing about Excel spreadsheets is that you can set a column to a function. For instance T_c T T_r Series ...
