-1
votes
0answers
6 views

Cannot plot Dataframe

I am trying to plot this DataFrame, but it keeps giving me a "too many indices for array" error. I have looked at this for hours and still couldn't figure it out. I suspect it has something to do ...
0
votes
2answers
22 views

python - importing array into pandas but losing a column

import numpy as np import pandas as pd import sklearn from sklearn.datasets import load_boston Boston1 = load_boston() Boston2 = pd.DataFrame(boston.data, columns = boston.feature_names[0:13]) ...
-1
votes
0answers
31 views

matplotlib cannot plot histogram

I am starting to use Pandas and am having a problem plotting a field of a data frame. I have 2 variables, a and b, initialized as follows: import matplotlib.pyplot as plot column_name = ...
0
votes
2answers
26 views

Prepend values to pandas DataFrame based on index level of another DataFrame

Below I have two dataframes. The first dataframe (d1) has a 'Date' index, and the 2nd dataframe (d2) has a 'Date' and 'Name' index. You'll notice that d1 starts at 2014-04-30 and d2 starts at ...
0
votes
0answers
18 views

Parallelize apply after pandas groupby

I have used rosetta.parallel.pandas_easy to parallelize apply after group by, for example: from rosetta.parallel.pandas_easy import groupby_to_series_to_frame df = pd.DataFrame({'a': [6, 2, 2], 'b': ...
0
votes
2answers
18 views

interpolate values between sample years with Pandas

I'm trying to get interpolated values for the metric shown below using Pandas time series. test.csv year,metric 2020,290.72 2025,221.763 2030,152.806 2035,154.016 Code import pandas as pd df = ...
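A minimal sketch of one possible approach, using the four sample rows quoted above. Since the excerpt is cut off, it is an assumption here that the goal is one row per calendar year with straight-line interpolation between the samples; the names `csv_text`, `all_years` and `yearly` are made up for illustration.

```python
import io
import pandas as pd

# The four sample rows quoted in the question (test.csv).
csv_text = """year,metric
2020,290.72
2025,221.763
2030,152.806
2035,154.016
"""

df = pd.read_csv(io.StringIO(csv_text), index_col="year")

# Assumption: we want one row for every year between the first and last sample.
all_years = list(range(int(df.index.min()), int(df.index.max()) + 1))

# Reindexing inserts NaN rows for the missing years; interpolate() fills them
# linearly between the neighbouring known values.
yearly = df.reindex(all_years).interpolate(method="linear")
yearly.index.name = "year"

print(yearly.head(6))
```

Because the inserted years are evenly spaced, the default linear method is enough here; unevenly spaced data would need an index-aware method instead.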
0
votes
0answers
25 views

How to operate multilevel index data in Python?

Refer to the question ".json extension file + timestamp + Pandas + Python". If I group some similar data by hour and by day with x = inputData.groupby([dtIdx.day, dtIdx.hour]).size(), may I know: 1) How ...
0
votes
2answers
42 views

New Pandas Groupby API Changes

I have a dataframe, where rows have a Name, a Type, and an SLA column. The SLA column is a numerical value: 1, 2 or 3. The SLA column is specific to type, not name. I have code that creates a new ...
0
votes
1answer
21 views

Pandas Dataframe CSV export, how to prevent additional double-quote characters

I am using Pandas to process and output data for a table which is published in WordPress. I am adding HTML code to color-format one column. Starting with a sample DataFrame: import numpy as np import ...
0
votes
0answers
15 views

Using Quandl for Python behind a proxy

I'm posting this because I tried searching for the answer myself and I was not able to find a solution. I was eventually able to figure out a way to get this to work & I hope this helps someone ...
1
vote
1answer
22 views

SQL injection in pandas; binding list to params in SQLAlchemy

I have this SQL query: sql = "select * from table where date in {dl}" where dl is a tuple of dates. I can do the query by doing string.format(dl=...) then using read_sql_query in pandas, but I read ...
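A rough sketch of the general idea of binding the values instead of formatting them into the SQL string. Note the swap: it uses the standard-library `sqlite3` driver with `?` placeholders rather than SQLAlchemy, and the `events` table, its columns and the sample dates are hypothetical.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2014-10-01", 1), ("2014-10-02", 2), ("2014-10-03", 3)],
)

dl = ("2014-10-01", "2014-10-03")  # tuple of dates, as in the question

# Only the placeholder marks are formatted into the SQL text; the values
# themselves are passed separately so the driver binds them.
placeholders = ", ".join("?" for _ in dl)
sql = f"SELECT * FROM events WHERE date IN ({placeholders}) ORDER BY date"

result = pd.read_sql_query(sql, conn, params=list(dl))
print(result)
```

The placeholder syntax depends on the database driver, so the `?` marks are specific to this sqlite3 setup.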
1
vote
1answer
18 views

Formatting json from pandas dataframe

I'm trying to build out a JSON file from my dataframe that looks similar to this: {'249' : [ {'candidateId': 751, 'votes':7528, 'vote_pct':0.132 }, ...
1
vote
1answer
13 views

Python Pandas: replace given character if found in column label

I have a df which looks like this: Label PG & PCF EE-LV Centre UMP FN Label Très ...
0
votes
2answers
23 views

Create a vector with values from a Numpy array selected according to criteria in a Pandas DataFrame

I am working with a pandas df that contains two columns with integers. For each data of the df, I would like to select these two integers, use them as [row,column] pairs to extract values from a ...
0
votes
0answers
38 views

create matrix with column names and row names in Python

I'm very new to Python. I want to create an m x n matrix and add names to its columns and rows. I have a list containing row names and a list containing column names. It seems that I need to use "Pandas". ...
0
votes
1answer
30 views

Python Pandas: What is the fastest way to calculate days between two dates?

I want to calculate elapsed days like this: df["elapsed_days"] = df.apply(lambda x: (x.logged_day - x.registered_day).days, axis=1) The type of logged_day and registered_day is datetime.date(). It ...
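A minimal sketch of a vectorized alternative to the row-wise `apply`, assuming the two columns hold `datetime.date` objects as described. The sample dates are hypothetical; only the column names come from the question.

```python
import datetime
import pandas as pd

# Hypothetical sample data; the column names come from the question.
df = pd.DataFrame({
    "registered_day": [datetime.date(2014, 9, 1), datetime.date(2014, 9, 15)],
    "logged_day": [datetime.date(2014, 10, 1), datetime.date(2014, 9, 20)],
})

# Convert the datetime.date objects to datetime64 once, then subtract
# whole columns instead of calling a Python lambda per row.
logged = pd.to_datetime(df["logged_day"])
registered = pd.to_datetime(df["registered_day"])
df["elapsed_days"] = (logged - registered).dt.days

print(df)
```

Whether this is actually faster on the asker's data is not shown here. It simply avoids calling a Python function once per row.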
0
votes
1answer
30 views

Elegant way to perform column-wise operations on two dataframes

I need to perform an all-pairs column-wise operation on a dataframe. I came up with a naive solution, but I am wondering if a more elegant way is available. The following script counts the number of rows having one ...
0
votes
1answer
26 views

Joining dataframes with different datetime frequencies

I have some sparse higher frequency data (unevenly spaced) and some low frequency data (daily). How can I join this data and append corresponding low frequency data columns to the higher frequency ...
3
votes
1answer
36 views

How to get the mode for string variable when resampling with pandas

I am trying to resample a pandas data frame with a timestamp index to an hourly occurrence. I am interested in obtaining the most frequent value for a column with string values. However, the built-in ...
0
votes
1answer
38 views

Pass a Variable Name to the Arguments of a Function (Python 3, Pandas)

Say I have a DF defined as variable DF1: Words Score The man 10 A Plan 20 Panama 30 And say I have a function: def func(w, df): pattern = ...
0
votes
0answers
24 views

How do I get the average monthly values for columns when I have daily indexed data?

My data looks like this. Observation_date is the index and the two columns are values corresponding to different types of observations. observation_date 0 1 2010-01-01 44.500000 ...
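One possible sketch, assuming the index is a daily `DatetimeIndex` named `observation_date` and the two columns are labelled `0` and `1` as in the excerpt. The values below are made up, since the real data is truncated.

```python
import numpy as np
import pandas as pd

# Hypothetical daily data covering January and February 2010.
idx = pd.date_range("2010-01-01", "2010-02-28", freq="D", name="observation_date")
df = pd.DataFrame({0: np.arange(len(idx), dtype=float), 1: 1.0}, index=idx)

# Group the daily rows by calendar month and average each column.
monthly = df.groupby(df.index.to_period("M")).mean()

print(monthly)
```

Grouping on `to_period("M")` gives one row per calendar month, labelled by a monthly `Period` rather than by a month-end timestamp.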
0
votes
1answer
17 views

Conforming dataframe dtypes between read_excel() and to_excel()

I am reading a dataframe from an excel file (specifically xlsx) that contains rows and columns about vendors, including zip_code and tax_id columns. When the numbers are read IN and then I cast the ...
0
votes
1answer
19 views

Pandas set_index Multiindex Lookup

I cannot find a way to lookup a multiindex in Pandas 0.14. Here is some mock data that I'm having trouble with. Code: row1 = ['red', 'ferrari', 'mine'] row2 = ['blue', 'ferrari', 'his'] row3 = ...
0
votes
1answer
20 views

Delete multiple Pandas DataFrame rows where column value is this or that

I have a dataframe which looks like this Label Type Name ppppp ...
1
vote
1answer
18 views

pandas: split index string and sum up rows of equal slices

I want to split the row index strings of a DataFrame and sum rows of equals slices. Something like: idx val1 val2 val3 con-991-1 1 1 1 con-991-2 1 0 1 con-732 0 0 0 con-55-1 ...
0
votes
2answers
24 views

How to return a single row for rows that have duplicate values in Pandas

I would like to do it quickly, not by going from row to row as it is a rather big file. I can't find anything on pandas, although pivot_table seems to be quite close... Here is what I have: A B ...
1
vote
1answer
16 views

combination of two DF, pandas

I have two dfs. First df: A B C 1 1 3 1 1 2 1 2 5 2 2 7 2 3 7 Second df: B D 1 5 2 6 3 4 The column B has the same meaning in both dfs. What is the easiest way to add column D to the ...
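A short sketch using the two frames quoted above, assuming the goal is to look up `D` for every row of the first frame by its `B` value. The variable names `first`, `second` and `merged` are made up.

```python
import pandas as pd

# The two frames as quoted in the question.
first = pd.DataFrame({"A": [1, 1, 1, 2, 2],
                      "B": [1, 1, 2, 2, 3],
                      "C": [3, 2, 5, 7, 7]})
second = pd.DataFrame({"B": [1, 2, 3], "D": [5, 6, 4]})

# A left merge keeps every row of `first` and attaches the matching D.
merged = first.merge(second, on="B", how="left")

print(merged)
```

With `how="left"`, rows of the first frame whose `B` has no match in the second frame would be kept with `NaN` in `D` rather than dropped.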
0
votes
0answers
18 views

Reshape / merge columns in Pandas dataframe after using pivot_table

The following pivot function is working but I need to reformat results. Essentially removing the 'financialyear' label, and moving 'business' up to the same line as the financial years. df = ...
0
votes
0answers
12 views

Adding URLs to pandas DataFrame in Django

I have a pandas DataFrame containing counts that I render in a Django template with the to_html() method. I would like to add urls to the indexes and numeric data. The intention is to create GET ...
0
votes
1answer
29 views

New pandas dataframe column using values from python dictionary

I have a pandas dataframe, for example: colA colB code1 num code2 num code3 num code4 num code5 num I also have a python dictionary, for example: py_dict = {'code1': ...
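A minimal sketch of one common way to do this with `Series.map`. The dictionary values in the question are cut off, so the ones below are hypothetical, as is the new column name `colC`.

```python
import pandas as pd

df = pd.DataFrame({"colA": ["code1", "code2", "code3"],
                   "colB": [10, 20, 30]})

# Hypothetical mapping; the real py_dict values are truncated in the question.
py_dict = {"code1": "apple", "code2": "banana"}

# map() looks up each colA value in the dict; keys that are missing become NaN.
df["colC"] = df["colA"].map(py_dict)

print(df)
```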
0
votes
1answer
18 views

Pandas plotting options causing error in iPython [duplicate]

I am trying some of the pandas plotting stuff shown here. However whenever I try to use the following command to set style options as suggested pd.options.display.mpl_style = 'default' I get the ...
0
votes
3answers
37 views

Drop columns that aren't common between two dataframes?

I have two dataframes that have many columns in common but a few that do not exist in both. I would like to create a dataframe that only has the columns that are in common between both dataframes. ...
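A small sketch of one way to keep only the shared columns, using hypothetical frames `df1` and `df2`. Stacking the two results at the end is an assumption about what the asker wants.

```python
import pandas as pd

# Hypothetical frames sharing columns "b" and "c".
df1 = pd.DataFrame({"a": [1], "b": [2], "c": [3]})
df2 = pd.DataFrame({"b": [4], "c": [5], "d": [6]})

# Column labels present in both frames.
common = df1.columns.intersection(df2.columns)

# Restrict each frame to the shared columns, then stack them.
combined = pd.concat([df1[common], df2[common]], ignore_index=True)

print(combined)
```

`pd.concat([df1, df2], join="inner", ignore_index=True)` should give the same result in one step, since an inner join on the column axis keeps only the shared labels.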
0
votes
1answer
24 views

Aggregating string field into a list with python pandas

I have the following data frame in pandas: >>> df1[1:15] gene beta 1 PALMD NaN 2 PALMD NaN 3 FRRS1 1.966503 4 AGL NaN 5 AGL -4.082453 6 ...
1
vote
1answer
27 views

Get the latest of each element of a Pandas DataFrame, with range indexing and a date column?

I have a sample DataFrame as such: df = pd.DataFrame(data=[('foo', datetime.date(2014, 10, 1)), ('foo', datetime.date(2014, 10, 2)), ('bar', ...
1
vote
3answers
37 views

Finding index of a pandas DataFrame value

I am trying to process some .csv data using pandas, and I am struggling with something that I am sure is a rookie move, but after spending a lot of time trying to make this work, I need your help. ...
1
vote
1answer
25 views

Understanding groupby in pandas

I'm looking to get the sum of some values in a dataframe after it has been grouped. some sample data: Race officeID CandidateId total_votes precinct Mayor 10 705 ...
1
vote
0answers
29 views

How to access data from HDF5 with hierarchical keys?

I have created a store in HDF5 with hierarchical keys with the following structure <class 'pandas.io.pytables.HDFStore'> File path: path-analysis/data/store.h5 /attribution/attr_000000 ...
2
votes
1answer
18 views

Only one column after group by()

I am switching from R to Python for most of my data analysis needs and am running into the following issue. It could be the result of my conceptual understanding of groupby(). I have a Pandas data frame and ...
1
vote
1answer
23 views

Convert row to column header for Pandas DataFrame

The data I have to work with is a bit messy. It has header names inside of its data. How can I choose a row from an existing pandas dataframe and make it (rename it to) a column header? I want to do ...
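A rough sketch of one way to promote a row to the header. The frame `raw` and the position `header_row` are hypothetical, and discarding the rows above the header is an assumption that may not match the asker's data.

```python
import pandas as pd

# Hypothetical messy frame: the real header sits in row 1.
raw = pd.DataFrame([["junk", "junk"],
                    ["name", "score"],
                    ["ann", 1],
                    ["bob", 2]])

header_row = 1  # hypothetical position of the row holding the header names

# Keep only the rows below the header, then use that row's values as labels.
df = raw.iloc[header_row + 1:].copy()
df.columns = raw.iloc[header_row].tolist()
df = df.reset_index(drop=True)

print(df)
```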
0
votes
1answer
17 views

Conditionally grabbing column headings in pandas dataframe

I have a pandas DataFrame with many columns and indexed by probability. Below is code that can generate a sample df import numpy as N probs = N.arange(0, 1, .1) data = N.random.random_integers(0, ...
0
votes
1answer
13 views

Python pandas split value in row evenly across all duplicated ID's

Kind of stuck on this so hopefully someone can help. Generally I have a dataframe like this df = pd.DataFrame({ "id": [1,1,1,4,5,5,7], "value": [100, 100, 100, 45, 3, 3, 42] ...
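A minimal sketch, assuming that "split evenly" means dividing each value by the number of rows sharing its `id`. Only the two columns visible in the excerpt are used, and the output column name `split_value` is made up.

```python
import pandas as pd

# The two columns visible in the question's excerpt.
df = pd.DataFrame({"id": [1, 1, 1, 4, 5, 5, 7],
                   "value": [100, 100, 100, 45, 3, 3, 42]})

# Number of rows per id, broadcast back to the original row order.
group_size = df.groupby("id")["value"].transform("count")

# Each duplicated row gets an equal share of its value.
df["split_value"] = df["value"] / group_size

print(df)
```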
1
vote
1answer
44 views

Pandas aggregation ignoring NaN's

I aggregate my Pandas dataframe: data. Specifically, I want to get the average and sum amounts by tuples of [origin and type]. For averaging and summing I tried the numpy functions below: import ...
0
votes
0answers
23 views

applying a string function to a pandas dataframe column

This seems somewhat basic, but after going through Stack Overflow I couldn't take everything answered and solve my problem. So I'm working on my text processing skills. I put car reviews in a ...
0
votes
0answers
16 views

Creating and graphing Hierarchical Trees in Python with pandas

So I have hierarchical information stored within a pandas DataFrame and I would like to construct and visualize a hierarchical tree based on this information. For example, a row in my DataFrame has ...
1
vote
1answer
32 views

Pandas Groupby Apply Function to Level

I have a pandas groupby object that has the following structure: SETTLE_DATE DATE 2014-09-23 2014-09-19 0.000091 2014-09-22 0.000163 2014-10-01 2014-09-29 -0.000002 ...
2
votes
1answer
23 views

Groupwise downsampling and plotting of pd.DataFrame

I am trying to downsample grouped data to daily averages, calculated for each group, and plot the resulting time series in a single plot. My starting point is the following pd.DataFrame: value ...
0
votes
0answers
22 views

merging 2 Pandas Dataframes reduces the dataset length

I want to merge the following pandas DataFrames: first_df and second_df >>> len(first_df) 813 >>> len(second_df) 813 To merge it I'm using: third_df = pd.merge(first_df, ...
0
votes
2answers
36 views

Pandas: create timestamp from 3 columns: Month, Day, Hour

I'm using Python 2.7, pandas 0.14.1-2, numpy 1.8.1-1. I have to use Python 2.7 because I'm coupling it with something that doesn't work on Python 3. I'm trying to analyze a csv file that outputs ...
0
votes
1answer
22 views

round a single column in pandas

Is there a way to round a single column in pandas without affecting the rest of the dataframe? df: item value1 value2 0 a 1.12 1.3 1 a 1.50 2.5 2 a 0.10 ...
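A short sketch showing that rounding can be applied to one column by assigning the rounded Series back. The frame mirrors the visible rows of the excerpt; the last `value2` entry is hypothetical because the original is cut off.

```python
import pandas as pd

df = pd.DataFrame({"item": ["a", "a", "a"],
                   "value1": [1.12, 1.50, 0.10],
                   "value2": [1.3, 2.5, 0.7]})  # 0.7 is a made-up value

# Round only value1; the other columns are left untouched.
df["value1"] = df["value1"].round()

print(df)
```

`df.round({"value1": 0})` is an alternative that returns a new frame with only the named column rounded.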
0
votes
1answer
13 views

How to Stack Data Frames on top of one another (Pandas,Python3)

Let's say I have 3 Pandas DFs: DF1 Words Score The Man 2 The Girl 4 Df2 Words2 Score2 The Boy 6 The Mother 7 Df3 Words3 Score3 The Son 3 The ...