Tagged Questions
-1
votes
0answers
6 views
Cannot plot Dataframe
I am trying to plot this DataFrame, but it keeps giving me a "too many indices for array" error. I have looked at this for hours and I still couldn't figure it out.
I suspect it has something to do ...
0
votes
2answers
22 views
python - importing array into pandas but losing a column
import numpy as np
import pandas as pd
import sklearn
from sklearn.datasets import load_boston
Boston1 = load_boston()
Boston2 = pd.DataFrame(Boston1.data, columns = Boston1.feature_names[0:13])
...
-1
votes
0answers
31 views
matplotlib cannot plot histogram
I am starting to use Pandas and am having some problems plotting a field of a data frame.
I have 2 variables, a and b, initialized as follows:
import matplotlib.pyplot as plot
column_name = ...
0
votes
2answers
26 views
Prepend values to a pandas dataframe based on index level of another dataframe
Below I have two dataframes. The first dataframe (d1) has a 'Date' index, and the 2nd dataframe (d2) has a 'Date' and 'Name' index.
You'll notice that d1 starts at 2014-04-30 and d2 starts at ...
0
votes
0answers
18 views
Parallelize apply after pandas groupby
I have used rosetta.parallel.pandas_easy to parallelize apply after group by, for example:
from rosetta.parallel.pandas_easy import groupby_to_series_to_frame
df = pd.DataFrame({'a': [6, 2, 2], 'b': ...
0
votes
2answers
18 views
interpolate values between sample years with Pandas
I'm trying to get interpolated values for the metric shown below using Pandas time series.
test.csv
year,metric
2020,290.72
2025,221.763
2030,152.806
2035,154.016
Code
import pandas as pd
df = ...
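A minimal sketch of one way to fill in the intermediate years, assuming linear interpolation between the sample years is what is wanted (the excerpt does not say). The names `csv_text`, `all_years` and `yearly` are made up for illustration; the data is the test.csv shown above.

```python
import io

import pandas as pd

# Sample data copied from the question's test.csv.
csv_text = """year,metric
2020,290.72
2025,221.763
2030,152.806
2035,154.016
"""

df = pd.read_csv(io.StringIO(csv_text), index_col="year")

# Build an index containing every year in the sampled range.
all_years = pd.RangeIndex(int(df.index.min()), int(df.index.max()) + 1, name="year")

# Reindexing inserts NaN rows for the missing years; interpolate() then
# fills them linearly between the neighbouring known values.
yearly = df.reindex(all_years).interpolate(method="linear")
```

Because every year is present after the reindex, the rows are evenly spaced and the default linear method gives the expected straight-line values.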
0
votes
0answers
25 views
How to operate multilevel index data in Python?
Refer to question .json extension file + timestamp + Pandas + Python
If I group some similar data by Hour and by Day with x = inputData.groupby([dtIdx.day, dtIdx.hour]).size()
May I know:
1) How ...
0
votes
2answers
42 views
New Pandas Groupby API Changes
I have a dataframe, where rows have a Name, a Type, and an SLA column. The SLA column is a numerical value: 1, 2 or 3. The SLA column is specific to type, not name.
I have code that creates a new ...
0
votes
1answer
21 views
Pandas Dataframe CSV export, how to prevent additional double-quote characters
I am using Pandas to process and output data for a table which is published in WordPress.
I am adding HTML code to color-format one column.
Starting with a sample Dataframe:
import numpy as np
import ...
0
votes
0answers
15 views
Using Quandl for Python behind a proxy
I'm posting this because I tried searching for the answer myself and I was not able to find a solution. I was eventually able to figure out a way to get this to work & I hope this helps someone ...
1
vote
1answer
22 views
SQL injection in pandas; binding list to params in SQLAlchemy
I have this SQL query:
sql = "select * from table where date in {dl}"
where dl is a tuple of dates. I can do the query by doing string.format(dl=...) then using read_sql_query in pandas, but I read ...
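A hedged sketch of binding the values instead of formatting them into the query text. It uses the standard-library `sqlite3` driver rather than SQLAlchemy, and the table name `events`, its columns and its rows are invented for illustration. SQLAlchemy offers an `expanding=True` option on `bindparam` for the same purpose.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2014-10-01", 1), ("2014-10-02", 2), ("2014-10-03", 3)],
)

dl = ("2014-10-01", "2014-10-03")

# Only the "?" placeholders are formatted into the SQL text (one per value);
# the dates themselves are passed separately as bound parameters.
placeholders = ", ".join("?" for _ in dl)
sql = f"SELECT * FROM events WHERE date IN ({placeholders}) ORDER BY date"
result = pd.read_sql_query(sql, conn, params=list(dl))
```

The placeholder syntax depends on the database driver; `?` is what `sqlite3` expects.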
1
vote
1answer
18 views
Formatting json from pandas dataframe
I'm trying to build out a JSON file from my dataframe that looks similar to this:
{'249' : [
{'candidateId': 751,
'votes':7528,
'vote_pct':0.132
},
...
1
vote
1answer
13 views
Python Pandas: replace given character if found in column label
I have a df which looks like this:
Label PG & PCF EE-LV Centre UMP FN
Label
Très ...
0
votes
2answers
23 views
Create a vector with values from a Numpy array selected according to criteria in a Pandas DataFrame
I am working with a pandas df that contains two columns with integers. For each data of the df, I would like to select these two integers, use them as [row,column] pairs to extract values from a ...
0
votes
0answers
38 views
create matrix with column names and row names in Python
I'm very new to Python. I want to create an m x n matrix and add names to its columns and rows. I have a list containing the row names and a list containing the column names. It seems that I need to use "Pandas". ...
0
votes
1answer
30 views
Python Pandas: What is the fastest way to calculate days between two dates?
I want to calculate elapsed days like this:
df["elapsed_days"] = df.apply(lambda x: (x.logged_day - x.registered_day).days, axis=1)
The type of logged_day and registered_day is datetime.date().
It ...
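One possible vectorized alternative to the row-wise apply, sketched on made-up sample rows: convert both columns to datetime64 once, subtract the whole columns, and read the day count from the resulting timedeltas. Whether this is faster on the asker's data is an assumption, although avoiding a Python-level loop usually helps.

```python
import datetime

import pandas as pd

# Hypothetical sample rows; the columns hold datetime.date objects as in the question.
df = pd.DataFrame({
    "registered_day": [datetime.date(2014, 9, 1), datetime.date(2014, 9, 15)],
    "logged_day": [datetime.date(2014, 10, 1), datetime.date(2014, 9, 20)],
})

# Convert the object columns to datetime64 so the subtraction is vectorized.
logged = pd.to_datetime(df["logged_day"])
registered = pd.to_datetime(df["registered_day"])

# The difference is a timedelta64 Series; .dt.days extracts whole days.
df["elapsed_days"] = (logged - registered).dt.days
```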
0
votes
1answer
30 views
Elegant way to perform column-wise operations on two dataframes
I need to apply a column-wise operation to all pairs of columns in a dataframe. I came up with a naive solution but am wondering if a more elegant way is available.
The following script counts the number of rows having one ...
0
votes
1answer
26 views
Joining dataframes with different datetime frequencies
I have some sparse higher frequency data (unevenly spaced) and some low frequency data (daily).
How can I join this data and append corresponding low frequency data columns to the higher frequency ...
3
votes
1answer
36 views
How to get the mode for string variable when resampling with pandas
I am trying to resample a pandas data frame with a timestamp index to an hourly frequency. I am interested in obtaining the most frequent value for a column with string values. However, the built-in ...
0
votes
1answer
38 views
Pass a Variable Name to the Arguments of a Function (Python 3, Pandas)
Say I have a DF defined as variable DF1:
Words Score
The man 10
A Plan 20
Panama 30
And say I have a function:
def func(w, df):
pattern = ...
0
votes
0answers
24 views
How do I get the average monthly values for columns when I have daily indexed data?
My data looks like this. Observation_date is the index and the two columns are values corresponding to different types of observations.
observation_date 0 1
2010-01-01 44.500000 ...
0
votes
1answer
17 views
Conforming dataframe dtypes between read_excel() and to_excel()
I am reading a dataframe from an excel file (specifically xlsx) that contains rows and columns about vendors, including zip_code and tax_id columns. When the numbers are read IN and then I cast the ...
0
votes
1answer
19 views
Pandas set_index Multiindex Lookup
I cannot find a way to lookup a multiindex in Pandas 0.14. Here is some mock data that I'm having trouble with.
Code:
row1 = ['red', 'ferrari', 'mine']
row2 = ['blue', 'ferrari', 'his']
row3 = ...
0
votes
1answer
20 views
Delete multiple Pandas DataFrame rows where column value is this or that
I have a dataframe which looks like this
Label Type
Name
ppppp ...
1
vote
1answer
18 views
pandas: split index string and sum up rows of equal slices
I want to split the row index strings of a DataFrame and sum rows of equal slices. Something like:
idx val1 val2 val3
con-991-1 1 1 1
con-991-2 1 0 1
con-732 0 0 0
con-55-1 ...
0
votes
2answers
24 views
How to return a single row for rows that have duplicate values in Pandas
I would like to do it quickly, not by going from row to row as it is a rather big file. I can't find anything on pandas, although pivot_table seems to be quite close... Here is what I have:
A B
...
1
vote
1answer
16 views
combination of two DF, pandas
I have two df,
First df
A B C
1 1 3
1 1 2
1 2 5
2 2 7
2 3 7
Second df
B D
1 5
2 6
3 4
The column B has the same meaning in both dfs. What is the easiest way to add column D to the ...
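A minimal sketch using the two frames shown above, assuming the goal is to look up D for each row of the first df by its B value. The names `df1`, `df2` and `merged` are chosen here for illustration.

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({
    "A": [1, 1, 1, 2, 2],
    "B": [1, 1, 2, 2, 3],
    "C": [3, 2, 5, 7, 7],
})
df2 = pd.DataFrame({"B": [1, 2, 3], "D": [5, 6, 4]})

# A left merge keeps every row of df1 in its original order and
# attaches the D value whose B matches.
merged = df1.merge(df2, on="B", how="left")
```

A left join is used so that rows of the first df survive even if their B value is missing from the second df; in that case D would be NaN.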
0
votes
0answers
18 views
Reshape / merge columns in Pandas dataframe after using pivot_table
The following pivot function is working but I need to reformat results. Essentially removing the 'financialyear' label, and moving 'business' up to the same line as the financial years.
df = ...
0
votes
0answers
12 views
Adding URLs to pandas DataFrame in Django
I have a pandas DataFrame containing counts that I render in a Django template with the to_html() method. I would like to add urls to the indexes and numeric data. The intention is to create GET ...
0
votes
1answer
29 views
New pandas dataframe column using values from python dictionary
I have a pandas dataframe, for example:
colA colB
code1 num
code2 num
code3 num
code4 num
code5 num
I also have a python dictionary, for example:
py_dict = {'code1': ...
0
votes
1answer
18 views
Pandas plotting options causing error in iPython [duplicate]
I am trying some of the pandas plotting stuff shown here. However whenever I try to use the following command to set style options as suggested
pd.options.display.mpl_style = 'default'
I get the ...
0
votes
3answers
37 views
Drop columns that aren't common between two dataframes?
I have two dataframes that have many columns in common but a few that do not exist in both. I would like to create a dataframe that only has the columns that are in common between both dataframes. ...
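One way this could be done, sketched on two small hypothetical frames: take the intersection of the two column indexes and select those columns from each frame.

```python
import pandas as pd

# Hypothetical frames sharing columns "a" and "b" only.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4], "only_in_1": [5, 6]})
df2 = pd.DataFrame({"a": [7, 8], "b": [9, 10], "only_in_2": [11, 12]})

# Column labels present in both frames.
common = df1.columns.intersection(df2.columns)

# Restrict each frame to the shared columns.
df1_common = df1[common]
df2_common = df2[common]
```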
0
votes
1answer
24 views
Aggregating string field into a list with python pandas
I have the following data frame in pandas:
>>> df1[1:15]
gene beta
1 PALMD NaN
2 PALMD NaN
3 FRRS1 1.966503
4 AGL NaN
5 AGL -4.082453
6 ...
1
vote
1answer
27 views
Get the latest of each element of a Pandas DataFrame, with range indexing and a date column?
I have a sample DataFrame as such:
df = pd.DataFrame(data=[('foo', datetime.date(2014, 10, 1)),
('foo', datetime.date(2014, 10, 2)),
('bar', ...
1
vote
3answers
37 views
Finding index of a pandas DataFrame value
I am trying to process some .csv data using pandas, and I am struggling with something that I am sure is a rookie move, but after spending a lot of time trying to make this work, I need your help.
...
1
vote
1answer
25 views
Understanding groupby in pandas
I'm looking to get the sum of some values in a dataframe after it has been grouped.
some sample data:
Race officeID CandidateId total_votes precinct
Mayor 10 705 ...
1
vote
0answers
29 views
How to access data from HDF5 with hierarchical keys?
I have created a store in HDF5 with hierarchical keys with the following structure
<class 'pandas.io.pytables.HDFStore'>
File path: path-analysis/data/store.h5
/attribution/attr_000000 ...
2
votes
1answer
18 views
Only one column after group by()
I am switching from R to Python for most of my data analysis needs and am running into the following issue. It could be the result of my conceptual understanding of groupby().
I have a Pandas data frame and ...
1
vote
1answer
23 views
Convert row to column header for Pandas DataFrame
The data I have to work with is a bit messy. It has header names inside of its data. How can I choose a row from an existing pandas dataframe and make it (rename it to) a column header?
I want to do ...
0
votes
1answer
17 views
Conditionally grabbing column headings in pandas dataframe
I have a pandas DataFrame with many columns and indexed by probability. Below is code that can generate a sample df
import numpy as N
probs = N.arange(0, 1, .1)
data = N.random.random_integers(0, ...
0
votes
1answer
13 views
Python pandas split value in row evenly across all duplicated ID's
Kind of stuck on this so hopefully someone can help. Generally I have a dataframe like this
df = pd.DataFrame({
"id": [1,1,1,4,5,5,7],
"value": [100, 100, 100, 45, 3, 3, 42]
...
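The excerpt is cut off before the desired output, so the following is only a guess at the intent: divide each value by the number of rows that share its id. The column name `split_value` is made up for illustration.

```python
import pandas as pd

# Data as shown in the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 4, 5, 5, 7],
    "value": [100, 100, 100, 45, 3, 3, 42],
})

# transform("count") returns, for every row, the size of that row's id group,
# aligned with the original index so it can be used directly in the division.
group_size = df.groupby("id")["value"].transform("count")
df["split_value"] = df["value"] / group_size
```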
1
vote
1answer
44 views
Pandas aggregation ignoring NaN's
I aggregate my Pandas dataframe: data. Specifically, I want to get the average and sum amounts by tuples of [origin and type]. For averaging and summing I tried the numpy functions below:
import ...
0
votes
0answers
23 views
applying a string function to a pandas dataframe column
This seems somewhat basic, but after going through Stack Overflow I couldn't take everything answered and solve my problem. So I'm working on my text processing skills. I put car reviews in a ...
0
votes
0answers
16 views
Creating and graphing Hierarchical Trees in Python with pandas
So I have hierarchical information stored within a pandas DataFrame and I would like to construct and visualize a hierarchical tree based on this information.
For example, a row in my DataFrame has ...
1
vote
1answer
32 views
Pandas Groupby Apply Function to Level
I have a pandas groupby object that has the following structure:
SETTLE_DATE DATE
2014-09-23 2014-09-19 0.000091
2014-09-22 0.000163
2014-10-01 2014-09-29 -0.000002
...
2
votes
1answer
23 views
Groupwise downsampling and plotting of pd.DataFrame
I am trying to downsample grouped data to daily averages, calculated for each group, and plot the resulting time series in a single plot.
My starting point is the following pd.DataFrame:
value ...
0
votes
0answers
22 views
merging 2 Pandas Dataframes reduces the dataset length
I want to merge the following pandas DataFrames: first_df and second_df
>>> len(first_df)
813
>>> len(second_df)
813
To merge it I'm using:
third_df = pd.merge(first_df, ...
0
votes
2answers
36 views
Pandas: create timestamp from 3 columns: Month, Day, Hour
I'm using Python 2.7, panda 0.14.1-2, numpy 1.8.1-1. I have to use Python 2.7 because I'm coupling it with something that doesn't work on Python 3
I'm trying to analyze a CSV file that outputs ...
0
votes
1answer
22 views
round a single column in pandas
Is there a way to round a single column in pandas without affecting the rest of the dataframe?
df:
item value1 value2
0 a 1.12 1.3
1 a 1.50 2.5
2 a 0.10 ...
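A short sketch of rounding only one column, using the two complete rows visible above (the third row is truncated in the excerpt and is left out). Rounding to one decimal place is an arbitrary choice for illustration.

```python
import pandas as pd

# The two fully visible rows from the question.
df = pd.DataFrame({
    "item": ["a", "a"],
    "value1": [1.12, 1.50],
    "value2": [1.3, 2.5],
})

# Series.round acts on the selected column only; assigning the result back
# leaves every other column untouched.
df["value1"] = df["value1"].round(1)
```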
0
votes
1answer
13 views
How to Stack Data Frames on top of one another (Pandas,Python3)
Let's say I have 3 pandas DataFrames
DF1
Words Score
The Man 2
The Girl 4
Df2
Words2 Score2
The Boy 6
The Mother 7
Df3
Words3 Score3
The Son 3
The ...