Tagged Questions
0
votes
0 answers
6 views
Pandas printing ALL dtypes
This seems like a very simple problem, but it's driving me round the bend. I'm sure it should be solved by RTFM, but I've looked at the options and I can't see the one to fix it.
I just want to ...
0
votes
1 answer
22 views
Call the name of a data frame rather than its content (Pandas)
I have a list of dataframes but when I call the content of the list it returns the content of the called dataframe.
List = [df1, df2, df3, ..., dfn]
List[1]
will ...
0
votes
2 answers
32 views
Python Pandas daily average
I'm having problems getting the daily average in a Pandas database. I've checked "Calculating daily average from irregular time series using pandas" and it doesn't help. The CSV files look like this:
...
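The CSV layout is cut off in this excerpt, so the following is only a sketch on made-up data: it assumes the readings end up in a Series with a DatetimeIndex (the name ts and the values are hypothetical). With that in place, resampling to calendar days and taking the mean gives one average per day, even when the timestamps are irregular.

```python
import pandas as pd

# Hypothetical irregular readings; the real CSV columns are not shown above.
ts = pd.Series(
    [1.0, 3.0, 10.0],
    index=pd.to_datetime(
        ["2014-01-01 00:10", "2014-01-01 13:45", "2014-01-02 08:00"]
    ),
)

# Group the readings into calendar-day bins and average each bin.
daily = ts.resample("D").mean()
print(daily)
```

Days without any readings would show up as NaN in the result, which can be dropped with dropna() if that is not wanted.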
0
votes
2 answers
22 views
Plotting selected pandas dataframe data using seaborn
I have a pandas dataframe (al_df) that contains the population of Alabama from a recent US census. I created a cumulative function that I plot using seaborn, resulting in this chart:
The code that ...
0
votes
1 answer
42 views
Convert Float to String in Pandas
I am a little confused with datatype "object" in Pandas. What exactly is "object"?
I would like to change the variable "SpT" (see below) from object to String.
> df_cleaned.dtypes
Vmag ...
0
votes
1 answer
20 views
creating a python pandas dataframe from a list of dictionaries when one entry of each dictionary is itself an array
I am trying to create a dataframe from a list of dictionaries. However, one entry of this list is itself an array (or could be a pandas.Series). I need to do grouping and averaging and I could not get ...
0
votes
2 answers
37 views
Number of non-missing values in array? Len(x) excluding missing values?
Is there a function in python that allows me to count the number of non-missing values in an array?
My data:
df.wealth1[df.wealth < 25000] = df.wealth
df.wealth2[df.wealth <50000 & ...
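For the counting part of the question, a minimal sketch on made-up data (the Series wealth below is a hypothetical stand-in, not the asker's column): Series.count() skips NaN entries, and the same number can be obtained in plain NumPy by counting the positions that are not NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data with two missing values.
wealth = pd.Series([10000.0, np.nan, 42000.0, np.nan, 75000.0])

# pandas: count() ignores missing values, unlike len().
n_present = wealth.count()

# NumPy equivalent: count the entries that are not NaN.
n_present_np = int(np.count_nonzero(~np.isnan(wealth.to_numpy())))

print(len(wealth), n_present, n_present_np)
```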
0
votes
1 answer
28 views
Pandas: Filling up empty dataframe
I have two questions. First, my filling up the data in the end triggers the following error. Second, since I am not too familiar with "pandas", this code is probably really untypical. If you have ...
2
votes
3 answers
36 views
iterrows pandas get next rows value
I have a df in pandas
import pandas as pd
df = pd.DataFrame(['AA', 'BB', 'CC'], columns = ['value'])
I want to iterate over the rows in df. For each row I want the row's value and the next row's value.
Something ...
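One way to get each row's value next to its successor without an explicit loop is shift(-1). This is only a sketch: the column name next_value is made up for illustration, and the last row has no successor, so it gets NaN.

```python
import pandas as pd

df = pd.DataFrame(['AA', 'BB', 'CC'], columns=['value'])

# shift(-1) moves every value up by one row, so each row sees the next row's value.
# 'next_value' is a hypothetical column name.
df['next_value'] = df['value'].shift(-1)

# If iteration is still needed, walk over the two columns in parallel.
pairs = list(zip(df['value'], df['next_value']))
for current, following in pairs:
    print(current, following)
```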
0
votes
0 answers
23 views
Why frequencies in goProfiles are not the same with a sliced dataframe?
I've built a specific DataFrame with python pandas to compute ontology frequencies with goProfiles in bioconductor. I use the basicProfile function with option 'GOTermsFrame' but without the optional ...
0
votes
2 answers
28 views
python and pandas - how to access a column using iterrows
wowee.....how to use iterrows with python and pandas? If I do a row iteration should I not be able to access a col with row['COL_NAME']?
Here are the col names:
print df
Int64Index: 152 entries, 0 ...
0
votes
1 answer
32 views
new columns in index created inside for loop
I am trying to create something that measures up days and down days in stocks as measured by a higher or lower close than the day before. This is displayed as a 1 for an 'up day' and a -1 for a 'down ...
0
votes
0 answers
21 views
Plotting error bars on barplots with multiple series in pandas
I can plot error bars on single series barplots like so:
import pandas as pd
df = pd.DataFrame([[4,6,1,3], [5,7,5,2]], columns = ['mean1', 'mean2', 'std1', 'std2'], index=['A', 'B'])
print(df)
...
1
vote
1 answer
29 views
Python boxplot out of columns of different lengths
I have the following dataframe in Python (the actual dataframe is much bigger, just presenting a small sample):
A B C D E F
0 0.43 0.52 0.96 1.17 1.17 2.85
1 0.43 ...
0
votes
1 answer
34 views
How to apply a function to a mixed type Pandas DataFrame in place?
This is how I apply a function to Pandas dataframe, it works in place and modifies the original data frame.
df = pd.DataFrame([[0,0,0],
[0,0,0],
[0,0,0]],
...
0
votes
1 answer
24 views
Why am I getting strange behavior for creating several boolean series?
I have a DataFrame to which I am adding several boolean columns. For each column, I initialize it to False and then set some values to True. If I do this for one and then for another, the first gets ...
0
votes
1 answer
25 views
Regression of a timeseries delta in pandas
Let's say I have a timeseries like this
A B
0 a b
1 c d
2 e f
3 g h
0, 1, 2, 3 are times; a, c, e, g is one time series and b, d, f, h is another time series.
What I need is a ...
2
votes
1 answer
36 views
Merging two dataframes in pandas without column names (new to pandas)
Short explanation:
If you have duplicate column names in your data, be sure to rename one column when you read the file.
If you have NaN etc in your data, remove those.
Then merge using correct ...
0
votes
0 answers
34 views
Python basemap and proj.4 error for map plotting
I ran the Python code below, which is the example "Plotting Maps: Visualizing Haiti Earthquake Crisis Data" from the book Python for Data Analysis (pages 242-246).
The code is supposed to create a plot map ...
1
vote
1 answer
16 views
How can I find the previous row in a DataFrame with a Timestamp index?
I want to zero the last NaN at the start of a DataFrame in Pandas. My DataFrame objects have timestamps in.
Example data
If I have this data:
In [228]: my_df
Out[228]:
blah
1990-01-01 ...
0
votes
1 answer
10 views
Pandas DataFrame - convert months to datetime and iteratively select data from multiple columns for plotting
Say I have a pandas DataFrame with the format:
Month Thing1 Thing2 Tot
0 Jan-12 A Z 0.005880
1 Jan-12 A Z 0.024500
...
20 Jan-12 B Y 0.001533
21 ...
1
vote
2 answers
44 views
Python pandas extracting hyphenated words from cells with phrases
I have a dataframe which contains phrases and I want to extract only compound words separated by a hyphen from the dataframe and place them in another dataframe.
df=pd.DataFrame({'Phrases': ['Trail 1 ...
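The sample data is cut off above, so this sketch uses made-up phrases. It assumes a "compound word" means two or more runs of word characters joined by hyphens, and collects the matches in a second dataframe; the names compound_df and Compound are hypothetical.

```python
import pandas as pd

# Hypothetical phrases; the original sample data is truncated in the excerpt.
df = pd.DataFrame({'Phrases': ['a well-known trail',
                               'no compounds here',
                               'state-of-the-art and up-to-date']})

# Two or more word-character runs joined by hyphens.
pattern = r'\b\w+(?:-\w+)+\b'

# findall returns a list of matches per cell (an empty list when there are none).
compounds = df['Phrases'].str.findall(pattern)

# Flatten the per-row lists into one column of a new dataframe.
flat = [word for words in compounds for word in words]
compound_df = pd.DataFrame({'Compound': flat})
print(compound_df)
```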
0
votes
2 answers
35 views
How do I get a Pandas TimeSeries for user sessions (using Pandas or Numpy)
I've got some data which has the login and logout times for a series of users.
Input:
Login Logout
User_1 10:25AM 6:01PM
User_2 8:58AM 5:12PM
User_3 9:23AM 1:35PM
...
0
votes
1 answer
28 views
How can I get a pandas dataframe into CSV format with different formats per columns?
Issues for pandas' DataFrame text output functions:
to_csv() does not support the 'formatters' parameter of to_string(). I need different formats for each column.
to_string() does not support a ...
0
votes
1 answer
23 views
create new variable according to existing variables using pandas
I have imported a CSV file into pandas as "nhs_df", as follows:
I want to create a binary variable called "bmi_cate" so if bmi>25 bmi_cate=1 and otherwise
I write the following code but seems do ...
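The asker's code is not shown in this excerpt, so this is only a sketch on made-up data. It assumes the "otherwise" case should be 0, which the excerpt does not state: the comparison yields a boolean Series, and astype(int) turns True/False into 1/0.

```python
import pandas as pd

# Hypothetical stand-in for the imported CSV; only a 'bmi' column is assumed.
nhs_df = pd.DataFrame({'bmi': [22.5, 27.1, 25.0, 31.4]})

# True where bmi > 25, then cast to 1/0. Assumption: 0 is the "otherwise" value.
nhs_df['bmi_cate'] = (nhs_df['bmi'] > 25).astype(int)
print(nhs_df)
```

Note that a missing bmi compares as False here and would therefore be coded 0; if missing values should stay missing, they need separate handling.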
1
vote
2 answers
68 views
Different result of code example on book: Python for Data Analysis
I have a question about the book "Python for Data Analysis", if anyone is interested in this book.
After running an example on page 244 (Plotting Maps: Visualizing Haiti Earthquake Crisis Data), my result ...
0
votes
1 answer
19 views
Change pandas plot background color
Is it possible to change the background in a pandas chart?
I would like to change the background from white and the line to orange but I cannot find any documentation to do that.
I am using the ...
0
votes
0 answers
34 views
getting a warning i am not familiar with
I am using Python 2.7 and keep getting the below error. Please let me know if you need the full code but it is a bit long. Thank you for your help.
Warning (from warnings module):
File ...
2
votes
0 answers
31 views
Is pandas.read_pickle() performance crippled in version 0.13?
After upgrading to pandas 0.13 from 0.12, I noticed that large pickle files were taking
much longer to load with pandas.read_pickle(). It seems like 0.13 no longer uses cPickle for reading - is that ...
1
vote
4 answers
29 views
In pandas/python, reading array stored as string
I have a pandas dataframe where one of the columns has array of strings as each element.
So something like this.
col1 col2
0 120 ['abc', 'def']
1 130 ['ghi', 'klm']
Now when I store this to ...
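The storage step is cut off above, but assuming the lists come back as their string representation (for example after a round trip through a text file), a sketch of one way to recover them is ast.literal_eval, which parses Python literals without executing arbitrary code.

```python
import ast

import pandas as pd

# Assumed state after reloading: col2 holds strings that look like lists.
df = pd.DataFrame({'col1': [120, 130],
                   'col2': ["['abc', 'def']", "['ghi', 'klm']"]})

# Parse each string back into a real Python list.
df['col2'] = df['col2'].apply(ast.literal_eval)
print(type(df['col2'][0]), df['col2'][0])
```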
0
votes
0 answers
20 views
Iteratively add to pandas panel
I need to iteratively add dataframes to a panel since the data is too large to be held in memory.
A sample code reads like this:
import pandas as pd
import numpy as np
...
2
votes
2 answers
69 views
Python numpy or pandas equivalent of the R function sweep()
What is numpy or pandas equivalent of the R function sweep()?
To elaborate: in R let's say we have a coefficient vector (say beta - numeric type) and an array (say data - 20x5 numeric type). I want to ...
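The operation the asker wants is cut off, so multiplication and subtraction below are assumptions chosen for illustration. In NumPy, what sweep() does is usually covered by broadcasting: a length-5 vector lines up with the columns of a 20x5 array directly, and a length-20 vector lines up with the rows once it is reshaped into a column.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(20, 5))               # stand-in for the 20x5 array
beta = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # stand-in coefficient vector

# Roughly sweep(data, 2, beta, "*"): scale column j by beta[j].
scaled_cols = data * beta

# Roughly sweep(data, 1, row_vec, "-"): subtract row_vec[i] from row i.
row_vec = np.arange(20.0)                     # hypothetical per-row vector
shifted_rows = data - row_vec[:, None]

print(scaled_cols.shape, shifted_rows.shape)
```

The same expressions work on a pandas DataFrame, where mul, sub and related methods also take an axis argument to pick the direction.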
2
votes
1 answer
20 views
add multiple items to row using vectorized pandas function, not iterrows?
I have a rather large bioinformatics dataset that I'm processing using pandas. It looks something like this:
>>> df = pd.DataFrame([['a=1|b=4', 'a=2|b=3', 'a=1|b=1'],
[None]*3, ...
0
votes
0 answers
19 views
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' when converting series object to unicode in pandas with utf-16
I have a utf-16 csv file that I'm trying to load into Pandas. By default the data comes in as an object datatype. I plan to do some modeling with the caption column so I'd like to convert the column ...
0
votes
2 answers
19 views
Create a dummy variable by personid
I have a time series dataset with individuals and dates. I would like to create a dummy variable, "newpers", which assumes the value 1 for the first time, chronologically, the id shows up in the ...
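A sketch on made-up data, assuming the id column is called personid and the date column date (both names are guesses from the title and text): after sorting by date, duplicated() flags every occurrence of an id except the first, so its negation marks the first appearance.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions.
df = pd.DataFrame({
    'personid': [1, 2, 1, 3, 2],
    'date': pd.to_datetime(['2014-01-05', '2014-01-03', '2014-01-02',
                            '2014-01-04', '2014-01-06']),
})

# Sort chronologically so "first occurrence" means "earliest date".
df = df.sort_values('date')

# duplicated() is False only for the first row of each personid.
df['newpers'] = (~df.duplicated('personid')).astype(int)
print(df.sort_index())
```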
0
votes
0 answers
17 views
Nongeneral columns with read_excel using pandas?
So I'm trying to read data from an excel file that has a date column, and a price column. The read_excel function in pandas doesn't seem to like that. I've been looking around for a way to convert ...
0
votes
2 answers
39 views
Pandas DataFrame stored list as string: How to convert back to list?
I have an n-by-m Pandas DataFrame df defined as follows. (I know this is not the best way to do it. It makes sense for what I'm trying to do in my actual code, but that would be TMI for this post so ...
0
votes
1 answer
32 views
Python set value multiindex Pandas
I'm a newbie to both Python and Pandas.
I am trying to construct a dataframe, and then later populate it with values.
I have constructed my dataframe like so :
from pandas import *
ageMin = 21
...
1
vote
0 answers
26 views
pandas date_range does not start from start date
I recently started using pandas and I have some issue regarding date_range.
In [168]: pd.date_range("2013-07-01", "2013-10-03", freq='W').to_series()
Out[168]:
2013-07-07 2013-07-07
2013-07-14 ...
0
votes
0 answers
35 views
Train OLS model with pandas concurrently
I want to train multiple OLS models (in fact, the number is about 500 in total) with pandas. Could I do this in a batched fashion?
Here is a simplified case, in which I want to train 5 models, say ...
0
votes
1 answer
21 views
Shift entire column on a pandas dataframe
I want to shift an entire column to the right, fill the first column with NaN and drop the last column:
df0: A B C D
2013-12-31 10 6 6 5
2014-01-31 11 7 5 5
...
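Using the two rows visible above (the rest of df0 is cut off), a sketch of this with DataFrame.shift along the column axis: the labels stay in place, the values move one column to the right, the first column becomes NaN and the old last column falls off.

```python
import pandas as pd

# Only the two rows shown in the excerpt.
df0 = pd.DataFrame(
    {'A': [10, 11], 'B': [6, 7], 'C': [6, 5], 'D': [5, 5]},
    index=pd.to_datetime(['2013-12-31', '2014-01-31']),
)

# axis=1 shifts across columns instead of down the rows.
shifted = df0.shift(1, axis=1)
print(shifted)
```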
2
votes
1 answer
17 views
How to use Rolling OLS Model interface in Pandas?
I want to calibrate a MovingOLS, but keep receiving the error message
IndexError: index -1 is out of bounds for axis 0 with size 0
The data frame I used to train the MovingOLS is as below:
x1 y ...
2
votes
1 answer
28 views
How to write DataFrame to postgres table?
There is a DataFrame.to_sql method, but it works only for mysql, sqlite and oracle databases. I can't pass a postgres connection or sqlalchemy engine to this method.
3
votes
1 answer
39 views
Python Pandas updating dataframe and counting the number of cells updated
Let's say I am updating my dataframe with another dataframe (df2)
import pandas as pd
import numpy as np
df=pd.DataFrame({'axis1': ['Unix','Window','Apple','Linux'],
'A': ...
0
votes
0 answers
17 views
Pandas ValueError when attempting to select data using a boolean operator
I'm trying to use this code to create a new Pandas DataFrame consisting of rows where both of my columns of interest have values.
sve2_hz = sve2_all[[(sve2_all[' Q l/s'].notnull()) & ...
0
votes
1 answer
13 views
Reshape Dataframe columns to rows
I have a DataFrame that looks like this:
Year US China Russia
2007 NaN 45 12
2008 12 22 4
2009 12 NaN 41
I want it reshaped to look like this:
Year Country Value
...
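The target table is cut off above, so this is a sketch that only assumes the Year, Country and Value headers shown: pd.melt keeps Year as an identifier and stacks the country columns into rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Year': [2007, 2008, 2009],
    'US': [np.nan, 12, 12],
    'China': [45, 22, np.nan],
    'Russia': [12, 4, 41],
})

# One row per (Year, Country) pair; the former column names go into 'Country'.
long_df = pd.melt(df, id_vars='Year', var_name='Country', value_name='Value')
print(long_df)
```

Rows with NaN values are kept here; whether they should be dropped (for example with dropna(subset=['Value'])) is not visible from the excerpt.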
0
votes
1 answer
18 views
Pandas: Timing difference between Function and Apply to Series
I am trying to figure out why these two methods differ in %timeit results.
import pandas as pd
import numpy as np
d = pd.DataFrame(data={'S1' : [2,3,4,5,6,7,2], 'S2' : [4,5,2,3,4,6,8]}, \
...
0
votes
1 answer
47 views
Why is it so much slower to export my data to .xlsx than to .xls or .csv?
I have a dataframe that I'm exporting to Excel, and people want it in .xlsx. I use to_excel, but when I change the extension from .xls to .xlsx, the exporting step takes about 9 seconds as opposed to ...
0
votes
0 answers
31 views
cx_freeze fails to create exe with pandas library
Having problems creating an exe using cx_freeze with the Pandas library. I have seen lots of others having issues with NumPy but I was able to successfully bring in NumPy. My big pain point has been ...
0
votes
1 answer
37 views
How do I plot this DataFrame?
I've got a pandas.DataFrame that looks like this:
>>> print df
0 1 2 3 4 5 6 7 8 9 10 11 \
0 0.198 0.198 0.266 0.198 0.236 0.199 0.198 ...