Tagged Questions
1
vote
2answers
10 views
numpy array elementwise multiply with a pandas timeseries
I have these two data structures:
a = np.array([1,2,3])
ts = pd.TimeSeries([1,2,3])
What I want to get at the end is:
1 2 3
2 4 6
3 6 9
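A possible approach, offered as a sketch rather than an accepted answer: the table above is the outer product of the two inputs, so `np.outer` may be all that is needed. This assumes a pandas version where `pd.Series` is used in place of the older `pd.TimeSeries` alias; the name `table` is made up for illustration.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 3])
ts = pd.Series([1, 2, 3])  # assumption: pd.Series stands in for pd.TimeSeries

# Outer product: table[i, j] = ts[i] * a[j]
table = pd.DataFrame(np.outer(ts.to_numpy(), a), index=ts.index)
print(table)
```

Wrapping the result in a DataFrame keeps the index of `ts` on the rows; drop the wrapper if a plain array is enough.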
0
votes
0answers
7 views
specific sort pandas DataFrame index
I'm trying to sort the index of a pandas DataFrame in a specific way, but I'm having difficulty getting what I want.
I have the following df:
index = ...
0
votes
1answer
13 views
Using a Pandas dataframe index as values for x-axis in matplotlib plot
I have time series in a pandas DataFrame with a number of columns which I'd like to plot. Is there a way to set the x-axis to always use the index from a DataFrame?
When I use the .plot() method from ...
0
votes
0answers
15 views
How can I write a csv file with multiple header lines with pandas to_csv()?
Consider a data frame with a date column as an index and three columns x, y and z with some observations. I want to write the contents of this data frame to a .csv file. I know I can use df.to_csv for ...
1
vote
1answer
15 views
Iterating over nested Ordered Dictionaries in Python, then Saving keys (or values) to Pandas dataframe
I am trying to iterate over nested Ordered Dictionaries in Python. I know that I can do something like this:
food = OrderedDict([('Fruits', OrderedDict([('Apple', 50), ('Banana', 100), ('Pear', ...
3
votes
2answers
24 views
pandas: extract or split char from number string
I have a dataframe selected from a sql table that looks like this
id shares_float
0 1 621.76M
1 2 329.51M
in other words,
[(1, '621.76M'), (2, '329.51M')]
I want to split the ...
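The excerpt is cut off, so the exact goal is unknown. Assuming it is to separate the numeric part of `shares_float` from its trailing letter, a regular expression with `Series.str.extract` is one option. The column names `amount` and `unit` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "shares_float": ["621.76M", "329.51M"]})

# Capture the digits (with decimal point) and the trailing letters separately.
parts = df["shares_float"].str.extract(r"^(?P<amount>[\d.]+)(?P<unit>[A-Za-z]+)$")

df["amount"] = parts["amount"].astype(float)  # hypothetical column name
df["unit"] = parts["unit"]                    # hypothetical column name
print(df)
```

Rows that do not match the pattern come back as NaN in the extracted columns, which would need handling before the `astype(float)` step on messier data.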
0
votes
0answers
8 views
dataframe.to_sql not working properly in pandas
I am trying to read data from a csv file and add it to sqlite3 with the following code:
import sqlite3
import numpy as np
import cStringIO
import pandas.io.sql as psql
from dateutil import parser
...
0
votes
0answers
20 views
Python Pandas Crosstabs
New to Pandas and fairly new to Python. I’m trying to produce a crosstabs report from the following data – just showing a few rows.
PALLET_ID AISLE NUM_PALLETS
7197033 AH 1
7197035 AC 1
7197035 AC ...
-1
votes
0answers
21 views
Python Pandas: easy way of editing data elements of a dataframe? [on hold]
I'm experimenting with Pandas as an alternative to spreadsheets.
What is the easiest way of updating lots of different non-contiguous data elements in a dataframe? (In a spreadsheet, I'd just go from ...
1
vote
0answers
20 views
Pandas read_csv skipping a row
I've got a CSV file something like this:
" ";D1;D2;D3;D4;
" ";V1;V2;V3;V4;" ";
2014-03-03 00:00:00.0;397989;18.7;18.7;18.7;
2014-03-03 00:30:00.0;398042;18.7;18.7;18.6;
2014-03-03 ...
1
vote
1answer
18 views
Map string values in a Pandas Dataframe with integers
How do I map the strings in one column of a pandas DataFrame to integers? I have around 500 strings in the DataFrame and need to replace them with integers starting with '1'.
Sample DataFrame.
...
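The sample data is elided in the excerpt, so the frame below is invented for illustration. One way to number distinct strings from 1 upward is `pd.factorize`, which assigns codes in order of first appearance; adding 1 shifts them off zero.

```python
import pandas as pd

# Hypothetical data; the question's own sample is not shown.
df = pd.DataFrame({"label": ["red", "blue", "red", "green"]})

# factorize returns zero-based integer codes plus the unique values.
codes, uniques = pd.factorize(df["label"])
df["label_id"] = codes + 1  # start numbering at 1
print(df)
```

Keeping `uniques` around makes it possible to translate the integers back to the original strings later.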
0
votes
1answer
13 views
'bz2 is module not available' when installing Pandas with pip in python virtual environment
I am going through this post Numpy, Scipy, and Pandas - Oh My!, installing some python packages, but got stuck at the line for installing Pandas:
pip install -e ...
1
vote
1answer
12 views
getting the unique values of every column in a pandas dataframe - to help me create smaller more manageable dataframes to perform metrics on
I started off wanting to turn a column from a pandas dataframe into a list, and then get the unique values, with the aim of iterating over those unique values in a for loop, and creating a few smaller ...
-1
votes
1answer
26 views
summing two columns in a pandas dataframe
When I use this syntax, it creates a Series rather than adding a column to my new DataFrame (sum). Please help.
My code:
sum = data['variance'] = data.budget + data.actual
My Data (in dataframe ...
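A note on the line above, with a small sketch: in Python a chained assignment binds every target to the same object, so `sum` ends up as the Series itself (and also shadows the built-in `sum`), while the column is added to `data`. Assigning only to the column is likely what was intended. The values below are invented, since the question's data is truncated.

```python
import pandas as pd

# Hypothetical values; the original data is not shown in full.
data = pd.DataFrame({"budget": [1000, 12000], "actual": [4000, 10000]})

# Assign the result to a new column only; no extra name on the left.
data["variance"] = data["budget"] + data["actual"]
print(data)
```

After this, `data` is the DataFrame carrying the new column, and no separate variable is needed.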
0
votes
1answer
17 views
issue plotting too many lines on curve fit with matplotlib
Not sure what I'm doing wrong, but when I try to apply polyfit to scatterplot data (year, rating), it keeps plotting a whole bunch of lines rather than one single line. It looks like this:
...
0
votes
0answers
14 views
Reduce/Flatten MultiIndex
I have a multiindexed dataframe with measurements and errors
x y
mean std mean std
time
0 190.791926 NaN ...
-1
votes
1answer
22 views
get list from pandas dataframe column
I have an excel document which looks like this..
cluster load_date budget actual fixed_price
A 1/1/2014 1000 4000 Y
A 2/1/2014 12000 10000 Y
A 3/1/2014 36000 2000 Y
...
1
vote
0answers
20 views
Copy Warning in Pandas Series
I have a column which is in datetime format and I want to change it to be date format.
db['Date'] = db['Date'].apply(lambda x: x.date())
And then I got a warning:
__main__:1: ...
-1
votes
0answers
21 views
Graph in different colour according to classification
I have a CSV. Each row corresponds to a different item. Each item is a class of either 0 or 1.
I have a column in my CSV which represents the "category" of an item. I am trying to graph this in a ...
2
votes
1answer
28 views
percentile rank in pandas in groups
I can't quite figure out how to write a function to accomplish a grouped percentile. I have all teams from years 1985-2012 in a data frame; the first 10 are shown below. It's currently sorted by year. ...
1
vote
1answer
15 views
How do I turn this JSON object into a pandas DataFrame?
I have a CSV file that I turn into JSON so that I can turn it into a pandas DataFrame.
Here is my code,
def create_png():
f = open('sticks.csv', 'r')
reader = csv.DictReader( f, ...
0
votes
1answer
23 views
matplotlib subplots with variable width/data limits
I am using pandas and matplotlib to plot data from an experiment involving 5 sessions. I would like the data for each session to be displayed in a separate panel; I am attempting to use subplotting to ...
1
vote
1answer
13 views
Building Pandas tables of where and for what values maxima are found
I have pandas data with the structure reported by info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 7058 entries, 0 to 7057
Data columns (total 16 columns):
ID 7058 non-null ...
2
votes
0answers
25 views
handling nested json in pandas
I am trying to handle nested JSON with pandas using read_json, but I am getting repeated entries as shown here:
contributors_enabled 2013-11-30 20:48:42
created_at ...
1
vote
1answer
33 views
compare if a value exists in a csv file
I have a 200 MB CSV file and a 4 GB JSON file (300 MB when compressed). Now I need to check if a particular field in the JSON has a value which matches any of the values ...
2
votes
2answers
36 views
Pandas; tricky pivot table
I have a pandas dataframe that I need to reshape/pivot. How to do it just seems beyond me at the moment. The dataframe looks like this:
Ref Statistic Val1 Val2 Val3 Val4
0 Mean 0 1 2 ...
1
vote
2answers
28 views
Pandas: Get values from column that appear more than X times
I have a data frame in pandas and would like to get all the values of a certain column that appear more than X times. I know this should be easy but somehow I am not getting anywhere with my current ...
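One common way to do this, sketched under the assumption that the column is called `col` (a made-up name) and using invented data: count each value with `value_counts`, then keep the values whose count exceeds the threshold.

```python
import pandas as pd

# Hypothetical data and column name.
df = pd.DataFrame({"col": ["a", "b", "a", "c", "a", "b"]})
X = 1  # threshold: keep values that appear more than X times

counts = df["col"].value_counts()             # value -> number of occurrences
frequent = counts[counts > X].index.tolist()  # values above the threshold
print(frequent)
```

To keep the matching rows rather than just the values, `df[df["col"].isin(frequent)]` is a natural follow-up.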
2
votes
1answer
15 views
Time format when using pandas.to_csv()
I have output from a pandas DataFrame as follows.
id value exit enter time_diff
0 1 a 2012-11-27 10:41:20 2012-11-27 10:39:00 00:02:20
1 2 a ...
2
votes
2answers
26 views
Python pandas dataframe: Find last occurrence of value less-than-or-equal-to current row
I have 2 pandas dataframes:
df1:
ksat muacres SAND SILT CLAY
0 5326 0 0 0
0.1 4346 0 0 0
0.4 4146 0 0 0
0.8 3476 0 0 ...
1
vote
2answers
30 views
Concatenate multiple similar CSV files into one big DataFrame
I have one directory where there are only the CSV files I want to use. I want to concatenate all these CSV files and create a bigger one. I've tried one code but it didn't work.
import os
import ...
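The original code is truncated, so this is not a fix for it, only a sketch of one usual pattern: list the files with `glob`, read each with `pd.read_csv`, and stack them with `pd.concat`. The temporary directory, file names, and column `x` exist only to make the example self-contained.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as folder:
    # Create two small hypothetical CSV files to stand in for the real ones.
    pd.DataFrame({"x": [1, 2]}).to_csv(os.path.join(folder, "part1.csv"), index=False)
    pd.DataFrame({"x": [3]}).to_csv(os.path.join(folder, "part2.csv"), index=False)

    # Collect every CSV in the directory, in a stable order.
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))

    # Read each file and stack the rows into one DataFrame.
    combined = pd.concat([pd.read_csv(p) for p in paths], ignore_index=True)

print(combined)
```

`ignore_index=True` renumbers the rows from 0 so that the per-file indices do not repeat in the result.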
1
vote
1answer
18 views
Pandas DataFrame groupby two columns and get first and last
I have a DataFrame like the following.
df = pd.DataFrame({'id' : [1,1,2,3,2],
'value' : ["a","b","a","a","c"], 'Time' : ['6/Nov/2012 23:59:59 -0600','6/Nov/2012 00:00:05 ...
1
vote
2answers
22 views
Appending a column in a pandas DataFrame based on a different DataFrame
I have these two DataFrames:
df1
A B
0 8 x
1 3 y
2 1 x
3 2 z
4 7 y
and
df2
A B
0 x hey
1 y there
2 z sir
I am ...
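The rest of the question is cut off. Assuming the aim is to look up each value of `df1.B` in `df2.A` and append the matching `df2.B` as a new column, `Series.map` with a lookup Series is one option. The new column name `C` is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [8, 3, 1, 2, 7], "B": ["x", "y", "x", "z", "y"]})
df2 = pd.DataFrame({"A": ["x", "y", "z"], "B": ["hey", "there", "sir"]})

# Build a lookup Series: index is df2.A, values are df2.B.
lookup = df2.set_index("A")["B"]

# Map each df1.B value through the lookup; unmatched keys would become NaN.
df1["C"] = df1["B"].map(lookup)  # "C" is a made-up column name
print(df1)
```

A `merge` on `df1.B` and `df2.A` would give a similar result and may be preferable when more than one column has to be brought across.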
-2
votes
1answer
35 views
Filter CSV File with Pandas/Python
I have a CSV file that I want to filter so that I keep only the rows where the value in column "d" is greater than 0.
File:
index value d
0 975 25.35 5
1 976 26.28 4
2 977 26.24 1
...
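A minimal sketch of the filter using a boolean mask. The CSV text is reconstructed from the rows shown above as comma-separated data, which is an assumption about the file's layout; the last row (with d equal to 0) is invented so that the filter has something to drop.

```python
import io

import pandas as pd

# First three rows follow the excerpt; the fourth is hypothetical.
csv_text = """index,value,d
975,25.35,5
976,26.28,4
977,26.24,1
978,25.90,0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Keep only the rows whose "d" value is greater than 0.
filtered = df[df["d"] > 0]
print(filtered)
```

With a real file, `pd.read_csv` would take the file path instead of the `io.StringIO` object.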
1
vote
1answer
31 views
Persistence problems when using iterrows()
As I believe someone also reported in this thread, filling in a dataframe using iterrows() can result in persistence problems. E.g. something as simple as:
my_dataframe = pd.DataFrame(np.NaN, index ...
1
vote
1answer
26 views
Getting proportion of each of one variable that is True for another in 'pandas'
I have a dataframe in pandas that includes a column 'A' and a boolean-valued column 'B' and would like to find the values of 'A' for which at least a certain number, n, of the rows have True for 'B'.
...
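One way to approach this, sketched with invented data: because True counts as 1, summing `B` within each group of `A` gives the number of True rows, and taking the mean gives the proportion mentioned in the title.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    "A": ["u", "u", "v", "v", "v", "w"],
    "B": [True, False, True, True, False, False],
})
n = 2  # minimum number of True rows required

true_counts = df.groupby("A")["B"].sum()   # number of True rows per value of A
selected = true_counts[true_counts >= n].index.tolist()

proportion = df.groupby("A")["B"].mean()   # share of True rows per value of A
print(selected)
print(proportion)
```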
0
votes
2answers
21 views
Merge two identical CSV from the same directory - Python
I have two data frames with the same structure in CSV files. I want to read both CSVs and merge them to create one bigger data frame. In the directory there are only the two data frames.
The first CSV is ...
1
vote
1answer
19 views
Matplotlib Bar Chart Choose Color if Value is Positive vs Value is Negative
I have a Pandas DataFrame with positive and negative values as a bar chart. I want to plot the positive values in 'green' and the negative values in 'red' (very original...lol). I'm not sure how to pass ...
1
vote
0answers
30 views
Strange behavior when joining dataframes by index in pandas. Can someone explain what is happening?
I'm trying to use a Dataframe to store data before I need to output it to a file and found some strange behavior when I try and add data to the Dataframe. Can someone please look at the code below ...
1
vote
1answer
23 views
Counting occurrence of a word in a column of a tsv file using python
Question from a python beginner! I have a tsv file looking like this:
WHI5 YOR083W CDC28 YBR160W physical interactions 19823668
WHI5 YOR083W CDC28 YBR160W physical interactions 21658602
...
1
vote
2answers
28 views
Using Python parser to sniff delimiter Spammed to STDOUT
When using pandas.read_csv setting sep = None for automatic delimiter detection, the message Using Python parser to sniff delimiter is printed to STDOUT. My code calls this function often so this ...
2
votes
2answers
23 views
Add column to a specific CSV row using Python pandas
I want to merge rows in csv files by matching the id with a given dictionary.
I have a dictionary:
l= {2.80215: [376570], 0.79577: [378053], 22667183: [269499]}
I have a csv file.
A ...
2
votes
1answer
27 views
Creating a matrix of joint number of hits from two columns using numpy/pandas
I have 2 large columns of data (some 1.5million values). They are structured as :
col1 = [2,2,1,4,5,4,3,4,4,4,5,2,3,1,1 ..] etc.,
col2 = [1,1,8,8,3,5,6,7,2,3,10.........] etc.,
I want to ...
-3
votes
2answers
26 views
Pandas - cumsum by month?
I have a dataframe that looks like this:
Date n
2014-02-27 4
2014-02-28 5
2014-03-01 1
2014-03-02 6
2014-03-03 7
I'm trying to get to one that looks like this
Date ...
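The desired output is truncated above. Assuming it is a running total of `n` that restarts at the beginning of each month, grouping by the month of `Date` and calling `cumsum` is one way to get it. The output column name `cumsum` is made up.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2014-02-27", "2014-02-28", "2014-03-01", "2014-03-02", "2014-03-03"]
    ),
    "n": [4, 5, 1, 6, 7],
})

# Group rows by calendar month, then take a running total within each group.
month = df["Date"].dt.to_period("M")
df["cumsum"] = df.groupby(month)["n"].cumsum()  # hypothetical column name
print(df)
```

Using a monthly period as the key keeps the same month in different years apart, which grouping by `dt.month` alone would not.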
2
votes
2answers
35 views
words frequency using pandas and matplotlib
How can I plot a word frequency histogram (for the author column) using pandas and matplotlib from a CSV file? My CSV is like: id, author, title, language
Sometimes I have more than one author in author ...
3
votes
1answer
61 views
Using pandas for loading huge json files
I have 500+ huge JSON files, each 400 MB in compressed format (3 GB uncompressed). I am using the standard json library in Python 2.7 to process the data, and the time taking ...
1
vote
1answer
40 views
Appending dict to a dataframe
new to python.
I am reading rows from the origin dataframe and trying to append it to target dataframe.
here is my program
main code to read each row of rawdata.
for i,row in raw_data.iterrows():
...
1
vote
1answer
31 views
Widening Pandas Data Frame, Similar to Pivot or Stack/Unstack
My problem is probably best explained with an example:
What I have:
ID0,ID1,Time,Data0,Data1
1 1 10 'A' 93
1 2 10 'A' 55
1 1 12 'A' 88
1 2 12 'B' 66
2 3 102 ...
0
votes
1answer
28 views
Modify DataFrame passed as argument
I have a timeseries DataFrame (df) to which I need to add a column, and then pass this df to a function that modifies the content of a time slice of a single column.
My idea is as follows:
rng = ...
1
vote
1answer
21 views
Equivalent of Series.map for DataFrame?
Using Series.map with a Series argument, I can take the elements of a Series and use them as indices into another Series. I want to do the same thing with some columns of a DataFrame, using each row ...
0
votes
1answer
27 views
Problem converting a list of pandas.Series into a numpy array of pandas.Series
I would like to convert a list of pandas.Series into a numpy array of pandas.Series. But when I call the array constructor, it also converts my Series.
>>> l = ...