1
vote
2 answers
10 views

numpy array elementwise multiply a pandas timeseries

I have these two data structures: a = np.array([1,2,3]) ts = pd.TimeSeries([1,2,3]) What I want to get at the end is: 1 2 3 2 4 6 3 6 9
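A minimal sketch of one way to get the table shown in the excerpt, assuming an outer product is what is wanted. `pd.TimeSeries` from the question is not available in current pandas, so `pd.Series` stands in for it here; the name `table` is only illustrative.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 3])
ts = pd.Series([1, 2, 3])  # stand-in for pd.TimeSeries (assumption)

# Outer product: row i holds a[i] multiplied by every element of ts.
table = pd.DataFrame(np.outer(a, ts.values))
print(table)
```

Broadcasting with `a[:, None] * ts.values` should give the same array.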
0
votes
0 answers
7 views

specific sort pandas DataFrame index

I'm trying to sort the index of a pandas DataFrame in a specific way but I'm having difficulty getting what I want. I have the following df: index = ...
0
votes
1 answer
13 views

Using a Pandas dataframe index as values for x-axis in matplotlib plot

I have time series in a Pandas DataFrame with a number of columns which I'd like to plot. Is there a way to set the x-axis to always use the index from a DataFrame? When I use the .plot() method from ...
0
votes
0 answers
15 views

How can I write a csv file with multiple header lines with pandas to_csv()?

Consider a data frame with a date column as an index and three columns x, y and z with some observations. I want to write the contents of this data frame to a .csv file. I know I can use df.to_csv for ...
1
vote
1 answer
15 views

Iterating over nested Ordered Dictionaries in Python, then Saving keys (or values) to Pandas dataframe

I am trying to iterate over nested Ordered Dictionaries in Python. I know that I can do something like this: food = OrderedDict([('Fruits', OrderedDict([('Apple', 50), ('Banana', 100), ('Pear', ...
3
votes
2 answers
24 views

pandas: extract or split char from number string

I have a dataframe selected from a sql table that looks like this id shares_float 0 1 621.76M 1 2 329.51M in other words, [(1, '621.76M'), (2, '329.51M')] I want to split the ...
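The excerpt is cut off, so the exact target layout is unknown. As a sketch, assuming the goal is a numeric column plus a separate unit-letter column, `Series.str.extract` with two capture groups can split the string. The column names `shares_value` and `shares_unit` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "shares_float": ["621.76M", "329.51M"]})

# Group 0 captures the numeric part, group 1 the trailing letters.
parts = df["shares_float"].str.extract(r"^([\d.]+)([A-Za-z]*)$")
df["shares_value"] = parts[0].astype(float)  # hypothetical column name
df["shares_unit"] = parts[1]                 # hypothetical column name
print(df)
```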
0
votes
0 answers
8 views

dataframe.to_sql not working properly in pandas

I am trying to read data from a csv file and add it to sqlite3 with the following code: import sqlite3 import numpy as np import cStringIO import pandas.io.sql as psql from dateutil import parser ...
0
votes
0 answers
20 views

Python Pandas Crosstabs

New to Pandas and fairly new to Python. I’m trying to produce a crosstabs report from the following data – just showing a few rows. PALLET_ID AISLE NUM_PALLETS 7197033 AH 1 7197035 AC 1 7197035 AC ...
-1
votes
0 answers
21 views

Python Pandas: easy way of editing data elements of a dataframe? [on hold]

I'm experimenting with Pandas as an alternative to spreadsheets. What is the easiest way of updating lots of different non-contiguous data elements in a dataframe? (In a spreadsheet, I'd just go from ...
1
vote
0 answers
20 views

Pandas read_csv skipping a row

I've got a CSV file something like this: " ";D1;D2;D3;D4; " ";V1;V2;V3;V4;" "; 2014-03-03 00:00:00.0;397989;18.7;18.7;18.7; 2014-03-03 00:30:00.0;398042;18.7;18.7;18.6; 2014-03-03 ...
1
vote
1 answer
18 views

Map string values in a Pandas Dataframe with integers

In Pandas DataFrame how to map strings in one column with integers. I have around 500 strings in the DataFrame and need to replace them with integers starting with '1'. Sample DataFrame. ...
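One possible approach, sketched on made-up data since the sample DataFrame is not shown: `pd.factorize` assigns each distinct string an integer code starting at 0, so adding 1 gives labels starting at 1. The column names `name` and `name_id` are hypothetical.

```python
import pandas as pd

# Hypothetical sample; the question's actual DataFrame is elided.
df = pd.DataFrame({"name": ["foo", "bar", "foo", "baz"]})

# factorize numbers the distinct strings in order of first appearance.
codes, uniques = pd.factorize(df["name"])
df["name_id"] = codes + 1  # shift so the labels start at 1
print(df)
```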
0
votes
1 answer
13 views

'bz2 is module not available' when installing Pandas with pip in python virtual environment

I am going through this post Numpy, Scipy, and Pandas - Oh My!, installing some python packages, but got stuck at the line for installing Pandas: pip install -e ...
1
vote
1 answer
12 views

getting the unique values of every column in a pandas dataframe - to help me create smaller more manageable dataframes to perform metrics on

I started off wanting to turn a column from a pandas dataframe into a list, and then get the unique values, with the aim of iterating over those unique values in a for loop, and creating a few smaller ...
-1
votes
1 answer
26 views

summing two columns in a pandas dataframe

when I use this syntax it creates a series rather than adding a column to my new dataframe (sum). Please help. My code: sum = data['variance'] = data.budget + data.actual My Data (in dataframe ...
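In the chained assignment from the excerpt, both `sum` and `data['variance']` receive the same Series, which would explain why `sum` is not a DataFrame. A sketch of the usual alternative, with made-up values because the question's data is truncated: assign the column first, then keep working with `data` itself.

```python
import pandas as pd

# Hypothetical values; the question's data is elided.
data = pd.DataFrame({"budget": [1000, 12000], "actual": [4000, 10000]})

# Adds a new column in place; data stays a DataFrame.
data["variance"] = data["budget"] + data["actual"]
print(data)
```

Avoiding `sum` as a variable name also keeps Python's built-in `sum` usable.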
0
votes
1 answer
17 views

issue plotting too many lines on curve fit with matplotlib

not sure what I'm doing wrong, but when I try and implement the polyfit to scatterplot data (year, rating) it keeps plotting a whole bunch of lines rather than one single line. It looks like this: ...
0
votes
0 answers
14 views

Reduce/Flatten MultiIndex

I have a multiindexed dataframe with measurements and errors x y mean std mean std time 0 190.791926 NaN ...
-1
votes
1 answer
22 views

get list from pandas dataframe column

I have an excel document which looks like this.. cluster load_date budget actual fixed_price A 1/1/2014 1000 4000 Y A 2/1/2014 12000 10000 Y A 3/1/2014 36000 2000 Y ...
1
vote
0 answers
20 views

Copy Warning in Pandas Series

I have a column which is in datetime format and I want to change it to be date format. db['Date'] = db['Date'].apply(lambda x: x.date()) And then I got a warning: __main__:1: ...
-1
votes
0 answers
21 views

Graph in different colour according to classification

I have a CSV. Each row corresponds to a different item. Each item is a class of either 0 or 1. I have a column in my CSV which represent the "category" of an item. I am trying to graph this in a ...
2
votes
1 answer
28 views

percentile rank in pandas in groups

I can't quite figure out how to write function to accomplish a grouped percentile. I have all teams from years 1985-2012 in a data frame; the first 10 are shown below: it's currently sorted by year. ...
1
vote
1 answer
15 views

How do I turn this json object into a pandas dataframe?

I have a csv file that I turn into a json so that I can turn it into a pandas dataframe. Here is my code, def create_png(): f = open('sticks.csv', 'r') reader = csv.DictReader( f, ...
0
votes
1 answer
23 views

matplotlib subplots with variable width/data limits

I am using pandas and matplotlib to plot data from an experiment involving 5 sessions. I would like the data for each session to be displayed in a separate panel; I am attempting to use subplotting to ...
1
vote
1 answer
13 views

Building Pandas tables of where and for what values maxima are found

I have pandas data with the structure reported by info() <class 'pandas.core.frame.DataFrame'> Int64Index: 7058 entries, 0 to 7057 Data columns (total 16 columns): ID 7058 non-null ...
2
votes
0 answers
25 views

handling nested json in pandas

I am trying to handle nested json with pandas using read_json, but I am getting repeated entries like shown here: contributors_enabled 2013-11-30 20:48:42 created_at ...
1
vote
1 answer
33 views

compare if a value exists in a csv file

I have a 200 MB CSV file and a 4 GB json file (300 MB when compressed). Now I need to check if a particular field in json has a value which matches with any of the values ...
2
votes
2 answers
36 views

Pandas; tricky pivot table

I have a pandas dataframe that I need to reshape/pivot. How to do it just seems beyond me at the moment. The dataframe looks like this: Ref Statistic Val1 Val2 Val3 Val4 0 Mean 0 1 2 ...
1
vote
2 answers
28 views

Pandas: Get values from column that appear more than X times

I have a data frame in pandas and would like to get all the values of a certain column that appear more than X times. I know this should be easy but somehow I am not getting anywhere with my current ...
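A minimal sketch using `value_counts`, on made-up data; the column name `col` and the threshold `x` are placeholders for whatever the question actually uses.

```python
import pandas as pd

# Hypothetical data and threshold.
df = pd.DataFrame({"col": ["a", "b", "a", "c", "a", "b"]})
x = 1

# Count occurrences per value, then keep those seen more than x times.
counts = df["col"].value_counts()
frequent = counts[counts > x].index.tolist()
print(frequent)
```

To keep the matching rows rather than the values, `df[df["col"].isin(frequent)]` would be one option.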
2
votes
1 answer
15 views

Time format when using pandas.to_csv()

I have an output from a Pandas DataFrame as follows. id value exit enter time_diff 0 1 a 2012-11-27 10:41:20 2012-11-27 10:39:00 00:02:20 1 2 a ...
2
votes
2 answers
26 views

Python pandas dataframe: Find last occurrence of value less-than-or-equal-to current row

I have 2 pandas dataframes: df1: ksat muacres SAND SILT CLAY 0 5326 0 0 0 0.1 4346 0 0 0 0.4 4146 0 0 0 0.8 3476 0 0 ...
1
vote
2 answers
30 views

Concatenate multiple similar CSV files into one big dataframe

I have one directory where there are only the CSV files I want to use. I want to concatenate all these CSV files and create a bigger one. I tried some code but it didn't work. import os import ...
1
vote
1 answer
18 views

Pandas DataFrame groupby two columns and get first and last

I have a DataFrame like the following. df = pd.DataFrame({'id' : [1,1,2,3,2], 'value' : ["a","b","a","a","c"], 'Time' : ['6/Nov/2012 23:59:59 -0600','6/Nov/2012 00:00:05 ...
1
vote
2 answers
22 views

Appending a column in a pandas DataFrame based on a different DataFrame

I have these two DataFrames: df1 A B 0 8 x 1 3 y 2 1 x 3 2 z 4 7 y and df2 A B 0 x hey 1 y there 2 z sir I am ...
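The excerpt stops before stating the desired result. As a sketch, assuming the intent is to look up each value of `df1['B']` in `df2['A']` and append the matching `df2['B']` as a new column, `Series.map` with an indexed Series would do it. The new column name `C` is hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [8, 3, 1, 2, 7], "B": ["x", "y", "x", "z", "y"]})
df2 = pd.DataFrame({"A": ["x", "y", "z"], "B": ["hey", "there", "sir"]})

# Turn df2 into a lookup Series (index: df2.A, values: df2.B),
# then map each df1.B value through it.
lookup = df2.set_index("A")["B"]
df1["C"] = df1["B"].map(lookup)  # "C" is an assumed column name
print(df1)
```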
-2
votes
1 answer
35 views

Filter CSV File with Pandas/Python

I have a CSV file that I want to filter so that I keep only the rows where the value in column "d" is greater than 0. File: index value d 0 975 25.35 5 1 976 26.28 4 2 977 26.24 1 ...
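A sketch of the filter using a boolean mask. The file's real delimiter and remaining rows are not shown, so the comma-separated text below (including the extra row with `d` equal to 0) is an assumption made for illustration.

```python
import io
import pandas as pd

# Hypothetical file contents; the last row is invented to show the filter.
csv_text = "index,value,d\n975,25.35,5\n976,26.28,4\n977,26.24,1\n978,26.10,0\n"
df = pd.read_csv(io.StringIO(csv_text))

# Keep only the rows where column d is strictly positive.
filtered = df[df["d"] > 0]
print(filtered)
```

With a real file, passing its path to `pd.read_csv` in place of the `StringIO` object should work the same way.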
1
vote
1 answer
31 views

Persistence problems when using iterrows()

As I believe someone also reported in this thread, filling in a dataframe using iterrows() can result in persistence problems. E.g. something as simple as: my_dataframe = pd.DataFrame(np.NaN, index ...
1
vote
1 answer
26 views

Getting proportion of each of one variable that is True for another in 'pandas'

I have a dataframe in pandas that includes a column 'A' and a boolean-valued column 'B' and would like to find the values of 'A' for which at least a certain number, n, of the rows have True for 'B'. ...
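One way this could be sketched, on made-up data: group by 'A', sum the boolean column 'B' to count the True rows in each group, and keep the groups that reach `n`. The sample values and the threshold are assumptions.

```python
import pandas as pd

# Hypothetical data; n is the minimum number of True rows required.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y", "y", "z"],
    "B": [True, False, True, True, False, False],
})
n = 2

# Summing booleans counts the True rows in each group.
true_counts = df.groupby("A")["B"].sum()
selected = true_counts[true_counts >= n].index.tolist()
print(selected)
```

For the proportion mentioned in the title, `df.groupby("A")["B"].mean()` gives the share of True rows per value of 'A'.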
0
votes
2 answers
21 views

Merge two identical CSV from the same directory - Python

I have two data frames with the same structure in a CSV. I want to read both CSV and merge them to create one bigger data frame. In the directory there are only the two data frames. The first CSV is ...
1
vote
1 answer
19 views

Matplotlib Bar Chart Choose Color if Value is Positive vs Value is Negative

I have a Pandas DataFrame with positive and negative values as a bar chart. I want to plot the positive values 'green' and the negative values 'red'(very original...lol). I'm not sure how to pass ...
1
vote
0 answers
30 views

Strange behavior when joining dataframes by index in pandas. Can someone explain what is happening?

I'm trying to use a Dataframe to store data before I need to output it to a file and found some strange behavior when I try and add data to the Dataframe. Can someone please look at the code below ...
1
vote
1 answer
23 views

Counting occurrence of a word in a column of a tsv file using python

Question from a python beginner! I have a tsv file looking like this: WHI5 YOR083W CDC28 YBR160W physical interactions 19823668 WHI5 YOR083W CDC28 YBR160W physical interactions 21658602 ...
1
vote
2 answers
28 views

Using Python parser to sniff delimiter Spammed to STDOUT

When using pandas.read_csv setting sep = None for automatic delimiter detection, the message Using Python parser to sniff delimiter is printed to STDOUT. My code calls this function often so this ...
2
votes
2 answers
23 views

Add column to a specific CSV row using Python pandas

I want to merge rows in csv files by matching the id with a given dictionary. I have a dictionary: l= {2.80215: [376570], 0.79577: [378053], 22667183: [269499]} I have a csv file. A ...
2
votes
1 answer
27 views

Creating a matrix of joint number of hits from two columns using numpy/pandas

I have 2 large columns of data (some 1.5million values). They are structured as : col1 = [2,2,1,4,5,4,3,4,4,4,5,2,3,1,1 ..] etc., col2 = [1,1,8,8,3,5,6,7,2,3,10.........] etc., I want to ...
-3
votes
2 answers
26 views

Pandas - cumsum by month?

I have a dataframe that looks like this: Date n 2014-02-27 4 2014-02-28 5 2014-03-01 1 2014-03-02 6 2014-03-03 7 I'm trying to get to one that looks like this Date ...
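The desired output is cut off in the excerpt. As a sketch, assuming the running total should restart at the start of each calendar month, grouping by a monthly period and calling `cumsum` is one option. The output column name `cumsum` is made up.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2014-02-27", "2014-02-28", "2014-03-01", "2014-03-02", "2014-03-03"]
    ),
    "n": [4, 5, 1, 6, 7],
})

# Group rows by calendar month and accumulate n within each month.
month = df["Date"].dt.to_period("M")
df["cumsum"] = df.groupby(month)["n"].cumsum()  # assumed column name
print(df)
```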
2
votes
2 answers
35 views

words frequency using pandas and matplotlib

How can I plot word frequency histogram (for author column)using pandas and matplotlib from a csv file? My csv is like: id, author, title, language Sometimes I have more than one authors in author ...
3
votes
1 answer
61 views

Using pandas for loading huge json files

I have 500+ huge json files, each 400 MB in compressed format (3 GB when uncompressed). I am using the standard json library in python 2.7 to process the data, and the time taking ...
1
vote
1 answer
40 views

Appending dict to a dataframe

New to python. I am reading rows from the origin dataframe and trying to append them to a target dataframe. Here is my program's main code to read each row of rawdata. for i,row in raw_data.iterrows(): ...
1
vote
1 answer
31 views

Widening Pandas Data Frame, Similar to Pivot or Stack/Unstack

My problem is probably best explained with an example: What I have: ID0,ID1,Time,Data0,Data1 1 1 10 'A' 93 1 2 10 'A' 55 1 1 12 'A' 88 1 2 12 'B' 66 2 3 102 ...
0
votes
1 answer
28 views

Modify DataFrame passed as argument

I have a timeseries DataFrame (df) to which I need to add a column, and then pass this df to a function that modifies the content of a time slice of a single column. My idea is as follows: rng = ...
1
vote
1 answer
21 views

Equivalent of Series.map for DataFrame?

Using Series.map with a Series argument, I can take the elements of a Series and use them as indices into another Series. I want to do the same thing with some columns of a DataFrame, using each row ...
0
votes
1 answer
27 views

Problem converting a list of pandas.Series into a numpy array of pandas.Series

I would like to convert a list of pandas.Series into a numpy array of pandas.Series. But when I call the array constructor, it also converts my Series. >>> l = ...
