dataframe

Describe the problem

We should test on larger datasets that are commonly used in

Hi,
I am trying to load a CSV with no header using

df = vaex.open('data/star0000-1.csv', sep=",", header=None, error_bad_lines=False)

but I get

could not convert column 0, error: TypeError('getattr(): attribute name must be string'), will try to convert it to string
Giving up column 0, error: TypeError('getattr(): attribute name must be string')
could not convert column

Hi again,
a second issue I ran into is related to the user guide:
The binning example under "Grouping on calculated columns" doesn't compile in v0.37.3 and, apart from that, doesn't lead to a reasonable result as far as I can see.

Compilation isn't possible, as bin() returns a DoubleColumn (even if c

Implement Series.reindex.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.reindex.html

Support the error function and Fresnel integrals listed at https://docs.scipy.org/doc/scipy/reference/special.html#error-function-and-fresnel-integrals. Those that are not universal functions may not need to be supported.

The documentation file appears to have been generated with no space between the hashes and the header text. As a result, the headers do not display correctly and the file is difficult to read. See below for an example with and without the space:

##

Mobius API Documentation

###Microsoft.Spark.CSharp.Core.Accumulator</
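One way to repair such headers after generation is a small post-processing pass. The sketch below is only an illustration, not part of the Mobius tooling: the function name `add_heading_space` is hypothetical, and it assumes the file contains ATX-style headings with one to six leading hashes.

```python
import re

# One to six leading hashes followed directly by a character that is
# neither whitespace nor another hash, e.g. "###Title" but not "### Title".
_HEADING = re.compile(r"^(#{1,6})(?=[^\s#])", flags=re.MULTILINE)


def add_heading_space(markdown: str) -> str:
    """Insert a single space between the leading hashes and the heading text."""
    return _HEADING.sub(r"\1 ", markdown)


fixed = add_heading_space("###Microsoft.Spark.CSharp.Core.Accumulator")
```

Note that the pattern is applied to every line, so lines inside code samples that start with a hash (for example a shell comment written as `#comment`) would be changed too. A real fix would need to skip code blocks.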

Hi, would it be possible to display the user warnings only when using pipes that actually depend on these imports? Or at least to display them in a way that allows filtering them out (perhaps with the logging package)?

It's just a minor flaw in an otherwise great package. Awesome work!
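A possible shape for the filtering request above, using only the standard `warnings` module: the library emits its notices under a dedicated warning category, and users who do not need those pipes filter that category out. The class name `OptionalDependencyWarning`, the helper `warn_missing`, and the package name are all hypothetical; nothing here reflects how the package currently works.

```python
import warnings


class OptionalDependencyWarning(UserWarning):
    """Hypothetical category for 'optional package not installed' notices."""


def warn_missing(package: str) -> None:
    # Library side: tag the notice with the dedicated category.
    warnings.warn(
        f"{package} is not installed; stages that need it are unavailable",
        OptionalDependencyWarning,
        stacklevel=2,
    )


def messages_after_filtering() -> list:
    # User side: silence only the dedicated category, keep everything else.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warnings.filterwarnings("ignore", category=OptionalDependencyWarning)
        warn_missing("some_optional_package")  # dropped by the filter above
        warnings.warn("unrelated warning")     # still recorded
    return [str(w.message) for w in caught]
```

If routing through the logging package is preferred, the standard library's `logging.captureWarnings(True)` redirects warnings to the `py.warnings` logger, where ordinary logging filters apply.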

janitor.biology could do with a to_fasta function, I think. The intent here would be to conveniently export a dataframe of sequences as a FASTA file, using one column as the fasta header.

strawman implementation below:

import pandas_flavor as pf
from Bio.SeqRecord import SeqRecord
from Bio.Seq import Seq
from Bio import SeqIO

@pf.register_dataframe_method
def to_fasta(d
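As a rough illustration of the intended output format only, here is a dependency-free sketch. It is not the strawman's Biopython-based approach and not a proposed janitor API: the function name `write_fasta`, its parameters, and the use of plain dictionaries in place of dataframe rows are all assumptions made for the example.

```python
import io


def write_fasta(rows, header_key, sequence_key, handle, line_width=60):
    """Write each row as a FASTA record: a '>' header line, then wrapped sequence lines."""
    for row in rows:
        handle.write(f">{row[header_key]}\n")
        sequence = str(row[sequence_key])
        # Wrap long sequences at a fixed width, as FASTA files commonly do.
        for start in range(0, len(sequence), line_width):
            handle.write(sequence[start:start + line_width] + "\n")


# Hypothetical records standing in for dataframe rows.
rows = [
    {"name": "seq1", "sequence": "ACGT"},
    {"name": "seq2", "sequence": "GGCC"},
]
buffer = io.StringIO()
write_fasta(rows, "name", "sequence", buffer)
fasta_text = buffer.getvalue()
```

With a pandas dataframe, the same function could presumably be fed from `df.to_dict("records")`, with one column chosen as the header as the proposal describes.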

@danielgwilson

Any plans to get this into DefinitelyTyped?

Originally posted by @danielgwilson in Gmousse/dataframe-js#43 (comment)

* fix column header issues in preview
* handle arbitrary whitespace

Hello,

I haven't tested append() yet, and I was wondering whether duplicates are removed when an append is performed.
I had a look at the collection.py script, and the following pandas function is used:
combined = dd.concat([current.data, new]).drop_duplicates(keep="last")

After looking at the pandas documentation, I understand that duplicate rows are removed and only the last occurrence is kept.
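The keep-last behaviour described above can be sketched in plain Python. This is only a model of the semantics, not the library's code: `drop_duplicates_keep_last` is a hypothetical helper, each tuple stands for the full set of column values of one row, and the sample data is invented.

```python
def drop_duplicates_keep_last(rows):
    """Remove exact duplicate rows, keeping each row at its last position."""
    seen = set()
    kept_reversed = []
    # Walk backwards so the first time a row is seen is its last occurrence.
    for row in reversed(rows):
        if row not in seen:
            seen.add(row)
            kept_reversed.append(row)
    return kept_reversed[::-1]


# Invented example: the appended data overlaps the stored data by one row.
current = [("2020-01-01", 1.0), ("2020-01-02", 2.0)]
new = [("2020-01-02", 2.0), ("2020-01-03", 3.0)]
combined = drop_duplicates_keep_last(current + new)
```

Here `combined` holds three rows, because the overlapping row appears only once. A row with the same date but a different value would not count as a duplicate under this whole-row comparison, so both versions would be kept.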

In order to update https://bluenote10.github.io/NimData/nimdata.html I tried running build_docs.sh, but ran into the following Nim doc gen issues:

The following command is somewhat working, besides the missing dochack.js and with the `git.c

To improve spotting differences between datasets visually (especially when there are many columns), it would be helpful if one could sort the categorical columns by the Jensen–Shannon divergence.

The code below tries to do so but it seems to distort the labels on the y-axis. Also, in case the jsd column contains missing values, those variables are deleted from the graph.

library(in
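For reference, the ordering asked for above can be sketched independently of any plotting library. The following Python is a minimal illustration of the Jensen–Shannon divergence (base-2 logarithm, so values lie between 0 and 1) and of ranking columns by it. The function names and the `{column: list of values}` table layout are assumptions made for the example; the sketch does not handle missing values or empty columns.

```python
import math
from collections import Counter


def level_frequencies(values):
    """Relative frequency of each categorical level in a non-empty list."""
    counts = Counter(values)
    total = sum(counts.values())
    return {level: count / total for level, count in counts.items()}


def jensen_shannon_divergence(p, q):
    """JSD in bits between two distributions given as {level: probability}."""
    divergence = 0.0
    for level in set(p) | set(q):
        p_i, q_i = p.get(level, 0.0), q.get(level, 0.0)
        m_i = 0.5 * (p_i + q_i)  # mixture distribution M = (P + Q) / 2
        if p_i > 0:
            divergence += 0.5 * p_i * math.log2(p_i / m_i)
        if q_i > 0:
            divergence += 0.5 * q_i * math.log2(q_i / m_i)
    return divergence


def rank_columns_by_jsd(table_a, table_b):
    """Shared column names, ordered from most to least divergent."""
    scores = {
        column: jensen_shannon_divergence(
            level_frequencies(table_a[column]), level_frequencies(table_b[column])
        )
        for column in table_a
        if column in table_b
    }
    return sorted(scores, key=scores.get, reverse=True)
```

Columns whose level frequencies are identical in both datasets score 0, and columns with no levels in common score 1, so the most different columns come first in the returned list.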

https://docs.python.org/3/library/sys.html#sys.getsizeof

Here are 323 public repositories matching this topic...

modin-project / modin

haifengl / smile

vaexio / vaex

jtablesaw / tablesaw

databricks / koalas

mars-project / mars

microsoft / Mobius

RedisLabs / spark-redis

andygrove / datafusion

Squarespace / datasheets

hosseinmoein / DataFrame

pdpipe / pdpipe

ballista-compute / ballista

ericmjl / pyjanitor

MrPowers / spark-daria

sngyai / Sequoia

Gmousse / dataframe-js

dmnfarrell / pandastable

firmai / pandasvault

lifeomic / sparkflow

rocketlaunchr / dataframe-go

ranaroussi / pystore

shramos / Awesome-Cybersecurity-Datasets

tobgu / qframe

DeepSpace2 / StyleFrame

nevi-me / rust-dataframe

bluenote10 / NimData

zavtech / morpheus-core

alastairrushworth / inspectdf

InvestmentSystems / static-frame
