#arrow

Here are 210 public repositories matching this topic...

Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.

  • Updated May 17, 2021
  • C++
arrow
tlitetrasci commented Apr 15, 2021

Issue Description

Arrow does not seem to perform validation on timestamps for unusual formats where the information conflicts.

The following code snippet runs just fine:

>>> arrow.get("2021-01-30 14:00:00 AM", "YYYY-MM-DD hh:mm:ss A")
<Arrow [2021-01-30T14:00:00+00:00]>

First of all, since hh is documented to go to a maximum value of 12, I expect arrow to raise an err
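
For comparison, Python's standard-library `datetime.strptime` rejects this kind of conflicting input. The following is a minimal sketch, assuming the `strptime` directives `%I` (12-hour clock) and `%p` (AM/PM) are a fair analogue of Arrow's `hh` and `A` tokens. The helper name `parses_as_12_hour` is hypothetical; this illustrates the validation the issue asks for and is not Arrow's implementation.

```python
from datetime import datetime


def parses_as_12_hour(text: str) -> bool:
    """Return True if `text` is a valid 12-hour timestamp, False otherwise."""
    try:
        # %I accepts only hours 01-12, so an hour of 14 cannot match.
        datetime.strptime(text, "%Y-%m-%d %I:%M:%S %p")
    except ValueError:
        return False
    return True


print(parses_as_12_hour("2021-01-30 02:00:00 PM"))  # consistent input
print(parses_as_12_hour("2021-01-30 14:00:00 AM"))  # hour conflicts with AM/PM
```

The first call succeeds and the second is rejected, because an hour of 14 cannot be combined with a 12-hour token.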

elstehle commented May 5, 2021

Describe the bug
Integer columns that are enclosed in quotes are not correctly inferred as integer columns.

Steps/Code to reproduce bug

import cudf
import pandas as pd
from io import StringIO
from cudf.tests.utils import assert_eq

buffer = '"intcol","stringcol"\n"1","some string"\n"2","some other string"'
pd_df = pd.read_csv(StringIO(buffer))
cu_df = cudf.read_csv(String
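
The behavior the report expects can be sketched with the standard library alone: strip the quotes while parsing, then infer the type from the unquoted values. This is only an illustration under stated assumptions. The helper `infer_column_types` and the dtype labels it returns are hypothetical, and this is not how cuDF or pandas implement type inference.

```python
import csv
from io import StringIO


def infer_column_types(text: str) -> dict:
    """Label each CSV column 'int64' if every value parses as an integer, else 'object'."""
    reader = csv.reader(StringIO(text))  # csv.reader removes the enclosing quotes
    header = next(reader)
    rows = list(reader)
    types = {}
    for index, name in enumerate(header):
        try:
            for row in rows:
                int(row[index])  # raises ValueError for non-integer text
            types[name] = "int64"
        except ValueError:
            types[name] = "object"
    return types


# Same buffer as in the report above.
buffer = '"intcol","stringcol"\n"1","some string"\n"2","some other string"'
print(infer_column_types(buffer))
```

With the buffer from the report, the quoted column `intcol` is labeled as an integer column and `stringcol` as a string column.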
NeroCorleone commented Aug 11, 2020

Problem description

Reading a dataset with eager's read functionality raises a ValueError when providing columns.

Example code (ideally copy-pastable)

import pandas as pd

from tempfile import TemporaryDirectory
from functools import partial
from storefact import get_store_from_url

from kartothek.io.eager import store_dataframes_as_dataset, read_dataset_as_data
