data-engineering

Expected results

The cursor will move to the left

Actual results

New Tab is created

How to reproduce the bug

Go to

Created by Dylan Hughes via monday.com integration. 🎉

Describe the bug
Using a data source with umlauts in the column names causes the Jupyter Notebook used to edit the suite to throw an error on startup. The Notebook then doesn't load.
This might be a Jupyter Notebook bug; I'm not sure.

To Reproduce
Steps to reproduce the behavior:

Initialize a suite with this xlsx file as a data source (nothing fancy: Just two columns, the

Problem description

When I use the function for concatenating multiple columns, it does not handle null values as expected.

This is the current output

df.concatenate_columns(["cat_1","cat_2","cat_3"],"cat",sep=",")

	cat_1	cat_2
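The excerpt is cut off before it states the expected output, so the following is only a sketch of one possible null-tolerant behaviour: skipping missing values when joining, so that no stray separators or "nan" strings appear. It uses plain Python rather than the library's own code; `join_non_null` and the sample rows are hypothetical names and data introduced for illustration.

```python
import math


def join_non_null(values, sep=","):
    """Join values with sep, skipping None and float NaN entries.

    Hypothetical helper: an assumption about what "handling nulls"
    could mean, not the library's implementation.
    """
    kept = []
    for value in values:
        if value is None:
            continue
        if isinstance(value, float) and math.isnan(value):
            continue
        kept.append(str(value))
    return sep.join(kept)


# Made-up sample rows using the column names from the call above.
rows = [
    {"cat_1": "a", "cat_2": "b", "cat_3": "c"},
    {"cat_1": "a", "cat_2": None, "cat_3": "c"},
    {"cat_1": float("nan"), "cat_2": None, "cat_3": None},
]
columns = ["cat_1", "cat_2", "cat_3"]

# Build the combined "cat" value for each row.
combined = [join_non_null([row[c] for c in columns]) for row in rows]
print(combined)
```

A row with no usable values yields an empty string here; whether that or a null result is preferable is a design choice the excerpt does not settle.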

If they are not class methods, the method is invoked for every test and a session is created for each of those tests.

class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
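The difference between class-level and per-test setup can be seen without Spark at all. The sketch below uses only the standard `unittest` module; `FakeSession` is a hypothetical stand-in for an expensive resource such as a Spark session, and `run_and_count` is a helper invented for this illustration.

```python
import io
import unittest


class FakeSession:
    """Hypothetical stand-in for an expensive resource (e.g. a Spark session)."""

    created = 0  # counts how many sessions have been constructed

    def __init__(self):
        FakeSession.created += 1


class SharedSessionTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, so all tests share one session.
        cls.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


class PerTestSessionTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test builds its own session.
        self.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


def run_and_count(test_case):
    """Run one TestCase class quietly and return how many sessions it created."""
    FakeSession.created = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return FakeSession.created


shared_count = run_and_count(SharedSessionTest)
per_test_count = run_and_count(PerTestSessionTest)
print(shared_count, per_test_count)
```

With two test methods in each class, the class-level variant constructs one session and the per-test variant constructs two, which is the cost the comment above is warning about.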

SubjectAreaRESTServicesInstance hard-codes the default page size as 0, which is not correct:

public static final String PAGE_SIZE_DEFAULT_VALUE = "0";
It should be changed to
public static final String PAGE_SIZE_DEFAULT_VALUE = "1000";

so that it is consistent with the OMAGServerConfig default:
private static final int defaultMaxPageSize = 1000;

Pivot missing categories breaks FeatureSet/AggregatedFeatureSet

Summary

When defining a feature set, it is expected that the pivot will produce all categories and, as a consequence, that the resulting Source dataframe will be suitable for transformation. When that is not the case, FeatureSet and AggregatedFeatureSet break.
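The failure mode described above, a pivot that emits only the categories present in the data, can be illustrated independently of any dataframe library. This is a minimal plain-Python sketch, not the project's code: `pivot_with_categories`, the record layout, and the category names are all assumptions made for the example.

```python
def pivot_with_categories(records, categories, fill_value=None):
    """Pivot (key, category, value) records into one row per key.

    Every row gets a column for each expected category, even when the
    data contains no record for it. Categories outside the expected
    list are ignored.
    """
    pivoted = {}
    for key, category, value in records:
        # Create the row with all expected columns on first sight of the key.
        row = pivoted.setdefault(key, {c: fill_value for c in categories})
        if category in row:
            row[category] = value
    return pivoted


# Made-up input: no record carries the "cash" category.
records = [(1, "credit", 10.0), (1, "debit", 4.0), (2, "credit", 7.5)]
expected_categories = ["credit", "debit", "cash"]

table = pivot_with_categories(records, expected_categories)
for key, row in table.items():
    print(key, row)
```

Because the column set comes from the declared list rather than from the data, downstream transformations see the same schema regardless of which categories happen to appear in a given batch.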

Feature related:

Age: legacy

Here are 550 public repositories matching this topic...

apache / incubator-superset

PrefectHQ / prefect

eugeneyan / applied-ml

great-expectations / great_expectations

adilkhash / Data-Engineering-HowTo

kantord / just-dashboard

awslabs / aws-data-wrangler

quiltdata / quilt

GoogleCloudPlatform / data-science-on-gcp

san089 / goodreads_etl_pipeline

ericmjl / pyjanitor

AlexIoannides / pyspark-example-project

kevintpeng / Learn-Something-Every-Day

rich-iannone / pointblank

san089 / Udacity-Data-Engineering-Projects

Cascading / cascading

dataform-co / dataform

odpi / egeria

alexklibisz / elastik-nearest-neighbors

gunnarmorling / awesome-opensource-data-engineering

sderosiaux / every-single-day-i-tldr

aiguofer / gspread-pandas

LGE-ARC-AdvancedAI / auptimizer

eBay / accelerator

Leverege / gcp-data-engineer-exam

Flor91 / Data-engineering-nanodegree

d6t / d6t-python

swoop-inc / spark-alchemy

Minyus / pipelinex

quintoandar / butterfree
