data-engineering
Here are 716 public repositories matching this topic...
Updated Mar 9, 2021
Updated Feb 19, 2021
Description
I have set up a custom, remote Prefect server.
However, when registering a flow, the Flow URL shows only localhost:
$ prefect register flow --file ./myflow.py -p sandbox
Result check: OK
Flow URL: http://localhost:8080/default/flow/9235a237-f6bc-41c7-89bc-132db233b49e
└── ID: a09a47b0-1292-412f-bd70-89c8bf4dcf1e
Describe the bug
Running the scaffolding (profiling) command fails when columns contain commas.
To Reproduce
Steps to reproduce the behavior:
- Run great_expectations suite scaffold scaffold-name on a datasource where commas are in a column
- Bug
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 5323 saw 2
Expected behavior
D
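The parser error above is the usual symptom of a delimiter appearing inside an unquoted field. The following is a minimal sketch of that failure mode using only Python's standard `csv` module; the sample rows are made up for illustration, and this is not the Great Expectations or pandas code path itself.

```python
import csv
import io

# Illustrative single-column data; the last value contains a comma.
rows = [["comment"], ["looks fine"], ["late, but delivered"]]

# Naive serialization: the embedded comma becomes indistinguishable
# from a field delimiter, so the last line parses as two fields.
naive_text = "\n".join(",".join(row) for row in rows)
naive_counts = [len(r) for r in csv.reader(io.StringIO(naive_text))]

# csv.writer quotes any field that contains the delimiter
# (QUOTE_MINIMAL is the default), so the row stays one field wide.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
quoted_rows = list(csv.reader(io.StringIO(buffer.getvalue(), newline="")))
quoted_counts = [len(r) for r in quoted_rows]

print(naive_counts)   # field count per line without quoting
print(quoted_counts)  # field count per line with quoting
```

In the unquoted case the third line has one field more than the header, which is the same shape of mismatch the error message reports.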
Updated Mar 10, 2021 - Go
Updated Jan 13, 2021
Updated Mar 9, 2021 - Python
Updated Mar 8, 2021 - JavaScript
Updated Mar 11, 2021 - Jupyter Notebook
Updated Mar 8, 2021 - Jupyter Notebook
Enable delete repository action from the UI
Updated Mar 9, 2020 - Python
Problem description
When I use the function for concatenating multiple columns, it does not handle null values as expected.
This is the current output:
df.concatenate_columns(["cat_1", "cat_2", "cat_3"], "cat", sep=",")
| cat_1 | cat_2 |
|---|
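The report is cut off before it states the expected output, so the following is only a sketch of one plausible null-handling rule: skip null values when joining. The helper `concat_skip_nulls` is a hypothetical name, works on plain dictionaries, and is not the library's implementation.

```python
def concat_skip_nulls(rows, column_names, new_column_name, sep="-"):
    """Return copies of the rows with the non-null values of
    column_names joined by sep under new_column_name.

    Assumption: "null" means None here; NaN handling is out of scope.
    """
    result = []
    for row in rows:
        parts = [str(row[name]) for name in column_names
                 if row.get(name) is not None]
        result.append({**row, new_column_name: sep.join(parts)})
    return result


# Illustrative data, not taken from the original report.
sample = [
    {"cat_1": "a", "cat_2": None, "cat_3": "c"},
    {"cat_1": None, "cat_2": None, "cat_3": None},
]
joined = concat_skip_nulls(sample, ["cat_1", "cat_2", "cat_3"], "cat", sep=",")
print([row["cat"] for row in joined])
```

Skipping nulls avoids doubled or dangling separators; whether an all-null row should yield an empty string or a null is a separate design decision this sketch does not settle.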
If they are not class methods, the method is invoked for every test, and a session is created for each of those tests.
class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
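The per-class versus per-test distinction can be shown without Spark. This is a minimal sketch using only `unittest`; `FakeSession` is a hypothetical stand-in for an expensive resource such as a Spark session, and `run_and_count` is an illustrative helper, not part of the original snippet.

```python
import unittest


class FakeSession:
    """Hypothetical stand-in for an expensive resource (e.g. a Spark session)."""
    created = 0

    def __init__(self):
        FakeSession.created += 1


class SharedSessionTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class: a single session is shared.
        cls.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


class PerTestSessionTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method: one session per test.
        self.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


def run_and_count(test_case):
    """Run a TestCase class; return (tests run, sessions created)."""
    FakeSession.created = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, FakeSession.created
```

Calling `run_and_count` on each class shows how many sessions were built for the same two tests.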
Updated Mar 9, 2021 - R
Updated Mar 4, 2021
Updated Mar 5, 2020 - Python
Updated Feb 16, 2021 - Ruby
Updated Feb 13, 2021
Updated Feb 7, 2021 - CSS
Updated Mar 9, 2021 - TypeScript
Egeria's open metadata labs use Python notebooks to drive sequences of REST API calls to Egeria's runtime platform, called the OMAG Server Platform. One function, printAssetUniverse, needs work. It is designed to give a data scientist detailed information about an Asset (such as a file or a database). This includes name, description, its location, content,
Updated Nov 29, 2018 - Java
Updated Feb 22, 2021
Updated Apr 20, 2020 - Python
Updated Mar 7, 2021
Updated Feb 23, 2021 - Python
SQLite does not provide helpful error messages when the parent folder does not exist
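A minimal sketch of one way to surface a clearer message, using Python's standard `sqlite3` module: check the parent folder before connecting. `connect_checked` is a hypothetical helper name and not the fix adopted by the project in question.

```python
import sqlite3
from pathlib import Path


def connect_checked(db_path):
    """Open a SQLite database, failing early with a descriptive
    error when the parent folder is missing.

    Without the check, sqlite3.connect raises a generic
    OperationalError ("unable to open database file").
    """
    parent = Path(db_path).resolve().parent
    if not parent.is_dir():
        raise FileNotFoundError(f"Parent folder does not exist: {parent}")
    return sqlite3.connect(db_path)
```

Raising `FileNotFoundError` with the resolved folder path tells the caller exactly what to create, which the generic SQLite error does not.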
Updated Mar 11, 2021 - Python
Screenshot
Description
Menu items in superset show a