bigdata
Here are 1,486 public repositories matching this topic...
Dear all,
Since the tools on the GitHub Actions CI machines were updated, all existing PRs have been blocked by CI failures. We have fixed this in volcano-sh/volcano#1595. Please rebase onto the master branch and submit your PRs again. Thanks.
This issue tracks the implementation of the ML features: https://spark.apache.org/docs/latest/ml-features
Bucketizer has been implemented in dotnet/spark#378, but more features remain to be implemented.
- Feature Extractors
- TF-IDF
- Word2Vec (dotnet/spark#491)
- CountVectorizer (https://github.com/dotnet/spark/p
- empty
- notEmpty
- length
- lengthUTF8
- char_length, CHAR_LENGTH
- character_length, CHARACTER_LENGTH
- lower, lcase
- upper, ucase
- lowerUTF8
- upperUTF8
- isValidUTF8
- toValidUTF8
- toValidUTF8(input_string)
- repeat
- reverse
- reverseUTF8
- format(pattern, s0, s1, …)
- concat
- concatA
Hello,
Considering your amazing efficiency with pandas, NumPy, and more, it would make sense for your module to work with even bigger data, such as audio (for example, .mp3 and .wav files). This would help a lot given the nature of audio (i.e., one of the lowest and most common sampling rates is still 44,100 samples/sec). For a use case, I would consider vaex.open('Hu