streaming-data

The list is not sorted at all

Problem description

I am getting the following error when reading a file from an S3 bucket:

Invalid bucket name "xxxx:yyyy@bucket": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:s3:[a-z\-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]acce
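
The rejected bucket name appears to carry credentials (`xxxx:yyyy@`) in front of the actual bucket. The following is a minimal sketch, assuming the original URL had the form `s3://key:secret@bucket/path`. The helper names `split_s3_url` and `is_valid_bucket_name` and the example object key are hypothetical and not part of any library. It separates the user-info part from the bucket and checks names against the first regex quoted in the error message.

```python
import re
from urllib.parse import urlsplit

# First pattern quoted in the error message above.
BUCKET_RE = re.compile(r"^[a-zA-Z0-9.\-_]{1,255}$")


def split_s3_url(url):
    """Split a hypothetical s3://key:secret@bucket/key URL into its parts."""
    parts = urlsplit(url)
    if parts.scheme != "s3":
        raise ValueError(f"not an s3 URL: {url!r}")
    # Everything before the last '@' in the netloc is treated as credentials.
    userinfo, _, bucket = parts.netloc.rpartition("@")
    access_key, _, secret_key = userinfo.partition(":")
    return {
        "bucket": bucket,
        "key": parts.path.lstrip("/"),
        "access_key": access_key or None,
        "secret_key": secret_key or None,
    }


def is_valid_bucket_name(name):
    """Check a name against the plain bucket-name regex from the error."""
    return BUCKET_RE.match(name) is not None


# The object key "some/file.txt" is made up for illustration.
info = split_s3_url("s3://xxxx:yyyy@bucket/some/file.txt")
```

With the parts separated, the bare bucket name passes the check while the combined string from the error does not, which suggests passing credentials separately from the bucket name.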

Implement progressive versions of hopping and tumbling windows:

Both window macro methods should get additional versions that take an extra parameter.
The parameter should represent the time interval at which intermediate aggregation results are produced.
The parameter should evenly divide the tumble size for tumbling windows and the hop size for hopping windows.
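
To make the request concrete, here is a rough Python sketch of a progressive tumbling-window sum. It is not the library's API: the function name `progressive_tumbling_sum`, the integer timestamps, and the choice of a sum aggregate are assumptions for illustration only. It enforces the divisor constraint and emits a cumulative partial result at the end of every progress interval inside each window.

```python
from collections import defaultdict


def progressive_tumbling_sum(events, tumble_size, progress_interval):
    """Return (window_start, emit_time, partial_sum) tuples.

    events: iterable of (timestamp, value) pairs with integer timestamps.
    """
    if progress_interval <= 0 or tumble_size % progress_interval != 0:
        raise ValueError("progress_interval must evenly divide tumble_size")

    # Pre-aggregate values into sub-intervals of length progress_interval.
    buckets = defaultdict(int)
    for ts, value in events:
        buckets[ts - ts % progress_interval] += value

    results = []
    # Only windows that contain at least one event are reported.
    window_starts = sorted({b - b % tumble_size for b in buckets})
    for window_start in window_starts:
        running = 0
        window_end = window_start + tumble_size
        for sub_start in range(window_start, window_end, progress_interval):
            running += buckets.get(sub_start, 0)
            # The last tuple of each window is its final result.
            results.append((window_start, sub_start + progress_interval, running))
    return results


partials = progressive_tumbling_sum(
    [(1, 10), (4, 5), (7, 1), (12, 2)], tumble_size=10, progress_interval=5
)
# partials == [(0, 5, 15), (0, 10, 16), (10, 15, 2), (10, 20, 2)]
```

A hopping-window variant would apply the same check against the hop size instead of the tumble size.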

Hello, I have a CSV file that has 9 features and 9 expected targets, and I want to test 2 regression models on this data (that should be generated as a stream).

When I test the MultiTargetRegressionHoeffdingTree and RegressorChain on this data I get a bad R2-score, but when I normalize my data with scikit-learn I get a pretty good R2-score. The problem is that I should not use sci
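
One way to normalize without seeing the whole dataset up front is to keep running statistics and scale each sample as it arrives. The sketch below assumes that the goal is incremental standardization of a stream. The class `RunningStandardScaler` is hypothetical, written in plain Python, and is not a scikit-multiflow or scikit-learn API. It uses Welford's algorithm to update the per-feature mean and variance one sample at a time.

```python
import math


class RunningStandardScaler:
    """Hypothetical per-feature standardizer updated one sample at a time."""

    def __init__(self, n_features):
        self.n = 0
        self.mean = [0.0] * n_features
        self.m2 = [0.0] * n_features  # running sum of squared deviations

    def update(self, x):
        """Fold one sample into the running mean and variance (Welford)."""
        self.n += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.n
            self.m2[i] += delta * (xi - self.mean[i])

    def transform(self, x):
        """Scale one sample with the statistics seen so far."""
        out = []
        for i, xi in enumerate(x):
            var = self.m2[i] / self.n if self.n > 0 else 0.0
            std = math.sqrt(var)
            # Features with zero variance so far are mapped to 0.0.
            out.append((xi - self.mean[i]) / std if std > 0 else 0.0)
        return out


scaler = RunningStandardScaler(n_features=1)
for sample in ([1.0], [2.0], [3.0]):
    scaled = scaler.transform(sample)  # scale with past statistics only
    scaler.update(sample)              # then learn from the new sample
```

Transforming before updating keeps the scaling consistent with a test-then-train evaluation, since each sample is scaled using only data seen earlier in the stream.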

CASE doesn't work well with null. This works as expected and prints 'works':

WITH 2 AS name
RETURN CASE name
    WHEN 2 THEN 'works'
    WHEN null THEN "doesn't work"
    ELSE 'something went wrong'
END

If we change the first case from 2 to 3, it should print 'something went wrong', but instead it prints "doesn't work":

WITH 2 AS name
RETURN CASE name
    WHEN 3 THEN 'works'
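
The behavior the report expects can be modeled in a few lines of Python. This is only a sketch of the expected semantics, assuming that the simple CASE form matches by equality and that comparing any value with null yields null rather than true. The functions `cypher_equals` and `simple_case` are hypothetical helpers, not part of any database engine.

```python
def cypher_equals(a, b):
    """Equality with null propagation: comparing with None yields None."""
    if a is None or b is None:
        return None
    return a == b


def simple_case(value, branches, default=None):
    """Return the result of the first branch whose comparison is exactly True."""
    for candidate, result in branches:
        if cypher_equals(value, candidate) is True:
            return result
    return default


first = simple_case(
    2, [(2, "works"), (None, "doesn't work")], "something went wrong"
)
second = simple_case(
    2, [(3, "works"), (None, "doesn't work")], "something went wrong"
)
```

Under these assumptions, `first` is 'works' and `second` falls through to 'something went wrong', because the `WHEN null` branch never compares as true.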

It is currently hard for users to track which versions of dependencies they are getting and which versions they should use when adding extra dependencies to their projects. This results in code like this in our own example projects:

libraryDependencies ++= Seq(
        "com.lightbend.akka"     %% "akka-stream-alpakka-file"  % "1.1.2",
        "com.typesafe.akka"      %% "akka-http-spray-js

Is your feature request related to a problem? Please describe.
Today the user needs to deploy UDF JARs and reference data CSVs manually to the blob location

Describe the solution you'd like
Enable the user to choose a file on a local disk which the web portal will then upload to the right location

I totally forgot we have machinery for this in the "multi program" tests. We can likely reuse this on the "external ctrl-c" tests as well!

Here are 299 public repositories matching this topic...

newTendermint / awesome-bigdata

johnkerl / miller

benthosdev / benthos

online-ml / river

provectus / kafka-ui

RaRe-Technologies / smart_open

pravega / pravega

microsoft / Trill

python-streamz / streamz

reugn / go-streams

zpl-c / zpl

joshday / OnlineStats.jl

scikit-multiflow / scikit-multiflow

Stratio / sparta

infoslack / awesome-kafka

swimos / swim

kLabUM / rrcf

memgraph / memgraph

radiantly / you-cant-download-this-image

lightbend / cloudflow

microsoft / data-accelerator

bbejeck / kafka-streams-in-action

guillermo-navas-palencia / optbinning

Chulong-Li / Real-time-Sentiment-Tracking-on-Twitter-for-Brand-Improvement-and-Trend-Recognition

ast-al / rangeless

goodboy / tractor

selimfirat / pysad

maraisr / meros

Western-OC2-Lab / PWPAE-Concept-Drift-Detection-and-Adaptation

GridProtectionAlliance / gsf
