streaming-data

Under the hood, the Benthos csv input uses the csv.Reader struct from the standard encoding/csv package.

The current implementation of csv input doesn't allow setting the LazyQuotes field.

We have a use case where we need to set the LazyQuotes field in order to make things work correctly.
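To illustrate what the LazyQuotes field changes, here is a minimal sketch that uses Go's encoding/csv directly. It does not show Benthos configuration; the helper name parseCSV and the sample input are assumptions made for illustration only.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseCSV is a hypothetical helper: it reads all records from input
// with encoding/csv and copies the lazy flag to csv.Reader.LazyQuotes.
func parseCSV(input string, lazy bool) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.LazyQuotes = lazy
	return r.ReadAll()
}

func main() {
	// Sample data (assumed): a bare quote inside an unquoted field.
	const input = "a,b\"c,d\n"

	// With LazyQuotes disabled, the reader reports a parse error.
	_, err := parseCSV(input, false)
	fmt.Println("strict:", err)

	// With LazyQuotes enabled, the quote is kept as part of the field.
	rows, err := parseCSV(input, true)
	fmt.Println("lazy:", rows, err)
}
```

With the flag off, the reader rejects the record; with it on, the quote is treated as ordinary field content.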

Describe the bug

After using Restart failed Task in a connector with failed tasks, the State is still FAILED while Tasks Failed shows 0, even after switching to the Tasks tab and returning to the connector.
In my case both indicators matched after returning to the Connectors overview (however it could be just coincidence and the state ne

Problem description

I am getting the following error when reading a file from an S3 bucket:

Invalid bucket name "xxxx:yyyy@bucket": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:s3:[a-z\-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]acce
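As a rough illustration of why this name is rejected, the sketch below checks the two candidate names against the first regex quoted in the error message. It uses Go's regexp package only to demonstrate the pattern; it is not the validation code of the library that raised the error, and the helper name isPlainBucketName is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// bucketNameRE is the plain bucket-name pattern copied from the error message.
var bucketNameRE = regexp.MustCompile(`^[a-zA-Z0-9.\-_]{1,255}$`)

// isPlainBucketName is a hypothetical helper that reports whether s
// matches the plain bucket-name pattern.
func isPlainBucketName(s string) bool {
	return bucketNameRE.MatchString(s)
}

func main() {
	// The first name still carries the "xxxx:yyyy@" prefix, and neither
	// ':' nor '@' is in the allowed character class.
	for _, name := range []string{"xxxx:yyyy@bucket", "bucket"} {
		fmt.Printf("%q matches: %v\n", name, isPlainBucketName(name))
	}
}
```

The pattern only accepts letters, digits, dots, hyphens and underscores, so a name that still contains the credentials prefix cannot match.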

Implement progressive versions of hopping and tumbling windows:

Both window macro methods should gain versions that take an additional parameter.
The parameter should represent the time interval used to produce intermediate results of aggregations.
The parameter should be a clean divisor of the tumble size for tumbling windows and of the hop size for hopping windows.

Hello, I have a CSV file that has 9 features and 9 expected targets, and I want to test 2 regression models on this data (that should be generated as a stream).

When I test the MultiTargetRegressionHoeffdingTree and RegressorChain on this data I get a bad R2-score, but when I tried normalizing my data with scikit-learn I get a pretty good R2-score. The problem is that I should not use sci

CASE doesn't work well with null. This works as expected and prints 'works':

WITH 2 AS name
RETURN CASE name
    WHEN 2 THEN 'works'
    WHEN null THEN "doesn't work"
    ELSE 'something went wrong'
END

If we change the first case from 2 to 3, it should print 'something went wrong', but instead it prints "doesn't work":

WITH 2 AS name
RETURN CASE name
    WHEN 3 THEN 'works'

It is currently hard for users to track which versions of dependencies they are getting and which versions they should use when adding extra dependencies to their projects. This results in code like this in our own example projects:

libraryDependencies ++= Seq(
        "com.lightbend.akka"     %% "akka-stream-alpakka-file"  % "1.1.2",
        "com.typesafe.akka"      %% "akka-http-spray-js

Is your feature request related to a problem? Please describe.
Today the user needs to deploy UDF JARs and reference data CSVs manually to the blob location.

Describe the solution you'd like
Enable the user to choose a file on a local disk which the web portal will then upload to the right location

I totally forgot we have machinery for this in the "multi program" tests. We can likely reuse this on the "external ctrl-c" tests as well!

Here are 297 public repositories matching this topic...

0xnr / awesome-bigdata

johnkerl / miller

benthosdev / benthos

online-ml / river

provectus / kafka-ui

RaRe-Technologies / smart_open

pravega / pravega

microsoft / Trill

python-streamz / streamz

reugn / go-streams

zpl-c / zpl

joshday / OnlineStats.jl

scikit-multiflow / scikit-multiflow

Stratio / sparta

infoslack / awesome-kafka

swimos / swim

kLabUM / rrcf

radiantly / you-cant-download-this-image

memgraph / memgraph

lightbend / cloudflow

microsoft / data-accelerator

bbejeck / kafka-streams-in-action

guillermo-navas-palencia / optbinning

Chulong-Li / Real-time-Sentiment-Tracking-on-Twitter-for-Brand-Improvement-and-Trend-Recognition

ast-al / rangeless

goodboy / tractor

selimfirat / pysad

Western-OC2-Lab / PWPAE-Concept-Drift-Detection-and-Adaptation

GridProtectionAlliance / gsf

maraisr / meros
