Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_runner: fix it concurrency #43757

Merged
merged 2 commits into from Jul 15, 2022

Conversation

MoLow
Copy link
Member

@MoLow MoLow commented Jul 10, 2022

No description provided.

@nodejs-github-bot nodejs-github-bot added needs-ci test_runner labels Jul 10, 2022
@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 10, 2022

CC @aduh95

lib/internal/test_runner/test.js Outdated Show resolved Hide resolved
@MoLow MoLow force-pushed the test-runner-fix-it-concurrency branch from 238a705 to ba13d84 Compare Jul 10, 2022
aduh95
aduh95 approved these changes Jul 10, 2022
@aduh95
Copy link
Contributor

@aduh95 aduh95 commented Jul 10, 2022

/cc @nodejs/test_runner

@aduh95 aduh95 added author ready request-ci labels Jul 10, 2022
@github-actions github-actions bot removed the request-ci label Jul 10, 2022
@nodejs-github-bot
Copy link
Contributor

@nodejs-github-bot nodejs-github-bot commented Jul 10, 2022

lpinca
lpinca approved these changes Jul 10, 2022
@aduh95
Copy link
Contributor

@aduh95 aduh95 commented Jul 10, 2022

Hum it looks like it makes message/test_runner_desctibe_it flaky. @MoLow do you know where this might be coming from?

=== release test_runner_desctibe_it.js ===
Path: message/test_runner_desctibe_it.js
Error: --- stderr ---
node:internal/test_runner/harness:29
      throw err;
      ^
TypeError: Cannot mix BigInt and other types, use explicit conversions
    at ItTest.report (node:internal/test_runner/test:425:42)
    at ItTest.finalize (node:internal/test_runner/test:418:10)
    at Test.processReadySubtestRange (node:internal/test_runner/test:195:15)
    at ItTest.postRun (node:internal/test_runner/test:388:19)
    at Test.postRun (node:internal/test_runner/test:370:17)
    at process.exitHandler (node:internal/test_runner/harness:79:10)
    at process.emit (node:events:513:28)
Node.js v19.0.0-pre

@aduh95 aduh95 removed the author ready label Jul 10, 2022
@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 10, 2022

Hum it looks like it makes message/test_runner_desctibe_it flaky. @MoLow do you know where this might be coming from?

@aduh95
yes, it happens when a test is canceled and was not yet assigned a start time. you can see I already fixed that here
adding concurrency introduced a timing condition that made it more likely to happen since previously when a suite ran serially - either a child test did not run or it did, it was never queued.

I suggest waiting for the fix to land, then rebase and make sure test is stable in this branch

@aduh95 aduh95 added the blocked label Jul 10, 2022
@MoLow MoLow force-pushed the test-runner-fix-it-concurrency branch from 2964c88 to 8d66424 Compare Jul 12, 2022
@aduh95 aduh95 added author ready request-ci commit-queue-squash and removed blocked labels Jul 12, 2022
@github-actions github-actions bot removed the request-ci label Jul 12, 2022
@nodejs-github-bot
Copy link
Contributor

@nodejs-github-bot nodejs-github-bot commented Jul 12, 2022

@nodejs-github-bot
Copy link
Contributor

@nodejs-github-bot nodejs-github-bot commented Jul 12, 2022

@nodejs-github-bot
Copy link
Contributor

@nodejs-github-bot nodejs-github-bot commented Jul 12, 2022

@juliangruber
Copy link
Member

@juliangruber juliangruber commented Jul 12, 2022

I'm surprised by this, is it actually expected to run it tests in parallel?

@ljharb
Copy link
Member

@ljharb ljharb commented Jul 12, 2022

I don't think it is - jest runs files in parallel (as does mocha, i believe), but i'm not aware of anything that runs individual its in parallel.

@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 13, 2022

perhaps this should be a feature that is off by default?

@JakobJingleheimer
Copy link
Contributor

@JakobJingleheimer JakobJingleheimer commented Jul 14, 2022

where tests pass in isolation but would fail when ran together.

Not to derail this PR: that sounds an XY problem: either the test is bad or the technical design of the implementation being tested is flawed. In 20 years of engineering, the root-cause of these kinds of problems has always been one of those.

@lpinca
Copy link
Member

@lpinca lpinca commented Jul 14, 2022

Even Node.js tools/test.py allows to specify the number of parallel tasks to run via the -j option and it make sense to me. I want control over this.

@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 14, 2022

I think it should be opt-out: tests should be isolated. If parallelism causes a problem, you have a different problem that parallelism is merely revealing.

I just want to point out that the scope of this discussion should also cover test and not only it - this PR aligns it to behave the same as test, and their defaults probably should be the same.

@JakobJingleheimer
Copy link
Contributor

@JakobJingleheimer JakobJingleheimer commented Jul 14, 2022

Even Node.js tools/test.py allows to specify the number of parallel tasks to run via the -j option and it make sense to me. I want control over this.

I'll open a separate PR to add a true option. We don't need to choose one or the other.

Sorry, thank you for the fix @MoLow.

@lpinca
Copy link
Member

@lpinca lpinca commented Jul 14, 2022

I'll open a separate PR to add a true option. We don't need to choose one or the other.

We are going off-topic but I don't see the need for it. -j 0 or -j omitted is equivalent to your true, that is use all cores, but again I want to be able to specify 1,2,...n, not all or nothing.

@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 14, 2022

Even Node.js tools/test.py allows to specify the number of parallel tasks to run via the -j option and it makes sense to me. I want control over this.

I think everyone wants control over concurrency. the question is whether it should be opt-in or opt-out.
the question is if the node test runner favors speed or favors stability,
it is true that good tests should be isolated - but the fact is many (probably most) authors write some tests that are not (just take a look at the CI of this repo :/ ), so I think "test should be isolated" is not a valid argument here

@cjihrig
Copy link
Contributor

@cjihrig cjihrig commented Jul 14, 2022

My personal opinion is that within a single test file, things should happen serially by default. The CLI runner should execute files in parallel by default. Additional parallelism should be opt-in.

@JakobJingleheimer
Copy link
Contributor

@JakobJingleheimer JakobJingleheimer commented Jul 14, 2022

but again I want to be able to specify 1,2,...n, not all or nothing.

I think you misread/misunderstood my comment. I didn't propose removing the integer option. I proposed adding a true option. If you want to hassle with some number, fine, set concurrency: 4. I don't, so I set concurrency: true. Or are you saying I can currently set concurrency: 0 and that will make it as concurrent as possible?

the question is if the node test runner favors speed or favors stability,

It doesn't have to be a binary choice ;)

it is true that good tests should be isolated - but the fact is many (probably most) authors write some tests that are not (just take a look at the CI of this repo :/ )

I'm working on it #43784 I just need this PR's fix ;)

@ljharb
Copy link
Member

@ljharb ljharb commented Jul 14, 2022

A true option that automatically chooses "number of CPUs minus one" sounds awesome to me, regardless of the default.

@lpinca
Copy link
Member

@lpinca lpinca commented Jul 14, 2022

Ok I probably misunderstood. It was not clear that you wanted to add true. What I understood was that you wanted to turn the integer option into a boolean only one.

@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 14, 2022

Since this PR conflicts main and needs to be updated anyway - I will add the conclusion gathered in this discussion and re-request code reviews

@MoLow MoLow force-pushed the test-runner-fix-it-concurrency branch from 8d66424 to 5eccc6b Compare Jul 14, 2022
@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 14, 2022

so the current implementation in this PR works this way:

  • in the level of a test/it the default concurrency is 1
  • when running --test, the default will be the number of CPUs, see:

// TODO(cjihrig): Use uv_available_parallelism() once it lands.
const rootConcurrency = isTestRunner ? cpus().length : 1;

  • I want to leave support for boolean value for the concurrency setting out of the scope of this PR since the number of CPUs as a default only makes sense to me at the file level since each file spawns its own process. inside a file this isn't necessarily a good default - I will open a separate issue to discuss this

@MoLow MoLow added the request-ci label Jul 14, 2022
@github-actions github-actions bot removed the request-ci label Jul 14, 2022
@nodejs-github-bot
Copy link
Contributor

@nodejs-github-bot nodejs-github-bot commented Jul 14, 2022

@juliangruber
Copy link
Member

@juliangruber juliangruber commented Jul 15, 2022

so the current implementation in this PR works this way:

  • in the level of a test/it the default concurrency is 1

The code still looks like it runs all tests in parallel with unlimited concurrency, also the testTimeout has been removed

@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 15, 2022

The code still looks like it runs all tests in parallel with unlimited concurrency, also the testTimeout has been removed

the default is currently 1, see #43757 (comment)

@MoLow
Copy link
Member Author

@MoLow MoLow commented Jul 15, 2022

also the testTimeout has been removed

I am not sure if timeout is needed in describe and if it should support an asynchronous function, anyway - I will sort that out in #43554

@benjamingr benjamingr added the commit-queue label Jul 15, 2022
@nodejs-github-bot nodejs-github-bot removed the commit-queue label Jul 15, 2022
@nodejs-github-bot nodejs-github-bot merged commit a3766bc into nodejs:main Jul 15, 2022
50 checks passed
@nodejs-github-bot
Copy link
Contributor

@nodejs-github-bot nodejs-github-bot commented Jul 15, 2022

Landed in a3766bc

@MoLow MoLow deleted the test-runner-fix-it-concurrency branch Jul 15, 2022
aduh95 pushed a commit to aduh95/node-core-test that referenced this issue Jul 16, 2022
PR-URL: nodejs/node#43757
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
Reviewed-By: Jacob Smith <jacob@frende.me>
(cherry picked from commit a3766bc8a84bcd375372a0a5c8c9174dadb6817d)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready commit-queue-squash needs-ci test_runner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet