Conversation
associated type). Commit before switching to simply Clone.
to externality (does panic). Also inline will be way more touchy to implement than expected, see comment on macro non send AsyncExt with enum handle.
(except spawn that will call directly the extension).
implement missing externalities function now that things are in place.
now marker need to get remove when joining or killing
Also consider spliting in two different types.
- writes access. - no backend containing overlay and primitive to directly access child worker externalities instead.
|
Looking at block processing, a possibly must have usage would be per extrinsic worker. Allowing to build a block execution that with per extrinsic paralellism if possible. But the current design is not compatible for it and would need to be extended. Currently we only log or declare:
Note that since write include read in this case, write accesses are also mutually exclusive. In the extrinsic case, without too much analysis, there is two immediate issues:
Both when use with the previous design requires write access and are use (or almost use) in all extrinsic, So additional log (declaration) to allow that would be:
For those cases I firstly thought of bufferizing then reordering writes depending on 'spawn' order but everything is easier when using 'join' order. The question remains of keeping both 'write' and 'write only': both are redundant. But if we consider that 'write' usually TLDR; current semantic of write including read access seems too limiting, and write and read should probably be disjoint. |
This is fine. |
|
for the index and event logs, we would just have those be collected in-thread and passed out to the aggregator (the main thread) which would join and aggregate them manually. they wouldn't try to access storage for any of these globals directly. |
This is fine. this really sounds nice. |
|
I am splitting this PR in multiple smaller one, first part is #7978 , closing this one. |
This is a draft pr with some code around runtime workers.
The goal is to be able to access and later modify state from workers.
Design principles are:
design: see failure on transaction drop for 'ReadAt' mode).
Please note, that if we do force worker to Always join (and panic on a worker that did not join), we could considerably simplify implementation:
I think this could be a safe design principle but it also mean that we need to panic on drop from state-machine (not really difficult, just rather impacting and not really aligned with previous runtime worker pr \cc @NikVolf ).
Current modes of execution
This PR defines multiple state access mode, as a good way to start a discussion.
In practice only a few are really interesting.
Those are define in sp_exrternalities/src/lib.rs .
Note that in no case the child worker can see changes done by the parent worker during
execution, keeping thing deterministic that way (AtSpawn variant therefore can run incoherent code).
Stateless
The existing worker type, cannot access state, spawning a worker of a different
type is not doable.
ReadLastBlock
Read only access to previous block state, not usefull, but test the backend access
to state.
ReadAtSpawn
Read only access to the existing state when calling 'spawn'.
As all next worker state access mode:
Simply add the current state-machine changes in front of the state.
Involves change copy that depends on the number of running change (no share state).
Ensure that the transaction on join is the transaction at spawn or a transaction stacked
over it (see overlaychange/markers.rs).
ReadOptimistic
Will logs child worker read access and parent worker write access.
On join if there is conflict between both access no result is returned (and
user have to run worker without concurrency on its own).
see overlaychange/loggers.rs
So this should be default as we ensure everything did run over a coherent state (since change in parent
worker don't overlap with child accesses).
ReadDeclarative
Declare (single key or prefix) child worker read access on spawn.
If child worker access a variable out of the declaration we can panic or return None on join as for optimistic (in this case there is no early exit :().
see overlaychange/filters.rs
WriteAtSpawn
ReadWrite access to state as defined at spawn.
In case of conflict the changes from parent worker are overwritten by child worker changes on join.
So this is deterministic but can result in some incoherent changes.
WriteLightOptimistic
In this case we only forbid to write both in parent and child.
So it can still be incoherent regarding read access.
WriteLightDeclarative
We need to declare child write accesses.
WriteOptimistic
To be coherent, we check parent read against child write, parent write against child write and parent write against child read.
This is maybe the only usefull mode.
WriteDeclarative
We need to declare child read and write accesses.
TODO list
[x] Reuse worker instance (only locally to an execution), and add some capacity limit.
We simply added with multiple consumer and a single sender.
Note that we reuse existing worker ids to manage that queue. For this reason, now native also uses these ids
and cannot be call directly without externalities.
[x] Allow running without any thread creation (what happen when capacity limit is reach).
[x] Different worker capability (the current model got a caveat: nested call cannot spawn worker with higher capability).
The read mode over previous block state is more here to test and should be remove I think.
https://github.com/cheme/substrate/blob/cba0105459699e43813ca8d9b78b13b330e97ff5/primitives/tasks/src/lib.rs#L74
[x] Access storage from worker.
[x] Ability to use with cumulus (see substrate runtime test). The 'spawn' implementation do not create threads but
run the process lazilly by casting back the wasm fn pointer. Also note that we do not catch panic (this explains
that I don't consider silencing panic result from workers).
https://github.com/cheme/substrate/blob/cba0105459699e43813ca8d9b78b13b330e97ff5/test-utils/runtime/src/lib.rs#L1206
[x] Create draft pr.
[] Read again and update docs (some late design change may not be in the docs).
[x] Read only with access restricted to untouched values
(join returning none on concurrent access)
[x] Read only with access restricted to declared values/prefix
(panicking version, parent panic on write of such values, child panic on write of undeclared values)
[x] Write with access/write restricted to untouched values
[x] Write only with access/write restricted to declared values/prefix
[] Rewrite optimistic logger for performance
[] Allow to spawn additional worker type from main ones. Especially 'declared' from their 'optimistic' variant and the other way around (need to avoid runing optimistic at all in some case). see function 'guard_compatible_child_workers' for current support.
[] Try to find a better way to handle unsafe call to externality: the fact that 'spawn' is an externalities extension
that itself uses externalities is currently using an unsafe call. In practice it is only an issue if the code does modify
the extension (unregistering especially). So another idea would be to encapsulate it in an ext implementation
with locked access to register/unregister extension for a specific TypeId.
[] Restore tests from sp_tasks: the native spawn now need to query the the spawn extension from externalities (it
was less code). Core issue here is that the default implementation is at 'client' level and moving it in 'primitives'
requires moving some wasm trait from 'client common' which currently got error with many deps. I initially did it
but code got quickly hard to read (lot of no_std caveats).
[] Configurable capacity (number of worker per execution), currently we do not really limit worker (but mechanism are in place
to be able to).
[] write more test on capacity management, especially trying to deadlock (currently we reduce nested capacity by the depth of
nested calls, (seems sound, but at some point I did consider it unsound and can't remember why)).
[] execution strategy both is not really doing what I would like in 'test-utils': find a way to test native without switching to wasm on error.
[] more tests
[] consider allowing more externalities from worker: root should be doable but this would leak hash trait everywhere, at
this point it does not seems worth it, same issue with 'storage_hash'.
[] Concurrent read change overlay: at this point change overlay is cloned on spawn.
I will probably skip this point because the design I have in mind is similar to the transactional design,
that wasn't merge in a previous pr and this will got the same issue (a bit more since it is slightly more
complex due to the fact that the index will be two dimensions). Still very doable.
[] dyn_clonable should work with no_std
[] find a way to kill running threads with dismiss: this is a bit out of scope of this PR, but it is really not easy.
At this point dismiss is still useful to avoid running queued worker that did not start when a thread did panic.
Actually that may be useless in this case (panic does panic parent worker).
It is just usefull when we call it from runtime then (eg some result in parent thread tells me the worker result is
not needed anymore).
So could also consider dropping some complexity there (the dismiss handler). It is already behind 'abort-future' feature.