
Single Operator Execution Interface #4453

Status: Open. Wants to merge 23 commits into base: master.

Conversation

@orausch (Contributor) commented Jul 8, 2020

Description: This PR adds an interface to the C ABI that enables the execution of single ONNX nodes without the overhead of graph construction and memory allocation.

Motivation and Context
The alternative way to execute single operators/nodes is to create an ONNX graph containing a single node only. However, this (understandably) adds a lot of overhead, as can be seen in the plot below.

[Plot: execution-time overhead of running a single-node ONNX graph vs. the proposed single-operator interface]

Here is an example of how the API can be used (UPDATE: new API for adding attributes):

// Setup the context
OrtExecutableKernelContext* kernel_context;
CheckStatus(OrtApi->CreateExecutableKernelContext("MyConv", "Conv", &kernel_context));

// Add parameter X
CheckStatus(OrtApi->ExecutableKernelContext_AddInput(kernel_context, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT));
// Add parameter W
CheckStatus(OrtApi->ExecutableKernelContext_AddInput(kernel_context, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT));
// Add parameter Y
CheckStatus(OrtApi->ExecutableKernelContext_AddOutput(kernel_context, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT));

// Setup attributes
CheckStatus(OrtApi->ExecutableKernelContext_AddAttributeString(kernel_context, "auto_pad", "NOTSET"));
CheckStatus(OrtApi->ExecutableKernelContext_AddAttributeInt(kernel_context, "group", 1));
{
	// Setup attribute strides
	int64_t values[2];
	values[0] = 2;
	values[1] = 2;

	CheckStatus(OrtApi->ExecutableKernelContext_AddAttributeInts(kernel_context, "strides", values, 2));
}
// Create the executable kernel
OrtExecutableKernel* kernel;
CheckStatus(OrtApi->CreateExecutableKernel(__ort_session, kernel_context, /*provider_index=*/0, &kernel));

// Execute the kernel
CheckStatus(OrtApi->ExecutableKernel_SetInput(kernel, 0, ort_value_input_X));
CheckStatus(OrtApi->ExecutableKernel_SetInput(kernel, 1, ort_value_input_W));
CheckStatus(OrtApi->ExecutableKernel_SetOutput(kernel, 0, ort_value_output_Y));
CheckStatus(OrtApi->ExecutableKernel_Compute(kernel));

It has been tested with the CPU and CUDA execution providers.

orausch added 4 commits May 15, 2020
The current design implements a new ExecutionFrame that is used to
execute the op. This is less than optimal; I will attempt to change this
in the future.

The API will also have to be extended to add other providers than CPU.
With this change, the ExecutableKernelContextImpl is initialized at kernel
creation, and not at compute time, which should remove some overhead. This
allows multiple calls with different data to be made using the same kernel.

Furthermore, the main graph of the different op kernels is now shared through
OrtKernelSession.
* Reuse providers across Kernels
* Support CUDA providers
@orausch requested a review from microsoft/onnxruntime as a code owner Jul 8, 2020
@microsoft-cla bot commented Jul 8, 2020

CLA assistant check
All CLA requirements met.

@Craigacp (Contributor) commented Jul 8, 2020

This looks interesting to expose into the Java API, but is there a reason why the input and output arguments are specified separately from the call to compute? In session.run they are supplied to that call, and I feel like that maps a little more naturally.

@orausch (Contributor, Author) commented Jul 8, 2020

> This looks interesting to expose into the Java API, but is there a reason why the input and output arguments are specified separately from the call to compute? In session.run they are supplied to that call, and I feel like that maps a little more naturally.

I'm not particularly tied to this exact API; if exposed to the Java or Python API, it could be implemented similarly to session.run. The current option was just easy to implement, and it has the performance advantage of not creating any intermediate data structures.

@liqunfu added the api:C/C++ label Aug 3, 2020
@hariharans29 (Member) commented Aug 13, 2020

/azp run Linux CPU CI Pipeline,Linux CPU x64 NoContribops CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,MacOS CI Pipeline,MacOS NoContribops CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline

@azure-pipelines bot commented Aug 13, 2020

Pull request contains merge conflicts.
@hariharans29 (Member) commented Aug 13, 2020

/azp run orttraining-linux-ci-pipeline,orttraining-mac-ci-pipeline,orttraining-linux-gpu-ci-pipeline,centos7_cpu,Linux OpenVINO CI Pipeline

@azure-pipelines bot commented Aug 13, 2020

Pull request contains merge conflicts.
orausch added 2 commits Aug 17, 2020
# Conflicts:
#	include/onnxruntime/core/session/onnxruntime_c_api.h
#	onnxruntime/core/session/onnxruntime_c_api.cc
#	onnxruntime/core/session/ort_apis.h
@faxu (Contributor) commented Aug 17, 2020

/azp run Windows GPU CI Pipeline, WIndows GPU TensorRT CI Pipeline, centos7_cpu, centos7_cpu (linux_centos_ci Debug), centos7_cpu (linux_centos_ci Release), orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline

@faxu (Contributor) commented Aug 17, 2020

/azp run Linux CPU CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, MacOS NoContribops CI Pipeline, Windows CPU CI Pipeline

@azure-pipelines bot commented Aug 17, 2020

Azure Pipelines successfully started running 5 pipeline(s).
@azure-pipelines bot commented Aug 17, 2020

Azure Pipelines successfully started running 8 pipeline(s).
@faxu (Contributor) commented Aug 18, 2020

/azp run Windows GPU CI Pipeline, WIndows GPU TensorRT CI Pipeline, centos7_cpu, centos7_cpu (linux_centos_ci Debug), centos7_cpu (linux_centos_ci Release), orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline

@faxu (Contributor) commented Aug 18, 2020

/azp run Linux CPU CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, MacOS NoContribops CI Pipeline, Windows CPU CI Pipeline

@azure-pipelines bot commented Aug 18, 2020

Azure Pipelines successfully started running 5 pipeline(s).
@azure-pipelines bot commented Aug 18, 2020

Azure Pipelines successfully started running 8 pipeline(s).
@faxu (Contributor) commented Aug 19, 2020

@orausch are you planning more updates? Let me know when the CIs are ready to be run.

orausch added 2 commits Aug 26, 2020
This fixes some issues with node schema resolution
@orausch (Contributor, Author) commented Aug 26, 2020

@faxu, after the latest changes I believe that the PR should pass the CI (I managed to build it on Linux and Windows using build.sh and build.bat).

@faxu (Contributor) commented Aug 26, 2020

/azp run Windows GPU CI Pipeline, WIndows GPU TensorRT CI Pipeline, centos7_cpu, centos7_cpu (linux_centos_ci Debug), centos7_cpu (linux_centos_ci Release), orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline

@faxu (Contributor) commented Aug 26, 2020

/azp run Linux CPU CI Pipeline, Linux CPU x64 NoContribops CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, MacOS NoContribops CI Pipeline, Windows CPU CI Pipeline

@azure-pipelines bot commented Aug 26, 2020

Azure Pipelines successfully started running 5 pipeline(s).
@azure-pipelines bot commented Aug 26, 2020

Azure Pipelines successfully started running 8 pipeline(s).
@jywu-msft (Member) commented Aug 31, 2020

This seems pretty useful. +@RyanUnderhill @pranavsharma

@jywu-msft requested a review from RyanUnderhill Aug 31, 2020
@codemzs (Member) commented Sep 21, 2020

Hi @jywu-msft / @orausch, either we get traction on this PR or we close it. Can you please drive this to closure? It has been outstanding for a while.

@orausch (Contributor, Author) commented Sep 22, 2020

Thanks for following up @codemzs. I think a good next step would be to get a review in from someone on the ORT team.

Let me know if there is any other way I can help drive this forward.

@codemzs (Member) commented Sep 22, 2020

@orausch I believe Pranav from ORT team will be looking at this.
