CUDA

CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphical processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.

I have seen comments, as well as Numba's own FAQ (https://numba.pydata.org/numba-doc/latest/user/faq.html), suggesting adding the following to understand how Numba handles loops:

from llvmlite import binding as llvm
llvm.set_option('','--debug-only=loop-vectorize')

You would then create your njit function and run it, and I believe the idea is that it prints debug information about whether

I am working on creating a WandbCallback for Weights and Biases. I am glad that CatBoost has a callback system in place, but it would be great if we could extend the interface.

The current callback supports only after_iteration, which takes info. Taking inspiration from the XGBoost callback system, it would be great to also have before_iteration, which takes info, before_training, and `after
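To make the requested interface concrete, here is a minimal sketch of a callback object with the hooks named above. The class `LoggingCallback`, the function `toy_training_loop`, and the shape of `info` are hypothetical stand-ins for illustration only; this is not CatBoost's or XGBoost's code, and the actual hook signatures would be up to the maintainers.

```python
from types import SimpleNamespace


class LoggingCallback:
    """Hypothetical callback exposing the hooks discussed in the request."""

    def __init__(self):
        self.events = []

    def before_training(self):
        self.events.append("before_training")

    def before_iteration(self, info):
        self.events.append(f"before_iteration:{info.iteration}")

    def after_iteration(self, info):
        self.events.append(f"after_iteration:{info.iteration}")
        return True  # in this sketch, returning False stops the loop


def toy_training_loop(callback, n_iterations):
    """Stand-in for a training loop (an assumption, not library code)."""
    callback.before_training()
    for i in range(n_iterations):
        # `info` here is a made-up object carrying only an iteration index.
        info = SimpleNamespace(iteration=i, metrics={})
        callback.before_iteration(info)
        if not callback.after_iteration(info):
            break


cb = LoggingCallback()
toy_training_loop(cb, 2)
print(cb.events)
```

A logging integration would replace the `events` list with calls to its own logger inside each hook.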

Description

Calling vectorize with a non-None value for the signature parameter outputs this error message about the excluded parameter.

NotImplementedError: cupy.vectorize does not support `excluded` option currently.

Inspecting the code, there is clearly a copy-paste error: the second error message should refer to signature instead of excluded.

https://github.com/cupy/c
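The fix being described might look like the sketch below. The function `check_vectorize_options` is a hypothetical stand-in written for this illustration, not CuPy's actual source; only the `excluded` message text is taken from the report above, and the `signature` message is the assumed corrected wording.

```python
def check_vectorize_options(excluded=None, signature=None):
    """Hypothetical option check; not CuPy's real implementation."""
    if excluded is not None:
        raise NotImplementedError(
            "cupy.vectorize does not support `excluded` option currently.")
    if signature is not None:
        # Corrected message: name the option that was actually passed.
        raise NotImplementedError(
            "cupy.vectorize does not support `signature` option currently.")


try:
    check_vectorize_options(signature="(n),(n)->()")
except NotImplementedError as exc:
    message = str(exc)
    print(message)
```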

Is your feature request related to a problem? Please describe.
While reviewing PR #9817 to introduce DataFrame.diff, I noticed that it is restricted to acting on numeric types.

A time-series diff is probably a very common user need, for example when given a series of timestamps and seeking the durations between observations.
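As a library-independent illustration of the result being asked for, the sketch below computes durations between consecutive timestamps using only the Python standard library. It does not use cuDF or pandas, and the sample timestamps are made up.

```python
from datetime import datetime, timedelta

# Made-up observation times for illustration.
timestamps = [
    datetime(2021, 1, 1, 0, 0),
    datetime(2021, 1, 1, 0, 30),
    datetime(2021, 1, 1, 2, 0),
]

# Like a first-order diff, the first element has no predecessor, so it is None.
durations = [None] + [b - a for a, b in zip(timestamps, timestamps[1:])]

# Sum of the gaps equals the span from first to last observation.
total = sum(durations[1:], timedelta())
print(durations, total)
```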

Pandas supports diffs on non-numeric types like timestamps:

Is it possible to train a tmfile directly? The tengine-convert-tool conversion fails with an error:

tengine-lite library version: 1.4-dev
Get input tensor failed

Or is there an example that can train the tmfile below?
![Screenshot from 2021-05-27 07-01-46](https://user-images.githubusercontent.com/40915044/11

The current implementation of join can be improved by performing the operation in a single call to the backend kernel instead of multiple calls.

This is a fairly easy kernel and may be a good issue for someone getting to know CUDA/ArrayFire internals. Ping me if you want additional info.
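The idea of a single-pass join can be sketched on the host as follows: compute every input's starting offset in the output once, then copy each input exactly once. This is only an illustration on 1-D Python lists; the function name `join_single_pass` is hypothetical and none of this is ArrayFire's kernel code.

```python
from itertools import accumulate


def join_single_pass(arrays):
    """Concatenate 1-D lists with one offset computation and one copy per input."""
    sizes = [len(a) for a in arrays]
    # Exclusive prefix sum: the position where each input starts in the output.
    offsets = [0] + list(accumulate(sizes))[:-1]
    out = [None] * sum(sizes)
    for a, off in zip(arrays, offsets):
        out[off:off + len(a)] = a
    return out


joined = join_single_pass([[1, 2], [3], [4, 5, 6]])
print(joined)
```

In a GPU version, the same offsets would presumably let each thread work out which input its output element comes from, so one launch could cover all inputs.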

First of all, great library!

I am confused about the role of values_first2 in set_difference_by_key as mentioned here.

In general terms, the result of the set difference D = A - B will not contain values from B, therefore values_first2 can never be part of D. So `val
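The point can be illustrated with a small sketch of a key-based set difference over sorted inputs. This is an assumption-laden model of the semantics, written in Python for brevity, and not Thrust's implementation; the sketch deliberately takes no values for the second range, because the result never needs them.

```python
def set_difference_by_key(keys1, values1, keys2):
    """Sorted-merge difference: keep (key, value) pairs of range 1 whose key
    is not matched in range 2. Values of range 2 are never consulted."""
    i = j = 0
    keys_out, values_out = [], []
    while i < len(keys1) and j < len(keys2):
        if keys1[i] < keys2[j]:
            keys_out.append(keys1[i])
            values_out.append(values1[i])
            i += 1
        elif keys2[j] < keys1[i]:
            j += 1
        else:
            # Equal keys cancel one another out.
            i += 1
            j += 1
    # Whatever remains of range 1 has no counterpart in range 2.
    keys_out.extend(keys1[i:])
    values_out.extend(values1[i:])
    return keys_out, values_out


keys, values = set_difference_by_key(
    [0, 1, 3, 4, 5, 6, 9], list("abcdefg"), [1, 3, 5])
print(keys, values)
```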

Summary

I ran the code from the tutorial https://docs.oneflow.org/master/parallelism/05_ddp.html
for "data-parallel training by setting SBP", but it fails with

'MobileNetV2' object has no attribute 'to_global'

I also tried defining a NeuralNetwork class using class NeuralNetwork(nn.Module):
and model = NeuralNetwork().to(DEVICE),
then tried to use model.to_global to allocate the model to GPU clusters, but it

Report needed documentation

While the estimator guide offers a great breakdown of how to use many of the tools in api_context_managers.py, it would be helpful to have information right in the docstring during development to more easily understand what is actually going on in each of the provided functions/classes/methods. This is particularly important for

In order to test manually altered IR, it would be nice to have a --skip-compilation flag for futhark test, just like we do for futhark bench.

Here are 3,395 public repositories matching this topic...

NVIDIA / nvidia-docker

hashcat / hashcat

kaldi-asr / kaldi

numba / numba

catboost / catboost

isl-org / Open3D

cupy / cupy

chainer / chainer

hybridgroup / gocv

rapidsai / cudf

NVlabs / instant-ngp

OAID / Tengine

arrayfire / arrayfire

NVIDIA / thrust

Oneflow-Inc / oneflow

uber / aresdb

ROCm-Developer-Tools / HIP

rapidsai / cuml

Jittor / jittor

chrxh / alien

bytedance / lightseq

Celtoys / Remotery

NVIDIA / libcudacxx

NVIDIA / cuda-samples

diku-dk / futhark

dmlc / nnvm

graphistry / pygraphistry

NVIDIA / cutlass

NVIDIA / MinkowskiEngine

mp3guy / ElasticFusion

Related Topics