Issues: hpcaitech/ColossalAI
Issues list
[FEATURE]: I hope to support AMD's GPU in the future
enhancement
New feature or request
#3451
opened Apr 5, 2023 by
k503394469
[BUG]: ImportError: cannot import name 'ColoInitContext' from 'colossalai.zero'
bug
Something isn't working
#3447
opened Apr 4, 2023 by
Luoyang144
[BUG]: train loss is nan at stage 1
bug
Something isn't working
#3444
opened Apr 4, 2023 by
boundles
[BUG]: iter error when use ptx_coef in ppo:Expected input batch_size (98) to match target batch_size (64).
bug
Something isn't working
#3443
opened Apr 4, 2023 by
baibaiw5
[FEATURE]: Adapt the booster plugins with the refactored CheckpointIO API
enhancement
New feature or request
#3441
opened Apr 4, 2023 by
FrankLeeeee
[BUG]: Is it normal to have loss nan after the Stage 1 - Supervised Finetuning?
bug
Something isn't working
#3439
opened Apr 4, 2023 by
alibabadoufu
[BUG]: "OverflowError: int too big to convert" during SFT training using Bloom model
bug
Something isn't working
#3438
opened Apr 4, 2023 by
chengeharrison
[test] reorganize tests related to zero/gemini
gemini
related to the gemini feature
testing
related to our testing
#3437
opened Apr 4, 2023 by
ver217
[BUG]: META_COMPATIBILITY is not defined
bug
Something isn't working
#3435
opened Apr 4, 2023 by
mcc311
[BUG]: Fetching wrong data from pretrained dataloader when ptx_coef is not zero in Stage 3 training
bug
Something isn't working
#3432
opened Apr 4, 2023 by
yynil
[BUG]: WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 73958 closing signal SIGTERM
bug
Something isn't working
#3425
opened Apr 4, 2023 by
jialesmu
[BUG]: training GPT2-S using a single card on colab, AssertionError: You should use zero_ddp_wrapper first
bug
Something isn't working
#3423
opened Apr 4, 2023 by
LivinLuo1993
[BUG]: bug in training rm with ddp strategy with single machine multi-GPUs!
bug
Something isn't working
#3421
opened Apr 4, 2023 by
xHansonx
[BUG]: Incompatible between colossalai_zero2 and LoRA tuning
bug
Something isn't working
#3419
opened Apr 4, 2023 by
zhangliang-04
[BUG]: OSError: It looks like the config file at './pretrain/pytorch_model.bin' is not a valid JSON file. What is this problem?
bug
Something isn't working
#3416
opened Apr 3, 2023 by
xtu-xiaoc
[BUG]: mat1 and mat2 shapes cannot be multiplied (308x1024 and 768x320)
bug
Something isn't working
#3414
opened Apr 3, 2023 by
Happenmass
[BUG]: fp32 param and grad have different shape torch.Size([5064704]) vs torch.Size([128000]) when use lora_rank=4 at stage 1
bug
Something isn't working
#3412
opened Apr 3, 2023 by
boundles
[BUG]: Stuck at the stage 1 training for Chat application
bug
Something isn't working
#3407
opened Apr 3, 2023 by
alibabadoufu
why the labels are the same as the inputids in SFTDataset?
#3406
opened Apr 3, 2023 by
RankKCodeTalker
[BUG]: duplicate registrations for aten.convolution_backward.default
bug
Something isn't working
#3404
opened Apr 3, 2023 by
TrueNobility303
[BUG]: Cannot use pipeline and gemini at the same time
bug
Something isn't working
#3403
opened Apr 2, 2023 by
liuzeming-yuxi
[BUG]: Actor implementation might be buggy
bug
Something isn't working
#3402
opened Apr 2, 2023 by
yynil