My repositories map
✔ Machine Learning: ML Engineering | ML ways | Porting
✔ Guides: The Art of Debugging
✔ Applications: ipyexperiments
✔ Tools and Cheatsheets: bash | conda | git | jupyter-notebook | make | python | tensorboard | unix
Automatic GPU+CPU memory profiling, re-use and memory leaks detection using jupyter/ipython experiment containers
Currently, if the DL (DataLoader) doesn't have batch_size set because it uses batch_sampler, the DeepSpeed plugin code fails to work.
The problem comes from here:
Cache abstraction and Attention Sinks support
🚀 The feature, motivation and pitch: Why doesn't torchrun assign processes it launches using NUMA affinity? Data centers have 2 CPUs per node and se…