Not AI
I like big .vimrc and I cannot lie
- Sofia, Bulgaria
- 06:20 (UTC +03:00)
- https://ggerganov.com
- @ggerganov
- user/ggerganov
1,790 contributions in the last year
Contribution activity
June 2023
Created 22 commits in 3 repositories
Created a pull request in ggerganov/llama.cpp that received 2 comments
k-quants : allow to optionally disable at compile time
The new quantization types Q2_K, Q3_K, Q4_K, Q5_K and Q6_K can now optionally be disabled at compile time:
make:
LLAMA_NO_K_QUANTS=1 make
CMake:
cm…
+251 −229 • 2 comments
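The pull request above turns the k-quant types into a compile-time option. As an illustration only, here is a minimal C sketch of how such a preprocessor gate typically works; the GGML_USE_K_QUANTS definition below is a hypothetical name for whatever the build system sets by default and leaves undefined when LLAMA_NO_K_QUANTS=1 (or the equivalent CMake option) is used:

/* Minimal sketch of a compile-time feature gate.
 * GGML_USE_K_QUANTS is a hypothetical definition standing in for whatever
 * the real build system defines; it is not taken from the PR text. */
#include <stdio.h>

#ifdef GGML_USE_K_QUANTS
/* k-quant paths compiled in: the Q2_K .. Q6_K kernels would live here */
static void report_k_quants(void) {
    printf("k-quants enabled: Q2_K, Q3_K, Q4_K, Q5_K, Q6_K\n");
}
#else
/* built without k-quants: none of the k-quant code is emitted */
static void report_k_quants(void) {
    printf("k-quants disabled at compile time\n");
}
#endif

int main(void) {
    report_k_quants();
    return 0;
}

Compiling this sketch with cc -DGGML_USE_K_QUANTS sketch.c takes the enabled branch; omitting the define mirrors a LLAMA_NO_K_QUANTS=1 build, where the gated code never reaches the binary.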
Opened 2 other pull requests in 2 repositories
- ggerganov/llama.cpp: 1 merged
- ggerganov/ggml: 1 open
Reviewed 19 pull requests in 2 repositories
ggerganov/llama.cpp: 15 pull requests
- metal : add Q2_K implementation
- Q6_K implementation for Metal
- Fix warning on fprintf
- Q4_K implementation for Metal
- Add missing compile definition to CMakeLists for k_quants
- update flake to support metal on m1/m2
- Multi GPU support, CUDA refactor, CUDA scratch buffer
- Add checks for buffer size with Metal
- docs(performance): Add performance troubleshoot + example benchmark documentation
- fix small typo in README.md
- Share buffers between CPU and GPU
- ggml: Fix internal overflow in ggml_time_us on Windows
- k-quants
- Cuda refactor, multi GPU support
- llama : Metal inference
Started 1 discussion in 1 repository
ggerganov/llama.cpp
- Roadmap June 2023
This contribution was made on Jun 7