Feel free to merge my fork. It's about 20% faster on my computer (Ryzen 7 5700G CPU, medium.en model): https://github.com/Const-me/whisper.cpp It also contains VS2022 projects to build on Windows. Your CMake project results in AVX being disabled, and AVX is critical for performance.
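A quick way to check whether a given build actually has AVX enabled is to test the compiler's predefined `__AVX__` macro. This is only a minimal sketch: the helper name `built_with_avx` is made up for illustration and is not part of whisper.cpp or ggml.

```cpp
// Hypothetical helper, not part of whisper.cpp or ggml.
// Reports whether the compiler was allowed to emit AVX instructions.
// __AVX__ is predefined by GCC and Clang when building with -mavx
// (or -march=native on an AVX-capable CPU), and by MSVC when building
// with /arch:AVX or a higher /arch level.
constexpr bool built_with_avx() {
#if defined(__AVX__)
    return true;
#else
    return false;
#endif
}
```

Printing the result of `built_with_avx()` once at startup shows which code path a binary was compiled for. It says nothing about whether the CPU supports AVX at runtime.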
Also, I didn’t really understand the multithreading code in the ggml_graph_compute function, but IMO that custom thread pool implementation looks suspicious: it uses too many atomics. A better multithreading strategy might improve performance a lot.
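To illustrate what a strategy with fewer atomics could look like, here is a rough sketch of static partitioning: each thread gets a fixed block of rows decided up front, so the hot loop has no atomics and the only synchronization is the final join. This is not how ggml_graph_compute works and not what the fork does. The function `scale_rows_parallel` and its row-scaling workload are invented purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical workload: multiply every element of a row-major matrix by `factor`.
// Rows are split into contiguous blocks, one block per thread, before any
// thread starts. No thread touches another thread's rows, so no atomics or
// locks are needed inside the loop.
void scale_rows_parallel(std::vector<float>& data, std::size_t rows, std::size_t cols,
                         float factor, unsigned n_threads) {
    assert(data.size() == rows * cols);
    n_threads = std::max(1u, n_threads);
    const std::size_t chunk = (rows + n_threads - 1) / n_threads;  // ceil(rows / n_threads)

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t begin = std::min(rows, t * chunk);
        const std::size_t end = std::min(rows, begin + chunk);
        if (begin == end) break;  // more threads than rows: nothing left to hand out
        workers.emplace_back([&data, cols, factor, begin, end] {
            for (std::size_t i = begin * cols; i < end * cols; ++i) data[i] *= factor;
        });
    }
    for (auto& w : workers) w.join();  // the only synchronization point
}
```

Spawning threads on every call is itself expensive, so a real implementation would probably keep the workers alive and reuse them. The sketch only shows the partitioning idea.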