https://www.reddit.com/r/LocalLLaMA/comments/1iq6ngx/ktransformers_21_and_llamacpp_comparison_with/
https://github.com/ubergarm/r1-ktransformers-guide
Apparently it gets up to 9 tokens/s at Q4. Looks like it goes even higher on CPUs with Intel AMX support.
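
To check whether a given box has AMX, a quick sketch like the one below might help. It assumes Linux, where the kernel lists CPU features in `/proc/cpuinfo` using the flag names `amx_tile`, `amx_int8`, and `amx_bf16`. The helper names (`amx_flags_in`, `detect_amx`) are made up for this note and are not part of ktransformers or llama.cpp.

```python
from __future__ import annotations

from pathlib import Path

# Linux /proc/cpuinfo flag names for Intel AMX features.
AMX_FLAGS = {"amx_tile", "amx_int8", "amx_bf16"}


def amx_flags_in(cpuinfo_text: str) -> set[str]:
    """Return the AMX flags that appear on any 'flags' line of cpuinfo text."""
    found: set[str] = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # A flags line looks like "flags\t\t: fpu vme ... amx_tile ..."
            _, _, value = line.partition(":")
            found |= AMX_FLAGS & set(value.split())
    return found


def detect_amx(path: str = "/proc/cpuinfo") -> set[str]:
    """Read cpuinfo from disk; returns an empty set if the file is missing (non-Linux)."""
    p = Path(path)
    return amx_flags_in(p.read_text()) if p.exists() else set()


if __name__ == "__main__":
    flags = detect_amx()
    print("AMX flags:", sorted(flags) if flags else "none found")
```

This only reports whether the CPU advertises AMX. Whether a particular inference build actually uses it is a separate question.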