I just wanted to thank the MKL team for MKL_DIRECT_CALL. It works well with small matrices as well as large ones. I am seeing a 2x performance improvement when the matrices are small. Previously I had my own _inline gemm for small matrices because calling MKL zgemm was inefficient, but the new feature has allowed me to clean up my code as well as speed it up....