numerics
matrix.cpp File Reference

SIMD backend – hand-written AVX2/NEON matmul and matvec.

#include "core/matrix.hpp"
#include "../seq/impl.hpp"
#include <algorithm>
#include <cassert>


Namespaces

namespace  num
 
namespace  num::backends
 
namespace  num::backends::simd
 

Functions

void num::backends::simd::matmul (const Matrix &A, const Matrix &B, Matrix &C, idx block_size)
 
void num::backends::simd::matvec (const Matrix &A, const Vector &x, Vector &y)
 

Detailed Description

SIMD backend – hand-written AVX2/NEON matmul and matvec.

Compile-time dispatch:

  NUMERICS_HAS_AVX2 -> AVX-256 + FMA (x86-64, 4 doubles/register)
  NUMERICS_HAS_NEON -> ARM NEON (AArch64, 2 doubles/register)
  neither           -> falls back to cache-blocked scalar
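The dispatch above can be illustrated with a minimal sketch. It is not the code in matrix.cpp: the macro names come from the description, while `kLanes`, `kBackend` and the `axpy` helper are hypothetical names introduced only for this example. The sketch assumes the build system defines at most one of the two macros and, for the AVX2 path, that the compiler is invoked with AVX2 and FMA enabled (for example `-mavx2 -mfma` on GCC or Clang).

```cpp
#include <cassert>
#include <cstddef>

// Assumption: the build defines at most one of NUMERICS_HAS_AVX2 / NUMERICS_HAS_NEON.
#if defined(NUMERICS_HAS_AVX2)
#include <immintrin.h>
constexpr std::size_t kLanes = 4;            // one 256-bit YMM register holds 4 doubles
constexpr const char* kBackend = "avx2";
#elif defined(NUMERICS_HAS_NEON)
#include <arm_neon.h>
constexpr std::size_t kLanes = 2;            // one 128-bit Q register holds 2 doubles
constexpr const char* kBackend = "neon";
#else
constexpr std::size_t kLanes = 1;            // scalar fallback
constexpr const char* kBackend = "scalar";
#endif

// Hypothetical helper (not part of the library): y[i] += a * x[i] for i in [0, n).
inline void axpy(double a, const double* x, double* y, std::size_t n) {
    assert(x != nullptr && y != nullptr);
    std::size_t i = 0;
#if defined(NUMERICS_HAS_AVX2)
    const __m256d va = _mm256_set1_pd(a);    // broadcast a into all 4 lanes
    for (; i + kLanes <= n; i += kLanes) {
        __m256d vy = _mm256_loadu_pd(y + i);
        vy = _mm256_fmadd_pd(va, _mm256_loadu_pd(x + i), vy);  // va * x + vy
        _mm256_storeu_pd(y + i, vy);
    }
#elif defined(NUMERICS_HAS_NEON)
    const float64x2_t va = vdupq_n_f64(a);   // broadcast a into both lanes
    for (; i + kLanes <= n; i += kLanes) {
        float64x2_t vy = vld1q_f64(y + i);
        vy = vfmaq_f64(vy, va, vld1q_f64(x + i));              // vy + va * x
        vst1q_f64(y + i, vy);
    }
#endif
    // Scalar tail; with neither macro defined this loop handles every element.
    for (; i < n; ++i) y[i] += a * x[i];
}
```

Because the selection happens in the preprocessor, only one path is compiled into the binary and there is no runtime branch on CPU features.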

Both backends use the same register-tile structure:

  Outer cache tile: ii -> jj -> kk (B tile stays in L2)
  Inner reg tile:   4 rows x 4 cols (AVX: 4 YMM regs; NEON: 8 Q-regs)
  Hot k loop:       one vector FMA per row, zero loop overhead for j
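The loop nest described above can be sketched in portable scalar C++. This is an illustration of the tiling scheme only, not the implementation in matrix.cpp: `tiled_matmul` is a hypothetical function, it works on row-major `std::vector<double>` storage rather than the library's `Matrix` type, and it assumes square matrices whose size and block size are multiples of 4. Each row of the 4 x 4 accumulator tile stands in for what would be one YMM register on AVX2 or two Q registers on NEON.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: C += A * B for row-major n x n matrices.
// Assumes n and block are multiples of 4 so every register tile is full.
inline void tiled_matmul(const std::vector<double>& A,
                         const std::vector<double>& B,
                         std::vector<double>& C,
                         std::size_t n, std::size_t block) {
    assert(block > 0 && n % 4 == 0 && block % 4 == 0);
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);

    // Outer cache tile: ii -> jj -> kk.
    for (std::size_t ii = 0; ii < n; ii += block)
    for (std::size_t jj = 0; jj < n; jj += block)
    for (std::size_t kk = 0; kk < n; kk += block) {
        const std::size_t i_end = std::min(ii + block, n);
        const std::size_t j_end = std::min(jj + block, n);
        const std::size_t k_end = std::min(kk + block, n);

        // Inner register tile: 4 rows x 4 cols of C held in accumulators.
        for (std::size_t i = ii; i < i_end; i += 4)
        for (std::size_t j = jj; j < j_end; j += 4) {
            double acc[4][4];
            for (std::size_t r = 0; r < 4; ++r)
                for (std::size_t c = 0; c < 4; ++c)
                    acc[r][c] = C[(i + r) * n + (j + c)];

            // Hot k loop: one row of B (4 contiguous doubles) is reused by
            // all 4 rows of the tile; each row update maps to one vector FMA.
            for (std::size_t k = kk; k < k_end; ++k) {
                const double* b = &B[k * n + j];
                for (std::size_t r = 0; r < 4; ++r) {
                    const double a = A[(i + r) * n + k];  // broadcast scalar
                    for (std::size_t c = 0; c < 4; ++c)
                        acc[r][c] += a * b[c];
                }
            }

            for (std::size_t r = 0; r < 4; ++r)
                for (std::size_t c = 0; c < 4; ++c)
                    C[(i + r) * n + (j + c)] = acc[r][c];
        }
    }
}
```

In a vectorized version the fixed-length loop over `c` disappears into a single instruction, which is one way to read "zero loop overhead for j" in the description.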

Definition in file matrix.cpp.