|
numerics 0.1.0
|
Tier-1 kernel: raw-pointer, inline, zero-overhead inner loops. More...
Go to the source code of this file.
Namespaces | |
| namespace | num |
| namespace | num::kernel |
| namespace | num::kernel::raw |
Macros | |
| #define | NUM_K_AINLINE inline |
| #define | NUM_K_RESTRICT |
| #define | NUM_K_IVDEP |
Functions | |
| NUM_K_AINLINE void | num::kernel::raw::scale (real *NUM_K_RESTRICT x, real alpha, idx n) noexcept |
| x[i] *= alpha | |
| NUM_K_AINLINE void | num::kernel::raw::axpy (real *NUM_K_RESTRICT y, const real *NUM_K_RESTRICT x, real alpha, idx n) noexcept |
| y[i] += alpha * x[i] | |
| NUM_K_AINLINE void | num::kernel::raw::axpby (real *NUM_K_RESTRICT y, const real *NUM_K_RESTRICT x, real a, real b, idx n) noexcept |
| y[i] = a*x[i] + b*y[i] (fused scale-and-add, one memory pass) | |
| NUM_K_AINLINE void | num::kernel::raw::axpbyz (real *NUM_K_RESTRICT z, const real *NUM_K_RESTRICT x, const real *NUM_K_RESTRICT y, real a, real b, idx n) noexcept |
| z[i] = a*x[i] + b*y[i] (fused, three-array, one pass each) | |
| NUM_K_AINLINE real | num::kernel::raw::dot (const real *NUM_K_RESTRICT x, const real *NUM_K_RESTRICT y, idx n) noexcept |
| dot product: return sum x[i] * y[i] | |
| NUM_K_AINLINE real | num::kernel::raw::norm_sq (const real *NUM_K_RESTRICT x, idx n) noexcept |
| sum x[i]^2 (no sqrt; use for convergence checks to avoid sqrt cost) | |
| NUM_K_AINLINE real | num::kernel::raw::norm (const real *NUM_K_RESTRICT x, idx n) noexcept |
| Euclidean norm: sqrt(sum x[i]^2) | |
| NUM_K_AINLINE real | num::kernel::raw::l1_norm (const real *NUM_K_RESTRICT x, idx n) noexcept |
| L1 norm: sum |x[i]|. | |
| NUM_K_AINLINE real | num::kernel::raw::linf_norm (const real *NUM_K_RESTRICT x, idx n) noexcept |
| L-infinity norm: max |x[i]|. | |
| NUM_K_AINLINE real | num::kernel::raw::sum (const real *NUM_K_RESTRICT x, idx n) noexcept |
| Scalar sum: return sum x[i]. | |
| NUM_K_AINLINE void | num::kernel::raw::matvec (real *NUM_K_RESTRICT y, const real *NUM_K_RESTRICT A, const real *NUM_K_RESTRICT x, idx m, idx n) noexcept |
| y[i] = sum_j A[i*n + j] * x[j] (m x n row-major matrix) | |
| NUM_K_AINLINE void | num::kernel::raw::ger (real *NUM_K_RESTRICT A, const real *NUM_K_RESTRICT x, const real *NUM_K_RESTRICT y, real alpha, idx m, idx n) noexcept |
| Rank-1 update: A[i*n + j] += alpha * x[i] * y[j] (m x n row-major) | |
| NUM_K_AINLINE void | num::kernel::raw::trsv_lower (real *NUM_K_RESTRICT x, const real *NUM_K_RESTRICT L, const real *NUM_K_RESTRICT b, idx n) noexcept |
| Forward substitution: solve Lx = b, L lower triangular (n x n, row-major). | |
| NUM_K_AINLINE void | num::kernel::raw::trsv_upper (real *NUM_K_RESTRICT x, const real *NUM_K_RESTRICT U, const real *NUM_K_RESTRICT b, idx n) noexcept |
| Back substitution: solve Ux = b, U upper triangular (n x n, row-major). | |
Tier-1 kernel: raw-pointer, inline, zero-overhead inner loops.
These are the lowest-level building blocks in the numerics stack. Every higher tier (type-safe wrappers, operator interface, solvers) ultimately reduces to these loops.
Design invariants:
Operations: — Level 1 (vector) — scale(x, alpha, n) x[i] *= alpha axpy(y, x, alpha, n) y[i] += alpha * x[i] axpby(y, x, a, b, n) y[i] = a*x[i] + b*y[i] [fused, no BLAS eq.] axpbyz(z, x, y, a, b, n) z[i] = a*x[i] + b*y[i] [fused, no BLAS eq.] dot(x, y, n) return sum x[i]*y[i] norm_sq(x, n) return sum x[i]^2 [no BLAS sqrt cost] norm(x, n) return sqrt(norm_sq) l1_norm(x, n) return sum |x[i]| linf_norm(x, n) return max |x[i]| sum(x, n) return sum x[i] [no BLAS eq.]
— Level 2 (matrix-vector, row-major) — matvec(y, A, x, m, n) y[i] = sum_j A[i*n+j] * x[j] ger(A, x, y, alpha, m, n) A[i*n+j] += alpha * x[i] * y[j] trsv_lower(x, L, b, n) solve Lx = b (L lower triangular) trsv_upper(x, U, b, n) solve Ux = b (U upper triangular)
Include this header when writing a fused custom kernel that needs to compose multiple operations into a single memory pass. For ordinary user code prefer num::kernel:: (Tier 2).
Definition in file raw.hpp.