|
numerics 0.1.0
|
Namespaces | |
| namespace | backends |
| namespace | cuda |
| namespace | detail |
| namespace | kernel |
| namespace | markov |
| namespace | mpi |
| namespace | ode |
| namespace | pde |
| namespace | plt |
| namespace | spectral |
| namespace | viz |
Classes | |
| struct | BackwardEuler |
| class | BandedMatrix |
| Banded matrix with efficient storage for direct solvers. More... | |
| struct | BandedSolverResult |
| Result from banded solver. More... | |
| class | BasicVector |
| Dense vector with optional GPU storage, templated over scalar type T. More... | |
| class | CellList2D |
| class | CellList3D |
| struct | ClusterResult |
| struct | ComplexTriDiag |
| Constant-coefficient complex tridiagonal solver (precomputed LU). More... | |
| struct | CrankNicolsonADI |
| struct | EigenResult |
| Full eigendecomposition result: A = V * diag(values) * V^T. More... | |
| struct | Euler |
| class | EulerSteps |
| class | FieldSolver |
| struct | GivensRotation |
| Givens plane rotation: G = [c, s; -s, c]. More... | |
| class | Gnuplot |
| struct | Grid2D |
| class | Grid3D |
| struct | Histogram |
| struct | IntRange |
| Lightweight read-only range over a contiguous int array (C++17-safe span). More... | |
| struct | LanczosResult |
| Result of a Lanczos eigensolver. More... | |
| struct | LUResult |
| Result of an LU factorization with partial pivoting (PA = LU) More... | |
| class | MagneticSolver |
| class | Matrix |
| Dense row-major matrix with optional GPU storage. More... | |
| struct | MCMCProblem |
| struct | Metropolis |
| struct | ODEParams |
| Parameters for all ODE integrators. Set only the fields you need. More... | |
| struct | ODEProblem |
| Explicit ODE: du/dt = f(t, u) More... | |
| struct | ODEResult |
| struct | PBCLattice2D |
| 4-neighbor periodic-boundary index arrays for an NxN lattice. More... | |
| struct | PowerResult |
| Result of a single-eigenvalue iteration. More... | |
| struct | QRResult |
| Result of a QR factorization: A = Q * R. More... | |
| struct | RK4 |
| struct | RK45 |
| class | RK45Steps |
| class | RK4_2ndSteps |
| class | RK4Steps |
| struct | Rng |
| Seeded pseudo-random number generator (Mersenne Twister). Pass a pointer to rng_* functions to draw samples. More... | |
| struct | RootResult |
| struct | RunningStats |
| class | ScalarField2D |
| class | ScalarField3D |
| struct | Series |
| Ordered (x, y) series. More... | |
| struct | SmallMatrix |
| Constexpr fixed-size row-major matrix (stack-allocated). More... | |
| struct | SmallVec |
| Constexpr fixed-size dense vector (stack-allocated). More... | |
| struct | SolverResult |
| class | SparseMatrix |
| Sparse matrix in Compressed Sparse Row (CSR) format. More... | |
| struct | SPHKernel |
| Dimension-generic SPH smoothing kernels. Dim = 2 or 3. More... | |
| struct | Step |
| struct | StepEnd |
| struct | SVDResult |
| Result of a Singular Value Decomposition: A = U * diag(S) * V^T. More... | |
| struct | SymplecticResult |
| struct | SymplecticStep |
| struct | Vec2ConstView |
| Read-only variant of Vec2View for const Vectors. More... | |
| struct | Vec2View |
| Non-owning view of a flat Vector as an array of 2D points. More... | |
| struct | VectorField3D |
| class | VerletList2D |
| class | VerletSteps |
| class | Yoshida4Steps |
Concepts | |
| concept | VecField |
| concept | IsODEProblem |
| concept | IsExplicitODEAlg |
| concept | IsImplicitODEAlg |
| concept | IsMCMCAlg |
Typedefs | |
| using | ScalarFn = std::function< real(real)> |
| Real-valued scalar function f(x) | |
| using | VectorFn = std::function< void(real, real *, real *)> |
| Real-valued multivariate function f(x) where x is a scalar parameter and the function returns a vector of residuals – used in nonlinear systems. | |
| using | real = double |
| using | idx = std::size_t |
| using | cplx = std::complex< real > |
| using | Vector = BasicVector< real > |
| Real-valued dense vector with full backend dispatch (CPU + GPU) | |
| using | CVector = BasicVector< cplx > |
| Complex-valued dense vector (sequential; no GPU) | |
| using | MatVecFn = std::function< void(const Vector &, Vector &)> |
| Callable type for matrix-free matvec: computes y = A*x. | |
| using | LinearSolver = std::function< SolverResult(const Vector &rhs, Vector &x)> |
| using | ODERhsFn = std::function< void(real t, const Vector &y, Vector &dydt)> |
| using | AccelFn = std::function< void(const Vector &q, Vector &acc)> |
| using | ObserverFn = std::function< void(real t, const Vector &y)> |
| using | SympObserverFn = std::function< void(real t, const Vector &q, const Vector &v)> |
| using | Point = std::pair< double, double > |
| (x, y) data point. | |
Enumerations | |
| enum class | Backend { seq , blocked , simd , blas , omp , gpu , lapack } |
| Selects which backend handles a linalg operation. More... | |
Functions | |
| real | trapz (ScalarFn f, real a, real b, idx n=100, Backend backend=Backend::seq) |
| Trapezoidal rule with n panels. | |
| real | simpson (ScalarFn f, real a, real b, idx n=100, Backend backend=Backend::seq) |
| Simpson's 1/3 rule with n panels (n must be even) | |
| real | gauss_legendre (ScalarFn f, real a, real b, idx p=5) |
| Gauss-Legendre quadrature (exact for polynomials up to degree 2p-1) | |
| real | adaptive_simpson (ScalarFn f, real a, real b, real tol=1e-8, idx max_depth=50) |
| Adaptive Simpson quadrature. | |
| real | romberg (ScalarFn f, real a, real b, real tol=1e-10, idx max_levels=12) |
| Romberg integration (Richardson extrapolation on trapezoidal rule) | |
| RootResult | bisection (ScalarFn f, real a, real b, real tol=1e-10, idx max_iter=1000) |
| Bisection method. | |
| RootResult | newton (ScalarFn f, ScalarFn df, real x0, real tol=1e-10, idx max_iter=1000) |
| Newton-Raphson method. | |
| RootResult | secant (ScalarFn f, real x0, real x1, real tol=1e-10, idx max_iter=1000) |
| Secant method (Newton without derivative) | |
| RootResult | brent (ScalarFn f, real a, real b, real tol=1e-10, idx max_iter=1000) |
| Brent's method (bisection + secant + inverse quadratic interpolation) | |
| void | matvec (const Matrix &A, const Vector &x, Vector &y, Backend b=default_backend) |
| y = A * x | |
| void | matmul (const Matrix &A, const Matrix &B, Matrix &C, Backend b=default_backend) |
| C = A * B. | |
| void | matadd (real alpha, const Matrix &A, real beta, const Matrix &B, Matrix &C, Backend b=default_backend) |
| C = alpha*A + beta*B. | |
| void | matmul_blocked (const Matrix &A, const Matrix &B, Matrix &C, idx block_size=64) |
| C = A * B (cache-blocked) | |
| void | matmul_register_blocked (const Matrix &A, const Matrix &B, Matrix &C, idx block_size=64, idx reg_size=4) |
| C = A * B (register-blocked) | |
| void | matmul_simd (const Matrix &A, const Matrix &B, Matrix &C, idx block_size=64) |
| C = A * B (SIMD-accelerated) | |
| void | matvec_simd (const Matrix &A, const Vector &x, Vector &y) |
| y = A * x (SIMD-accelerated) | |
| template<idx N> | |
| constexpr SmallVec< N > | operator+ (SmallVec< N > a, const SmallVec< N > &b) noexcept |
| template<idx N> | |
| constexpr SmallVec< N > | operator* (real s, SmallVec< N > v) noexcept |
| template<class T > | |
| constexpr idx | to_idx (T x) noexcept |
| Cast any integer to idx without a verbose static_cast. | |
| template<int N, typename T > | |
| constexpr T | ipow (T x) noexcept |
| Compute x^N at compile time via repeated squaring. | |
| real | bessel_j (real nu, real x) |
| J_nu(x) – Bessel function of the first kind. | |
| real | bessel_y (real nu, real x) |
| Y_nu(x) – Bessel function of the second kind (Neumann function) | |
| real | bessel_i (real nu, real x) |
| I_nu(x) – modified Bessel function of the first kind. | |
| real | bessel_k (real nu, real x) |
| K_nu(x) – modified Bessel function of the second kind. | |
| real | sph_bessel_j (unsigned int n, real x) |
| j_n(x) – spherical Bessel function of the first kind | |
| real | sph_bessel_y (unsigned int n, real x) |
| y_n(x) – spherical Neumann function (spherical Bessel of the second kind) | |
| real | legendre (unsigned int n, real x) |
| P_n(x) – Legendre polynomial of degree n. | |
| real | assoc_legendre (unsigned int n, unsigned int m, real x) |
| P_n^m(x) – associated Legendre polynomial. | |
| real | sph_legendre (unsigned int l, unsigned int m, real theta) |
| Y_l^m(theta) – spherical harmonic (real part, theta in radians) | |
| real | hermite (unsigned int n, real x) |
| H_n(x) – (physicists') Hermite polynomial. | |
| real | laguerre (unsigned int n, real x) |
| L_n(x) – Laguerre polynomial. | |
| real | assoc_laguerre (unsigned int n, unsigned int m, real x) |
| L_n^m(x) – associated Laguerre polynomial. | |
| real | ellint_K (real k) |
| K(k) – complete elliptic integral of the first kind. | |
| real | ellint_E (real k) |
| E(k) – complete elliptic integral of the second kind. | |
| real | ellint_Pi (real n, real k) |
| Pi(n, k) – complete elliptic integral of the third kind. | |
| real | ellint_F (real k, real phi) |
| F(k, phi) – incomplete elliptic integral of the first kind. | |
| real | ellint_Ei (real k, real phi) |
| E(k, phi) – incomplete elliptic integral of the second kind. | |
| real | ellint_Pi_inc (real n, real k, real phi) |
| Pi(n, k, phi) – incomplete elliptic integral of the third kind. | |
| real | expint (real x) |
| Ei(x) – exponential integral. | |
| real | zeta (real x) |
| zeta(x) – Riemann zeta function | |
| real | beta (real a, real b) |
| B(a, b) – beta function. | |
| std::vector< real > | linspace (real start, real stop, idx n) |
| Evenly spaced values from start to stop, inclusive. MATLAB/NumPy linspace. | |
| std::vector< int > | int_range (int start, int n) |
| Integer sequence [start, start+1, ..., start+n-1]. Wraps std::iota. | |
| real | rng_uniform (Rng *r, real lo, real hi) |
| Uniform real in [lo, hi). | |
| real | rng_normal (Rng *r, real mean, real stddev) |
| Normal (Gaussian) sample with given mean and standard deviation. | |
| int | rng_int (Rng *r, int lo, int hi) |
| Uniform integer in [lo, hi] (inclusive on both ends). | |
| real | gaussian2d (real x, real y, real cx, real cy, real sigma) |
| 2D isotropic Gaussian centred at \((c_x, c_y)\) with width \(\sigma\): \(\exp\!\bigl(-\tfrac{(x-c_x)^2+(y-c_y)^2}{2\sigma^2}\bigr)\) | |
| void | scale (Vector &v, real alpha, Backend b=default_backend) |
| v *= alpha | |
| void | add (const Vector &x, const Vector &y, Vector &z, Backend b=default_backend) |
| z = x + y | |
| void | axpy (real alpha, const Vector &x, Vector &y, Backend b=default_backend) |
| y += alpha * x | |
| real | dot (const Vector &x, const Vector &y, Backend b=default_backend) |
| dot product | |
| real | norm (const Vector &x, Backend b=default_backend) |
| Euclidean norm. | |
| void | scale (CVector &v, cplx alpha) |
| v *= alpha | |
| void | axpy (cplx alpha, const CVector &x, CVector &y) |
| y += alpha * x | |
| cplx | dot (const CVector &x, const CVector &y) |
| Conjugate inner product <x, y> = Sigma conj(x_i) * y_i. | |
| real | norm (const CVector &x) |
| Euclidean norm sqrt(Sigma |v_i|^2) | |
| BandedSolverResult | banded_lu (BandedMatrix &A, idx *ipiv) |
| LU factorization of banded matrix with partial pivoting. | |
| void | banded_lu_solve (const BandedMatrix &A, const idx *ipiv, Vector &b) |
| Solve banded system using precomputed LU factorization. | |
| void | banded_lu_solve_multi (const BandedMatrix &A, const idx *ipiv, real *B, idx nrhs) |
| Solve multiple right-hand sides using LU factorization. | |
| BandedSolverResult | banded_solve (const BandedMatrix &A, const Vector &b, Vector &x) |
| Solve banded system Ax = b (convenience function) | |
| void | banded_matvec (const BandedMatrix &A, const Vector &x, Vector &y, Backend backend=default_backend) |
| Banded matrix-vector product y = A*x. | |
| void | banded_gemv (real alpha, const BandedMatrix &A, const Vector &x, real beta, Vector &y, Backend backend=default_backend) |
| Banded matrix-vector product y = alpha*A*x + beta*y. | |
| real | banded_rcond (const BandedMatrix &A, const idx *ipiv, real anorm) |
| Estimate reciprocal condition number of banded matrix. | |
| real | banded_norm1 (const BandedMatrix &A) |
| Compute 1-norm of banded matrix. | |
| EigenResult | eig_sym (const Matrix &A, real tol=1e-12, idx max_sweeps=100, Backend backend=lapack_backend) |
| Full eigendecomposition of a real symmetric matrix. | |
| LanczosResult | lanczos (MatVecFn matvec, idx n, idx k, real tol=1e-10, idx max_steps=0, Backend backend=Backend::seq) |
| Lanczos eigensolver for large sparse symmetric matrices. | |
| PowerResult | power_iteration (const Matrix &A, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Power iteration – finds the eigenvalue largest in absolute value. | |
| PowerResult | inverse_iteration (const Matrix &A, real sigma, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Inverse iteration – finds the eigenvalue closest to a shift sigma. | |
| PowerResult | rayleigh_iteration (const Matrix &A, const Vector &x0, real tol=1e-10, idx max_iter=50, Backend backend=default_backend) |
| Rayleigh quotient iteration – cubically convergent. | |
| Vector | expv (real t, const MatVecFn &matvec, idx n, const Vector &v, int m_max=30, real tol=1e-8) |
| Compute exp(t*A)*v via Krylov-Pade approximation (matrix-free) | |
| Vector | expv (real t, const SparseMatrix &A, const Vector &v, int m_max=30, real tol=1e-8) |
| Compute exp(t*A)*v via Krylov-Pade approximation (sparse matrix) | |
| LUResult | lu (const Matrix &A, Backend backend=lapack_backend) |
| LU factorization of a square matrix A with partial pivoting. | |
| void | lu_solve (const LUResult &f, const Vector &b, Vector &x) |
| Solve A*x = b using a precomputed LU factorization. | |
| void | lu_solve (const LUResult &f, const Matrix &B, Matrix &X) |
| Solve A*X = B for multiple right-hand sides. | |
| real | lu_det (const LUResult &f) |
| Determinant of A from its LU factorization. | |
| Matrix | lu_inv (const LUResult &f) |
| Inverse of A from its LU factorization. | |
| QRResult | qr (const Matrix &A, Backend backend=lapack_backend) |
| QR factorization of an mxn matrix A (m >= n) via Householder reflections. | |
| void | qr_solve (const QRResult &f, const Vector &b, Vector &x) |
| Solve the least-squares problem min ||A*x - b||_2. | |
| void | thomas (const Vector &a, const Vector &b, const Vector &c, const Vector &d, Vector &x, Backend backend=lapack_backend) |
| Thomas algorithm (LU for tridiagonal systems), O(n). | |
| SolverResult | cg (const Matrix &A, const Vector &b, Vector &x, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Conjugate gradient solver for Ax = b. | |
| SolverResult | cg_matfree (MatVecFn matvec, const Vector &b, Vector &x, real tol=1e-6, idx max_iter=1000) |
| Matrix-free conjugate gradient for Ax = b where A is SPD. | |
| SolverResult | gauss_seidel (const Matrix &A, const Vector &b, Vector &x, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Gauss-Seidel iterative solver for Ax = b. | |
| SolverResult | jacobi (const Matrix &A, const Vector &b, Vector &x, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Jacobi iterative solver for Ax = b. | |
| SolverResult | gmres (MatVecFn matvec, idx n, const Vector &b, Vector &x, real tol=1e-6, idx max_iter=1000, idx restart=30) |
| Restarted GMRES(restart) – matrix-free interface. | |
| SolverResult | gmres (const SparseMatrix &A, const Vector &b, Vector &x, real tol=1e-6, idx max_iter=1000, idx restart=30) |
| Restarted GMRES with a sparse (CSR) matrix. | |
| SolverResult | gmres (const Matrix &A, const Vector &b, Vector &x, real tol=1e-6, idx max_iter=1000, idx restart=30, Backend backend=default_backend) |
| Restarted GMRES with a dense matrix. | |
| void | sparse_matvec (const SparseMatrix &A, const Vector &x, Vector &y) |
| y = A * x | |
| SVDResult | svd (const Matrix &A, Backend backend=lapack_backend, real tol=1e-12, idx max_sweeps=100) |
| Full SVD of an mxn matrix. | |
| SVDResult | svd_truncated (const Matrix &A, idx k, Backend backend=default_backend, idx oversampling=10, Rng *rng=nullptr) |
| Randomized truncated SVD – top-k singular triplets. | |
| EulerSteps | euler (ODERhsFn f, Vector y0, ODEParams p={}) |
| RK4Steps | rk4 (ODERhsFn f, Vector y0, ODEParams p={}) |
| RK45Steps | rk45 (ODERhsFn f, Vector y0, ODEParams p={}) |
| VerletSteps | verlet (AccelFn accel, Vector q0, Vector v0, ODEParams p={}) |
| Yoshida4Steps | yoshida4 (AccelFn accel, Vector q0, Vector v0, ODEParams p={}) |
| RK4_2ndSteps | rk4_2nd (AccelFn accel, Vector q0, Vector v0, ODEParams p={}) |
| ODEResult | ode_euler (ODERhsFn f, Vector y0, ODEParams p={}, ObserverFn obs=nullptr) |
| Forward Euler, 1st-order, fixed step. | |
| ODEResult | ode_rk4 (ODERhsFn f, Vector y0, ODEParams p={}, ObserverFn obs=nullptr) |
| Classic 4th-order Runge-Kutta, fixed step. | |
| ODEResult | ode_rk45 (ODERhsFn f, Vector y0, ODEParams p={}, ObserverFn obs=nullptr) |
| Adaptive Dormand-Prince RK45 with FSAL and PI step-size control. | |
| SymplecticResult | ode_verlet (AccelFn accel, Vector q0, Vector v0, ODEParams p={}, SympObserverFn obs=nullptr) |
| Velocity Verlet, 2nd-order symplectic, 1 force evaluation per step. | |
| SymplecticResult | ode_yoshida4 (AccelFn accel, Vector q0, Vector v0, ODEParams p={}, SympObserverFn obs=nullptr) |
| Yoshida 4th-order symplectic, 3 force evaluations per step. | |
| SymplecticResult | ode_rk4_2nd (AccelFn accel, Vector q0, Vector v0, ODEParams p={}, SympObserverFn obs=nullptr) |
| RK4 for second-order systems q'' = accel(q), Nystrom form. | |
| template<typename T > | |
| void | laplacian_stencil_2d (const BasicVector< T > &x, BasicVector< T > &y, int N) |
| template<typename T > | |
| void | laplacian_stencil_2d_periodic (const BasicVector< T > &x, BasicVector< T > &y, int N) |
| template<typename T > | |
| void | laplacian_stencil_2d_4th (const BasicVector< T > &x, BasicVector< T > &y, int N) |
| real | sample_2d_periodic (const Vector &field, idx N, real h, real px, real py, real ox, real oy) |
| template<typename T , typename F > | |
| void | col_fiber_sweep (BasicVector< T > &data, int N, F &&f) |
| template<typename T , typename F > | |
| void | row_fiber_sweep (BasicVector< T > &data, int N, F &&f) |
| template<typename F > | |
| void | fill_grid (Vector &u, int N, double h, F &&f) |
| Series | row_slice (const Vector &u, int N, double h, int row) |
| Series | col_slice (const Vector &u, int N, double h, int col) |
| template<typename F > | |
| void | fill_grid (ScalarField2D &g, F &&f) |
| Series | row_slice (const ScalarField2D &g, int row) |
| Series | col_slice (const ScalarField2D &g, int col) |
| void | laplacian_stencil_2d_periodic (const ScalarField2D &x, ScalarField2D &y) |
| void | laplacian_stencil_2d_4th (const ScalarField2D &x, ScalarField2D &y) |
| real | sample_2d_periodic (const ScalarField2D &g, real px, real py, real ox=0.0, real oy=0.0) |
| void | neg_laplacian_3d (const Vector &x, Vector &y, int nx, int ny, int nz, double inv_dx2) |
| void | gradient_3d (const Grid3D &phi, Grid3D &gx, Grid3D &gy, Grid3D &gz) |
| void | divergence_3d (const Grid3D &fx, const Grid3D &fy, const Grid3D &fz, Grid3D &out) |
| void | curl_3d (const Grid3D &ax, const Grid3D &ay, const Grid3D &az, Grid3D &bx, Grid3D &by, Grid3D &bz) |
| void | apply_siam_style (Gnuplot &gp) |
| Apply SIAM-style theme to a raw Gnuplot pipe. | |
| void | set_loglog (Gnuplot &gp) |
| void | set_logx (Gnuplot &gp) |
| void | save_png (Gnuplot &gp, const std::string &filename, int w=900, int h=600) |
| template<IsODEProblem P> | |
| ODEResult | solve (const P &prob, const RK45 &alg, ObserverFn obs=nullptr) |
| template<IsODEProblem P> | |
| ODEResult | solve (const P &prob, const RK4 &alg, ObserverFn obs=nullptr) |
| template<IsODEProblem P> | |
| ODEResult | solve (const P &prob, const Euler &alg, ObserverFn obs=nullptr) |
| template<VecField F> | |
| void | solve (F &u, const BackwardEuler &alg) |
| template<VecField F, typename Observer > | |
| void | solve (F &u, const BackwardEuler &alg, Observer &&obs) |
| template<IsMCMCAlg A, typename MeasureFn , typename RNG > | |
| double | solve (const MCMCProblem &prob, const A &alg, MeasureFn &&measure, RNG &rng) |
| template<typename InCluster , typename Neighbors > | |
| ClusterResult | connected_components (int n_sites, InCluster &&in_cluster, Neighbors &&neighbors) |
| real | autocorr_time (const real *data, idx n, real c=6.0) |
Variables | |
| constexpr Backend | seq = Backend::seq |
| constexpr Backend | blocked = Backend::blocked |
| constexpr Backend | simd = Backend::simd |
| constexpr Backend | blas = Backend::blas |
| constexpr Backend | omp = Backend::omp |
| constexpr Backend | gpu = Backend::gpu |
| constexpr Backend | lapack = Backend::lapack |
| constexpr bool | has_blas |
| True when a BLAS/cblas library was found at configure time. | |
| constexpr bool | has_lapack |
| True when LAPACKE was found at configure time. | |
| constexpr bool | has_omp |
| True when OpenMP was found at configure time. | |
| constexpr bool | has_simd |
| True when a SIMD ISA was detected (AVX2 on x86-64, NEON on AArch64). | |
| constexpr Backend | default_backend |
| constexpr Backend | best_backend = default_backend |
| constexpr Backend | lapack_backend |
| constexpr real | pi = 3.14159265358979323846 |
| constexpr real | e = 2.71828182845904523536 |
| constexpr real | phi = 1.61803398874989484820 |
| Golden ratio. | |
| constexpr real | sqrt2 = 1.41421356237309504880 |
| constexpr real | sqrt3 = 1.73205080756887729353 |
| constexpr real | ln2 = 0.69314718055994530942 |
| constexpr real | inv_pi = 0.31830988618379067154 |
| 1/pi | |
| constexpr real | two_pi = 6.28318530717958647692 |
| 2pi | |
| constexpr real | half_pi = 1.57079632679489661923 |
| pi/2 | |
| using num::AccelFn = typedef std::function<void(const Vector& q, Vector& acc)> |
| using num::CVector = typedef BasicVector<cplx> |
Complex-valued dense vector (sequential; no GPU)
Definition at line 133 of file vector.hpp.
| using num::LinearSolver = typedef std::function<SolverResult(const Vector &rhs, Vector &x)> |
Callable that solves A*x = rhs in-place. rhs is passed by value so the solver can use it as a work vector without copying.
Definition at line 23 of file linear_solver.hpp.
| using num::MatVecFn = typedef std::function<void(const Vector &, Vector &)> |
| using num::ObserverFn = typedef std::function<void(real t, const Vector& y)> |
| using num::ODERhsFn = typedef std::function<void(real t, const Vector& y, Vector& dydt)> |
| using num::Point = typedef std::pair<double, double> |
| using num::ScalarFn = typedef std::function<real(real)> |
| using num::SympObserverFn = typedef std::function<void(real t, const Vector& q, const Vector& v)> |
| using num::Vector = typedef BasicVector<real> |
Real-valued dense vector with full backend dispatch (CPU + GPU)
Definition at line 130 of file vector.hpp.
| using num::VectorFn = typedef std::function<void(real, real *, real *)> |
|
strong |
Selects which backend handles a linalg operation.
Definition at line 19 of file policy.hpp.
Adaptive Simpson quadrature.
| f | Integrand |
| a | Lower bound |
| b | Upper bound |
| tol | Absolute error tolerance |
| max_depth | Maximum recursion depth |
Definition at line 119 of file quadrature.cpp.
| void num::add | ( | const Vector & | x, |
| const Vector & | y, | ||
| Vector & | z, | ||
| Backend | b = default_backend |
||
| ) |
z = x + y
Definition at line 50 of file vector.cpp.
References num::cuda::add(), num::backends::seq::add(), gpu, num::BasicVector< T >::gpu_data(), and num::BasicVector< T >::size().
|
inline |
Integrated autocorrelation time tau_int from a stored time series.
Uses the automatic windowing criterion (Madras & Sokal 1988): accumulate C(t)/C(0) until W > c*tau_int (c = 6 default). Returns tau_int >= 0.5; returns 0.5 on failure.
| data | Pointer to time series of length n |
| n | Length of the series |
| c | Window parameter (default 6) |
Definition at line 6 of file stats.cpp.
References e.
| void num::axpy | ( | real | alpha, |
| const Vector & | x, | ||
| Vector & | y, | ||
| Backend | b = default_backend |
||
| ) |
y += alpha * x
Definition at line 58 of file vector.cpp.
References num::backends::blas::axpy(), num::backends::gpu::axpy(), num::backends::omp::axpy(), num::backends::seq::axpy(), blas, blocked, gpu, lapack, omp, seq, and simd.
Referenced by cg(), cg_matfree(), num::pde::diffusion_step_2d(), num::pde::diffusion_step_2d_4th_dirichlet(), num::pde::diffusion_step_2d_dirichlet(), expv(), lanczos(), and num::kernel::subspace::mgs_orthogonalize().
| void num::banded_gemv | ( | real | alpha, |
| const BandedMatrix & | A, | ||
| const Vector & | x, | ||
| real | beta, | ||
| Vector & | y, | ||
| Backend | backend = default_backend |
||
| ) |
Banded matrix-vector product y = alpha*A*x + beta*y.
Generalized matrix-vector multiply with scaling factors.
| alpha | Scalar multiplier for A*x |
| A | Banded matrix |
| x | Input vector |
| beta | Scalar multiplier for y |
| y | Input/output vector |
| backend | Execution backend (Backend::gpu uses CUDA path when available) |
Definition at line 314 of file banded.cpp.
References beta(), num::BasicVector< T >::data(), num::BandedMatrix::data(), gpu, num::BandedMatrix::kl(), num::BandedMatrix::ku(), num::BandedMatrix::ldab(), num::BandedMatrix::size(), and num::BasicVector< T >::size().
Referenced by banded_matvec().
| BandedSolverResult num::banded_lu | ( | BandedMatrix & | A, |
| idx * | ipiv | ||
| ) |
LU factorization of banded matrix with partial pivoting.
Computes A = P * L * U where:
The factorization is performed in-place. After factorization:
| A | Banded matrix (modified in place with LU factors) |
| ipiv | Pivot indices (size n, allocated by caller) |
Complexity: O(n * kl * (kl + ku)) operations
Definition at line 135 of file banded.cpp.
References num::BandedMatrix::data(), num::BandedMatrix::kl(), num::BandedMatrix::ku(), num::BandedMatrix::ldab(), and num::BandedMatrix::size().
Referenced by banded_solve().
| void num::banded_lu_solve | ( | const BandedMatrix & | A, |
| const idx * | ipiv, | ||
| Vector & | b | ||
| ) |
Solve banded system using precomputed LU factorization.
Solves A*x = b where A has been factored by banded_lu(). The solution overwrites the right-hand side vector.
| A | LU-factored banded matrix from banded_lu() |
| ipiv | Pivot indices from banded_lu() |
| b | Right-hand side (overwritten with solution) |
Complexity: O(n * (kl + ku)) operations
Definition at line 209 of file banded.cpp.
References num::BasicVector< T >::data(), num::BandedMatrix::data(), num::BandedMatrix::kl(), num::BandedMatrix::ku(), num::BandedMatrix::ldab(), num::BandedMatrix::size(), and num::BasicVector< T >::size().
Referenced by banded_rcond(), and banded_solve().
| void num::banded_lu_solve_multi | ( | const BandedMatrix & | A, |
| const idx * | ipiv, | ||
| real * | B, | ||
| idx | nrhs | ||
| ) |
Solve multiple right-hand sides using LU factorization.
Solves A*X = B where B has nrhs columns stored contiguously.
| A | LU-factored banded matrix |
| ipiv | Pivot indices |
| B | Right-hand sides (n x nrhs, column-major, overwritten with solutions) |
| nrhs | Number of right-hand sides |
Definition at line 249 of file banded.cpp.
References num::BandedMatrix::data(), num::BandedMatrix::kl(), num::BandedMatrix::ku(), num::BandedMatrix::ldab(), and num::BandedMatrix::size().
| void num::banded_matvec | ( | const BandedMatrix & | A, |
| const Vector & | x, | ||
| Vector & | y, | ||
| Backend | backend = default_backend |
||
| ) |
Banded matrix-vector product y = A*x.
Optimized for cache efficiency with loop ordering that accesses the band storage in column-major order.
| A | Banded matrix |
| x | Input vector |
| y | Output vector (overwritten) |
| backend | Execution backend (Backend::gpu uses CUDA path when available) |
Definition at line 307 of file banded.cpp.
References banded_gemv().
| real num::banded_norm1 | ( | const BandedMatrix & | A | ) |
Compute 1-norm of banded matrix.
| A | Banded matrix |
Definition at line 358 of file banded.cpp.
References num::BandedMatrix::data(), num::BandedMatrix::kl(), num::BandedMatrix::ku(), num::BandedMatrix::ldab(), and num::BandedMatrix::size().
| real num::banded_rcond | ( | const BandedMatrix & | A, |
| const idx * | ipiv, | ||
| real | anorm | ||
| ) |
Estimate reciprocal condition number of banded matrix.
Uses 1-norm condition number estimation after LU factorization.
| A | LU-factored banded matrix |
| ipiv | Pivot indices |
| anorm | 1-norm of original matrix (before factorization) |
Definition at line 374 of file banded.cpp.
References banded_lu_solve(), and num::BandedMatrix::size().
| BandedSolverResult num::banded_solve | ( | const BandedMatrix & | A, |
| const Vector & | b, | ||
| Vector & | x | ||
| ) |
Solve banded system Ax = b (convenience function)
Performs LU factorization and solve in one call. For solving multiple systems with the same matrix, use banded_lu() and banded_lu_solve() separately to avoid redundant factorization.
| A | Banded matrix (NOT modified - internal copy made) |
| b | Right-hand side |
| x | Solution vector (output) |
Definition at line 286 of file banded.cpp.
References banded_lu(), banded_lu_solve(), num::BandedMatrix::size(), num::BasicVector< T >::size(), and num::BandedSolverResult::success.
B(a, b) – beta function.
Definition at line 248 of file math.hpp.
Referenced by num::kernel::subspace::arnoldi_step(), banded_gemv(), num::markov::boltzmann_accept(), cg(), cg_matfree(), expv(), gmres(), lanczos(), num::markov::make_boltzmann_table(), num::backends::blas::matadd(), num::backends::omp::matadd(), num::backends::seq::matadd(), matadd(), num::markov::metropolis_sweep(), num::backends::seq::svd(), and num::markov::umbrella_sweep().
Brent's method (bisection + secant + inverse quadratic interpolation)
Guaranteed to converge if f(a) and f(b) have opposite signs. Superlinear convergence near smooth roots. Preferred method when a reliable bracket is available.
| f | Function |
| a,b | Bracket: f(a) and f(b) must have opposite signs |
| tol | Convergence tolerance |
| max_iter | Maximum iterations |
Definition at line 62 of file roots.cpp.
References e.
| SolverResult num::cg | ( | const Matrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Conjugate gradient solver for Ax = b.
| A | Symmetric positive definite matrix |
| b | Right-hand side vector |
| x | Solution vector (initial guess on input, solution on output) |
| tol | Convergence tolerance on residual norm (default 1e-10) |
| max_iter | Maximum iterations (default 1000) |
| backend | Backend for internal matvec/dot/axpy/scale (default: default_backend) |
Definition at line 8 of file cg.cpp.
References axpy(), beta(), num::Matrix::cols(), dot(), e, gpu, num::SolverResult::iterations, matvec(), num::Matrix::rows(), scale(), num::BasicVector< T >::size(), num::BasicVector< T >::to_cpu(), num::cuda::to_device(), and num::BasicVector< T >::to_gpu().
Referenced by num::FieldSolver::solve_var_poisson(), and thomas().
| SolverResult num::cg_matfree | ( | MatVecFn | matvec, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-6, |
||
| idx | max_iter = 1000 |
||
| ) |
Matrix-free conjugate gradient for Ax = b where A is SPD.
| matvec | Callable computing y = A*x |
| b | Right-hand side |
| x | Initial guess (modified in-place -> solution) |
| tol | Convergence tolerance on residual norm |
| max_iter | Maximum CG iterations |
Definition at line 78 of file cg.cpp.
References axpy(), beta(), dot(), e, num::SolverResult::iterations, scale(), seq, and num::BasicVector< T >::size().
Referenced by num::pde::make_cg_solver(), num::FieldSolver::solve_poisson(), and num::FieldSolver::solve_var_poisson().
| void num::col_fiber_sweep | ( | BasicVector< T > & | data, |
| int | N, | ||
| F && | f | ||
| ) |
For each column j in [0,N), extract the x-direction fiber data[0..N-1, j] into a std::vector<T>, call f(fiber), then write back.
f must have signature: void(std::vector<T>&) and modifies in-place.
Use for ADI / Crank-Nicolson sweeps along x.
Definition at line 168 of file stencil.hpp.
Referenced by num::CrankNicolsonADI::sweep().
|
inline |
Definition at line 242 of file stencil.hpp.
References col_slice(), num::ScalarField2D::h(), num::ScalarField2D::N(), and num::ScalarField2D::vec().
Extract column col of an NxN field as a (y, value) Series; node i sits at y = (i+1)*h.
Definition at line 222 of file stencil.hpp.
References num::Series::store().
Referenced by col_slice().
| ClusterResult num::connected_components | ( | int | n_sites, |
| InCluster && | in_cluster, | ||
| Neighbors && | neighbors | ||
| ) |
BFS connected-component labelling with pre-allocated flat queue (no heap per call).
| n_sites | Total number of sites |
| in_cluster | bool(int i) – include site i? |
| neighbors | void(int i, auto&& visit) – call visit(nb) per neighbor of i |
Definition at line 38 of file connected_components.hpp.
References num::ClusterResult::id, num::ClusterResult::largest_id, num::ClusterResult::largest_size, and num::ClusterResult::sizes.
|
inline |
Compute the central-difference curl(A) of a vector field. One-sided differences at domain boundaries. bx, by, bz must already be allocated with the same dimensions as ax, ay, az.
Definition at line 342 of file stencil.hpp.
References num::Grid3D::dx(), num::Grid3D::nx(), num::Grid3D::ny(), and num::Grid3D::nz().
Referenced by num::FieldSolver::curl().
|
inline |
Compute the central-difference divergence of a vector field (fx, fy, fz). One-sided differences at domain boundaries. out must already be allocated with the same dimensions.
Definition at line 320 of file stencil.hpp.
References num::Grid3D::dx(), num::Grid3D::nx(), num::Grid3D::ny(), and num::Grid3D::nz().
Referenced by num::FieldSolver::divergence().
Conjugate inner product <x, y> = Sigma conj(x_i) * y_i.
Definition at line 128 of file vector.cpp.
References num::BasicVector< T >::size().
| real num::dot | ( | const Vector & | x, |
| const Vector & | y, | ||
| Backend | b = default_backend |
||
| ) |
dot product
Definition at line 79 of file vector.cpp.
References blas, blocked, num::backends::blas::dot(), num::backends::gpu::dot(), num::backends::omp::dot(), num::backends::seq::dot(), gpu, lapack, omp, seq, and simd.
Referenced by cg(), cg_matfree(), num::mpi::dot(), inverse_iteration(), lanczos(), num::kernel::subspace::mgs_orthogonalize(), num::mpi::norm(), power_iteration(), and rayleigh_iteration().
| EigenResult num::eig_sym | ( | const Matrix & | A, |
| real | tol = 1e-12, |
||
| idx | max_sweeps = 100, |
||
| Backend | backend = lapack_backend |
||
| ) |
Full eigendecomposition of a real symmetric matrix.
The rotation accumulation loop is parallelised when backend == Backend::omp.
| A | Real symmetric nxn matrix (upper/lower triangle used) |
| tol | Jacobi convergence tolerance (ignored for Backend::lapack) |
| max_sweeps | Jacobi sweep cap (ignored for Backend::lapack) |
| backend | Backend::lapack uses LAPACKE_dsyevd (default when available). Backend::omp parallelises the Jacobi rotation loop. Backend::seq uses our cyclic Jacobi implementation. |
Definition at line 17 of file eig.cpp.
References num::backends::lapack::eig_sym(), num::backends::omp::eig_sym(), num::backends::seq::eig_sym(), lapack, and omp.
Referenced by lanczos().
| EulerSteps num::euler | ( | ODERhsFn | f, |
| Vector | y0, | ||
| ODEParams | p = {} |
||
| ) |
Definition at line 309 of file ode.cpp.
Referenced by ode_euler().
| Vector num::expv | ( | real | t, |
| const MatVecFn & | matvec, | ||
| idx | n, | ||
| const Vector & | v, | ||
| int | m_max = 30, |
||
| real | tol = 1e-8 |
||
| ) |
Compute exp(t*A)*v via Krylov-Pade approximation (matrix-free)
| t | Scalar multiplier on A |
| matvec | Callable y = A*x |
| n | Dimension of the state space (size of v) |
| v | Input vector |
| m_max | Maximum Krylov dimension (default 30) |
| tol | Breakdown tolerance for Arnoldi (default 1e-8) |
Definition at line 106 of file expv.cpp.
References axpy(), beta(), e, matvec(), num::kernel::subspace::mgs_orthogonalize(), norm(), and scale().
Referenced by expv().
| Vector num::expv | ( | real | t, |
| const SparseMatrix & | A, | ||
| const Vector & | v, | ||
| int | m_max = 30, |
||
| real | tol = 1e-8 |
||
| ) |
Compute exp(t*A)*v via Krylov-Pade approximation (sparse matrix)
| t | Scalar multiplier on A |
| A | Sparse matrix in CSR format |
| v | Input vector |
| m_max | Maximum Krylov dimension (default 30) |
| tol | Breakdown tolerance for Arnoldi (default 1e-8) |
Definition at line 173 of file expv.cpp.
References expv(), num::SparseMatrix::n_rows(), and sparse_matvec().
| void num::fill_grid | ( | ScalarField2D & | g, |
| F && | f | ||
| ) |
Definition at line 234 of file stencil.hpp.
References fill_grid(), num::ScalarField2D::h(), num::ScalarField2D::N(), and num::ScalarField2D::vec().
| void num::fill_grid | ( | Vector & | u, |
| int | N, | ||
| double | h, | ||
| F && | f | ||
| ) |
Initialize an NxN field (row-major, node (i,j) at ((i+1)*h, (j+1)*h)) from a callable f(x, y) -> real.
Definition at line 200 of file stencil.hpp.
Referenced by fill_grid().
Gauss-Legendre quadrature (exact for polynomials up to degree 2p-1)
| f | Integrand |
| a | Lower bound |
| b | Upper bound |
| p | Number of quadrature points (1 to 5 supported) |
Definition at line 71 of file quadrature.cpp.
| SolverResult num::gauss_seidel | ( | const Matrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Gauss-Seidel iterative solver for Ax = b.
Updates each component x[i] in-place using the latest values of all other components. Converges for strictly diagonally dominant or symmetric positive definite A.
With Backend::omp the residual computation is parallelised; the update sweep remains sequential to preserve convergence properties.
| A | Square matrix |
| b | Right-hand side vector |
| x | Solution vector (initial guess on input, solution on output) |
| tol | Convergence tolerance on residual norm (default 1e-10) |
| max_iter | Maximum iterations (default 1000) |
| backend | Execution backend (default: default_backend) |
Definition at line 13 of file gauss_seidel.cpp.
References num::Matrix::cols(), e, num::Matrix::rows(), and num::BasicVector< T >::size().
| SolverResult num::gmres | ( | const Matrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-6, |
||
| idx | max_iter = 1000, |
||
| idx | restart = 30, |
||
| Backend | backend = default_backend |
||
| ) |
Restarted GMRES with a dense matrix.
| backend | Backend for the internal matvec at each Arnoldi step |
Definition at line 142 of file krylov.cpp.
References num::Matrix::cols(), gmres(), matvec(), and num::Matrix::rows().
| SolverResult num::gmres | ( | const SparseMatrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-6, |
||
| idx | max_iter = 1000, |
||
| idx | restart = 30 |
||
| ) |
Restarted GMRES with a sparse (CSR) matrix.
Definition at line 128 of file krylov.cpp.
References gmres(), num::SparseMatrix::n_cols(), num::SparseMatrix::n_rows(), and sparse_matvec().
| SolverResult num::gmres | ( | MatVecFn | matvec, |
| idx | n, | ||
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-6, |
||
| idx | max_iter = 1000, |
||
| idx | restart = 30 |
||
| ) |
Restarted GMRES(restart) – matrix-free interface.
Works for any invertible A (symmetric or non-symmetric, indefinite). The Krylov subspace is restarted every restart steps to bound memory.
| matvec | Callable computing y = A*x |
| n | System dimension |
| b | Right-hand side |
| x | Initial guess (modified in-place -> solution) |
| tol | Convergence tolerance on residual norm |
| max_iter | Maximum total matrix-vector products |
| restart | Krylov subspace size before restart (default 30) |
Definition at line 12 of file krylov.cpp.
References num::kernel::subspace::arnoldi_step(), beta(), e, num::kernel::subspace::make_op(), and num::BasicVector< T >::size().
Compute the central-difference gradient of a scalar field./// One-sided differences at domain boundaries. gx, gy, gz must already be allocated with the same dimensions as phi.
Definition at line 302 of file stencil.hpp.
References phi.
Referenced by num::FieldSolver::gradient().
|
inline |
| PowerResult num::inverse_iteration | ( | const Matrix & | A, |
| real | sigma, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Inverse iteration – finds the eigenvalue closest to a shift sigma.
Factorizes (A - sigmaI) once then solves repeatedly.
| A | Square matrix (symmetric recommended) |
| sigma | Shift – should be near the target eigenvalue |
| tol | Tolerance on eigenvalue change between iterations |
| max_iter | Maximum iterations |
| backend | Backend forwarded to matvec and dot |
Definition at line 49 of file power.cpp.
References num::Matrix::cols(), dot(), e, lu(), lu_solve(), matvec(), num::detail::normalise(), and num::Matrix::rows().
|
constexprnoexcept |
Compute x^N at compile time via repeated squaring.
| N | Non-negative integer exponent (must be a compile-time constant). |
| T | Arithmetic type (float, double, int, ...). |
Definition at line 34 of file integer_pow.hpp.
References ipow().
Referenced by ipow().
| SolverResult num::jacobi | ( | const Matrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Jacobi iterative solver for Ax = b.
Updates all components simultaneously using only values from the previous iteration. Converges for strictly diagonally dominant A. Trivially parallelisable with Backend::omp.
| A | Square matrix |
| b | Right-hand side vector |
| x | Solution vector (initial guess on input, solution on output) |
| tol | Convergence tolerance on residual norm (default 1e-10) |
| max_iter | Maximum iterations (default 1000) |
| backend | Execution backend (default: default_backend) |
Definition at line 7 of file jacobi.cpp.
References num::Matrix::cols(), e, num::Matrix::rows(), and num::BasicVector< T >::size().
| LanczosResult num::lanczos | ( | MatVecFn | matvec, |
| idx | n, | ||
| idx | k, | ||
| real | tol = 1e-10, |
||
| idx | max_steps = 0, |
||
| Backend | backend = Backend::seq |
||
| ) |
Lanczos eigensolver for large sparse symmetric matrices.
| matvec | Callable computing w = A*v (matrix-free) |
| n | Dimension of the problem |
| k | Number of eigenvalues requested (k << n) |
| tol | Convergence on ||A*u - lambda*u|| for each Ritz pair |
| max_steps | Maximum Lanczos steps (default: min(3k, n)) |
| backend | Backend for reorthogonalisation inner products (default: seq) |
Definition at line 32 of file lanczos.cpp.
References axpy(), beta(), dot(), e, eig_sym(), matvec(), num::kernel::subspace::mgs_orthogonalize(), num::EigenResult::values, and num::EigenResult::vectors.
| void num::laplacian_stencil_2d | ( | const BasicVector< T > & | x, |
| BasicVector< T > & | y, | ||
| int | N | ||
| ) |
Compute y[i,j] = x[i+1,j] + x[i-1,j] + x[i,j+1] + x[i,j-1] - 4*x[i,j]/// Dirichlet: out-of-bounds neighbors contribute 0. Works for T = real or cplx.
Definition at line 35 of file stencil.hpp.
Referenced by num::pde::diffusion_step_2d_dirichlet().
| void num::laplacian_stencil_2d_4th | ( | const BasicVector< T > & | x, |
| BasicVector< T > & | y, | ||
| int | N | ||
| ) |
4th-order accurate 2D Laplacian: 13-point cross stencil, Dirichlet BCs.
Sums the standard 4th-order 1D stencil independently along each axis:
\[ y_{i,j} = \tfrac{1}{12}\bigl( -x_{i-2,j} + 16x_{i-1,j} - 30x_{i,j} + 16x_{i+1,j} - x_{i+2,j} -x_{i,j-2} + 16x_{i,j-1} + 16x_{i,j+1} - x_{i,j+2} \bigr) \]
Out-of-bounds neighbors are treated as 0 (Dirichlet). Divide by h² to recover the discrete Laplacian: \(\nabla^2 u \approx y/h^2\). Truncation error: \(O(h^4)\).
Definition at line 95 of file stencil.hpp.
Referenced by num::pde::diffusion_step_2d_4th_dirichlet(), and laplacian_stencil_2d_4th().
|
inline |
Definition at line 250 of file stencil.hpp.
References laplacian_stencil_2d_4th(), num::ScalarField2D::N(), and num::ScalarField2D::vec().
| void num::laplacian_stencil_2d_periodic | ( | const BasicVector< T > & | x, |
| BasicVector< T > & | y, | ||
| int | N | ||
| ) |
Same as laplacian_stencil_2d but with periodic (wrap-around) boundaries.
The j boundaries (j=0 and j=N-1) are peeled out of the inner loop so the interior range j=1..N-2 contains no modulo arithmetic and is auto-vectorizable. Row wrapping (i+/-1) uses precomputed indices.
Definition at line 59 of file stencil.hpp.
References num::BasicVector< T >::data().
Referenced by num::pde::diffusion_step_2d(), and laplacian_stencil_2d_periodic().
|
inline |
Definition at line 246 of file stencil.hpp.
References laplacian_stencil_2d_periodic(), num::ScalarField2D::N(), and num::ScalarField2D::vec().
| LUResult num::lu | ( | const Matrix & | A, |
| Backend | backend = lapack_backend |
||
| ) |
LU factorization of a square matrix A with partial pivoting.
Computes P, L, U such that P*A = L*U where: P = permutation matrix encoded in piv L = unit lower triangular (diagonal = 1, stored below diag of LU) U = upper triangular (stored on and above diag of LU)
| backend | Backend::lapack uses LAPACKE_dgetrf (default when available). Backend::seq uses our Doolittle implementation. |
Definition at line 17 of file lu.cpp.
References lapack, num::backends::lapack::lu(), and num::backends::seq::lu().
Referenced by inverse_iteration(), and rayleigh_iteration().
Determinant of A from its LU factorization.
det(A) = det(P)^{-1} * prod(U[i,i]) = (-1)^{swaps} * prod(U[i,i])
Overflow is possible for large n – use log-determinant for stability.
Definition at line 66 of file lu.cpp.
References num::LUResult::LU, num::LUResult::piv, and num::Matrix::rows().
Inverse of A from its LU factorization.
Solves A * X = I column by column. O(n^3) – only use when the full inverse is genuinely needed (prefer lu_solve for specific right-hand sides).
Definition at line 80 of file lu.cpp.
References e, num::LUResult::LU, lu_solve(), and num::Matrix::rows().
Solve A*X = B for multiple right-hand sides.
Each column of B is solved independently using the same factorization. B and X are nxnrhs matrices (column-major access pattern).
Definition at line 51 of file lu.cpp.
References num::Matrix::cols(), lu_solve(), and num::Matrix::rows().
Solve A*x = b using a precomputed LU factorization.
Three steps:
Definition at line 28 of file lu.cpp.
References num::LUResult::LU, num::LUResult::piv, and num::Matrix::rows().
Referenced by inverse_iteration(), lu_inv(), lu_solve(), and rayleigh_iteration().
| void num::matadd | ( | real | alpha, |
| const Matrix & | A, | ||
| real | beta, | ||
| const Matrix & | B, | ||
| Matrix & | C, | ||
| Backend | b = default_backend |
||
| ) |
C = alpha*A + beta*B.
Definition at line 145 of file matrix.cpp.
References beta(), blas, blocked, gpu, lapack, num::backends::blas::matadd(), num::backends::omp::matadd(), num::backends::seq::matadd(), omp, seq, and simd.
| void num::matmul | ( | const Matrix & | A, |
| const Matrix & | B, | ||
| Matrix & | C, | ||
| Backend | b = default_backend |
||
| ) |
C = A * B.
Definition at line 95 of file matrix.cpp.
References blas, blocked, gpu, lapack, num::backends::blas::matmul(), num::backends::gpu::matmul(), num::backends::omp::matmul(), num::backends::seq::matmul(), num::backends::simd::matmul(), num::backends::seq::matmul_blocked(), omp, seq, and simd.
Referenced by svd_truncated().
C = A * B (cache-blocked)
Divides A, B, C into BLOCKxBLOCK tiles so the working set fits in L2 cache.
| block_size | Tile edge length (default 64). |
Definition at line 171 of file matrix.cpp.
References num::backends::seq::matmul_blocked().
| void num::matmul_register_blocked | ( | const Matrix & | A, |
| const Matrix & | B, | ||
| Matrix & | C, | ||
| idx | block_size = 64, |
||
| idx | reg_size = 4 |
||
| ) |
C = A * B (register-blocked)
Extends cache blocking with a REGxREG register tile inside each cache tile.
| block_size | Cache tile edge (default 64). |
| reg_size | Register tile edge (default 4). |
Definition at line 178 of file matrix.cpp.
References num::backends::seq::matmul_register_blocked().
C = A * B (SIMD-accelerated)
Dispatches at compile time: AVX-256 + FMA on x86, NEON on AArch64, falls back to matmul_blocked if neither is available.
Definition at line 186 of file matrix.cpp.
References num::backends::simd::matmul().
| void num::matvec | ( | const Matrix & | A, |
| const Vector & | x, | ||
| Vector & | y, | ||
| Backend | b = default_backend |
||
| ) |
y = A * x
Definition at line 120 of file matrix.cpp.
References blas, blocked, gpu, lapack, num::backends::blas::matvec(), num::backends::gpu::matvec(), num::backends::omp::matvec(), num::backends::simd::matvec(), num::backends::seq::matvec(), omp, seq, and simd.
Referenced by num::kernel::subspace::DenseOp::apply(), cg(), expv(), gmres(), inverse_iteration(), lanczos(), num::pde::make_cg_solver(), power_iteration(), rayleigh_iteration(), num::FieldSolver::solve_poisson(), and num::FieldSolver::solve_var_poisson().
y = A * x (SIMD-accelerated)
Definition at line 190 of file matrix.cpp.
References num::backends::simd::matvec().
|
inline |
Compute the negative Laplacian: y = -Lap(x) = (6x[i,j,k] - sum of 6 nbrs) / dx^2 Dirichlet BC: boundary nodes satisfy y[bdry] = x[bdry] (identity row, so that the system is SPD when used with a b=0 boundary RHS).
x and y are flat vectors in Grid3D layout: idx = k*ny*nx + j*nx + i.
Definition at line 273 of file stencil.hpp.
Referenced by num::FieldSolver::solve_poisson().
Euclidean norm sqrt(Sigma |v_i|^2)
Definition at line 135 of file vector.cpp.
References num::BasicVector< T >::size().
| real num::norm | ( | const Vector & | x, |
| Backend | b = default_backend |
||
| ) |
Euclidean norm.
Definition at line 97 of file vector.cpp.
References blas, blocked, gpu, lapack, num::backends::blas::norm(), num::backends::gpu::norm(), num::backends::seq::norm(), omp, seq, and simd.
Referenced by expv(), num::kernel::subspace::mgs_orthogonalize(), num::kernel::subspace::mgs_orthogonalize(), and num::Histogram::pdf().
| ODEResult num::ode_euler | ( | ODERhsFn | f, |
| Vector | y0, | ||
| ODEParams | p = {}, |
||
| ObserverFn | obs = nullptr |
||
| ) |
| ODEResult num::ode_rk4 | ( | ODERhsFn | f, |
| Vector | y0, | ||
| ODEParams | p = {}, |
||
| ObserverFn | obs = nullptr |
||
| ) |
| ODEResult num::ode_rk45 | ( | ODERhsFn | f, |
| Vector | y0, | ||
| ODEParams | p = {}, |
||
| ObserverFn | obs = nullptr |
||
| ) |
| SymplecticResult num::ode_rk4_2nd | ( | AccelFn | accel, |
| Vector | q0, | ||
| Vector | v0, | ||
| ODEParams | p = {}, |
||
| SympObserverFn | obs = nullptr |
||
| ) |
| SymplecticResult num::ode_verlet | ( | AccelFn | accel, |
| Vector | q0, | ||
| Vector | v0, | ||
| ODEParams | p = {}, |
||
| SympObserverFn | obs = nullptr |
||
| ) |
| SymplecticResult num::ode_yoshida4 | ( | AccelFn | accel, |
| Vector | q0, | ||
| Vector | v0, | ||
| ODEParams | p = {}, |
||
| SympObserverFn | obs = nullptr |
||
| ) |
Yoshida 4th-order symplectic, 3 force evaluations per step.
Definition at line 361 of file ode.cpp.
References yoshida4().
|
constexprnoexcept |
Definition at line 72 of file small_matrix.hpp.
|
constexprnoexcept |
Definition at line 68 of file small_matrix.hpp.
| PowerResult num::power_iteration | ( | const Matrix & | A, |
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Power iteration – finds the eigenvalue largest in absolute value.
| A | Square matrix (need not be symmetric) |
| tol | Tolerance on eigenvalue change between iterations |
| max_iter | Maximum iterations |
| backend | Backend forwarded to matvec and dot |
Definition at line 10 of file power.cpp.
References num::Matrix::cols(), dot(), e, matvec(), num::detail::normalise(), and num::Matrix::rows().
| QRResult num::qr | ( | const Matrix & | A, |
| Backend | backend = lapack_backend |
||
| ) |
QR factorization of an mxn matrix A (m >= n) via Householder reflections.
Each Householder step k:
After min(m-1, n) steps, the matrix is upper triangular. Q is reconstructed by accumulating the reflectors in reverse order.
| backend | Backend::lapack uses LAPACKE_dgeqrf (default when available). Backend::seq uses our Householder implementation. |
Definition at line 15 of file qr.cpp.
References lapack, num::backends::lapack::qr(), and num::backends::seq::qr().
Referenced by svd_truncated().
Solve the least-squares problem min ||A*x - b||_2.
Algorithm:
For square non-singular A this returns the exact solution. For overdetermined A (m > n) this returns the minimum-residual solution.
Definition at line 26 of file qr.cpp.
References num::Matrix::cols(), num::QRResult::Q, num::QRResult::R, and num::Matrix::rows().
| PowerResult num::rayleigh_iteration | ( | const Matrix & | A, |
| const Vector & | x0, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 50, |
||
| Backend | backend = default_backend |
||
| ) |
Rayleigh quotient iteration – cubically convergent.
Updates the shift sigma = v^T*A*v at every step -> fresh LU each iteration.
| A | Symmetric matrix |
| x0 | Starting vector (determines which eigenvalue is found) |
| tol | Tolerance on residual ||A*v - lambda*v|| |
| max_iter | Maximum iterations |
| backend | Backend forwarded to matvec, dot, axpy, norm |
Definition at line 98 of file power.cpp.
References num::Matrix::cols(), dot(), lu(), lu_solve(), matvec(), num::detail::normalise(), num::Matrix::rows(), num::LUResult::singular, and num::BasicVector< T >::size().
Definition at line 315 of file ode.cpp.
Referenced by ode_rk45().
| RK4_2ndSteps num::rk4_2nd | ( | AccelFn | accel, |
| Vector | q0, | ||
| Vector | v0, | ||
| ODEParams | p = {} |
||
| ) |
Definition at line 325 of file ode.cpp.
Referenced by ode_rk4_2nd().
|
inline |
Uniform integer in [lo, hi] (inclusive on both ends).
Definition at line 301 of file math.hpp.
References num::Rng::engine.
Normal (Gaussian) sample with given mean and standard deviation.
Definition at line 296 of file math.hpp.
References num::Rng::engine.
Referenced by svd_truncated().
Romberg integration (Richardson extrapolation on trapezoidal rule)
| f | Integrand |
| a | Lower bound |
| b | Upper bound |
| tol | Convergence tolerance |
| max_levels | Maximum refinement levels |
Definition at line 125 of file quadrature.cpp.
| void num::row_fiber_sweep | ( | BasicVector< T > & | data, |
| int | N, | ||
| F && | f | ||
| ) |
For each row i in [0,N), extract the y-direction fiber data[i, 0..N-1] into a std::vector<T>, call f(fiber), then write back.
Use for ADI / Crank-Nicolson sweeps along y.
Definition at line 184 of file stencil.hpp.
Referenced by num::CrankNicolsonADI::sweep().
|
inline |
Definition at line 238 of file stencil.hpp.
References num::ScalarField2D::h(), num::ScalarField2D::N(), row_slice(), and num::ScalarField2D::vec().
Extract row row of an NxN field as an (x, value) Series; node j sits at x = (j+1)*h.
Definition at line 211 of file stencil.hpp.
References num::Series::store().
Referenced by row_slice().
|
inline |
Definition at line 254 of file stencil.hpp.
References num::ScalarField2D::h(), num::ScalarField2D::N(), sample_2d_periodic(), and num::ScalarField2D::vec().
|
inline |
Bilinear interpolation on a periodic NxN grid with configurable stagger offset.
field[i,j] is defined at physical position ((i + ox/h)*h, (j + oy/h)*h). Returns the interpolated field value at physical point (px, py).
| ox | x-axis origin offset in physical units (0 for unstaggered, h/2 for v-face) |
| oy | y-axis origin offset in physical units (0 for unstaggered, h/2 for u-face) |
MAC grid usage:
\[ \text{interp\_u}(px,py) = \texttt{sample\_2d\_periodic}(u, N, h,\; px, py,\; 0,\; h/2) \]
\[ \text{interp\_v}(px,py) = \texttt{sample\_2d\_periodic}(v, N, h,\; px, py,\; h/2,\; 0) \]
Definition at line 137 of file stencil.hpp.
Referenced by sample_2d_periodic().
|
inline |
| void num::scale | ( | Vector & | v, |
| real | alpha, | ||
| Backend | b = default_backend |
||
| ) |
v *= alpha
Definition at line 29 of file vector.cpp.
References blas, blocked, gpu, lapack, omp, num::backends::blas::scale(), num::backends::gpu::scale(), num::backends::omp::scale(), num::backends::seq::scale(), seq, and simd.
Referenced by num::kernel::subspace::arnoldi_step(), cg(), cg_matfree(), expv(), and num::VectorField3D::scale().
Simpson's 1/3 rule with n panels (n must be even)
| backend | Backend::omp parallelises the panel sum |
Definition at line 26 of file quadrature.cpp.
| double num::solve | ( | const MCMCProblem & | prob, |
| const A & | alg, | ||
| MeasureFn && | measure, | ||
| RNG & | rng | ||
| ) |
Run equilibration + measurement sweeps; return the mean of measure() over measurement sweeps. The rng is passed by the caller so its state persists across successive solve() calls (e.g. a temperature loop).
Definition at line 93 of file solve.hpp.
References num::MCMCProblem::accept_prob, num::markov::metropolis_sweep_prob(), num::MCMCProblem::n_sites, and num::MCMCProblem::propose.
| ODEResult num::solve | ( | const P & | prob, |
| const Euler & | alg, | ||
| ObserverFn | obs = nullptr |
||
| ) |
Definition at line 71 of file solve.hpp.
References ode_euler().
| ODEResult num::solve | ( | const P & | prob, |
| const RK4 & | alg, | ||
| ObserverFn | obs = nullptr |
||
| ) |
| ODEResult num::solve | ( | const P & | prob, |
| const RK45 & | alg, | ||
| ObserverFn | obs = nullptr |
||
| ) |
Definition at line 59 of file solve.hpp.
References num::RK45::atol, num::RK45::h, num::RK45::max_steps, ode_rk45(), num::RK45::rtol, and num::ODEParams::t0.
| void num::solve | ( | F & | u, |
| const BackwardEuler & | alg | ||
| ) |
Definition at line 78 of file solve.hpp.
References num::ode::advance(), and num::BackwardEuler::solver.
| void num::solve | ( | F & | u, |
| const BackwardEuler & | alg, | ||
| Observer && | obs | ||
| ) |
Definition at line 83 of file solve.hpp.
References num::ode::advance(), and num::BackwardEuler::solver.
| void num::sparse_matvec | ( | const SparseMatrix & | A, |
| const Vector & | x, | ||
| Vector & | y | ||
| ) |
y = A * x
Definition at line 124 of file sparse.cpp.
References num::SparseMatrix::col_idx(), num::SparseMatrix::n_cols(), num::SparseMatrix::n_rows(), num::SparseMatrix::row_ptr(), num::BasicVector< T >::size(), and num::SparseMatrix::values().
Referenced by num::kernel::subspace::SparseOp::apply(), expv(), gmres(), and num::pde::make_cg_solver().
| SVDResult num::svd | ( | const Matrix & | A, |
| Backend | backend = lapack_backend, |
||
| real | tol = 1e-12, |
||
| idx | max_sweeps = 100 |
||
| ) |
Full SVD of an mxn matrix.
Returns the economy SVD: r = min(m,n), U is mxr, S has r elements, Vt is rxn.
| A | Input matrix (not modified) |
| backend | Backend::lapack uses LAPACKE_dgesdd (default when available). Backend::omp parallelises Jacobi column-update loops. Backend::seq uses our one-sided Jacobi implementation. |
| tol | Jacobi convergence tolerance (ignored for Backend::lapack) |
| max_sweeps | Jacobi sweep cap (ignored for Backend::lapack) |
Definition at line 31 of file svd.cpp.
References lapack, num::backends::lapack::svd(), and num::backends::seq::svd().
Referenced by svd_truncated().
| SVDResult num::svd_truncated | ( | const Matrix & | A, |
| idx | k, | ||
| Backend | backend = default_backend, |
||
| idx | oversampling = 10, |
||
| Rng * | rng = nullptr |
||
| ) |
Randomized truncated SVD – top-k singular triplets.
Efficient when k << min(m,n). The two dominant costs – Y = A*Omega and B = Q^T*A – are dispatched via the given backend.
| A | Input matrix |
| k | Number of singular values/vectors to compute |
| backend | Backend for internal matmul calls (default: default_backend) |
| oversampling | Extra random vectors for accuracy (default 10) |
| rng | Random number generator (default: seeded from hardware) |
Definition at line 50 of file svd.cpp.
References num::Matrix::cols(), matmul(), num::QRResult::Q, qr(), rng_normal(), num::Matrix::rows(), num::SVDResult::S, svd(), num::SVDResult::U, and num::SVDResult::Vt.
| void num::thomas | ( | const Vector & | a, |
| const Vector & | b, | ||
| const Vector & | c, | ||
| const Vector & | d, | ||
| Vector & | x, | ||
| Backend | backend = lapack_backend |
||
| ) |
Thomas algorithm (LU for tridiagonal systems), O(n).
| a | Sub-diagonal, size n-1 |
| b | Main diagonal, size n |
| c | Super-diagonal, size n-1 |
| d | Right-hand side vector, size n |
| x | Solution vector (output), size n |
| backend | Backend::lapack uses LAPACKE_dgtsv (default when available). Backend::seq uses our 3-sweep O(n) implementation. Backend::gpu uses CUDA batched tridiagonal (batch=1). |
Definition at line 17 of file thomas.cpp.
References cg(), gpu, num::BasicVector< T >::gpu_data(), lapack, num::BasicVector< T >::size(), num::backends::lapack::thomas(), num::backends::seq::thomas(), num::cuda::thomas_batched(), num::BasicVector< T >::to_cpu(), and num::BasicVector< T >::to_gpu().
|
constexprnoexcept |
Trapezoidal rule with n panels.
| backend | Backend::omp parallelises the panel sum |
Definition at line 14 of file quadrature.cpp.
| VerletSteps num::verlet | ( | AccelFn | accel, |
| Vector | q0, | ||
| Vector | v0, | ||
| ODEParams | p = {} |
||
| ) |
Definition at line 319 of file ode.cpp.
Referenced by ode_verlet().
| Yoshida4Steps num::yoshida4 | ( | AccelFn | accel, |
| Vector | q0, | ||
| Vector | v0, | ||
| ODEParams | p = {} |
||
| ) |
Definition at line 322 of file ode.cpp.
Referenced by ode_yoshida4().
zeta(x) – Riemann zeta function
Definition at line 239 of file math.hpp.
Referenced by num::backends::seq::svd().
|
inlineconstexpr |
Best backend for memory-bound vector ops (dot, axpy, scale, norm). Identical to default_backend – both follow blas > omp > simd > blocked.
Definition at line 105 of file policy.hpp.
|
inlineconstexpr |
Definition at line 34 of file policy.hpp.
|
inlineconstexpr |
Definition at line 32 of file policy.hpp.
|
inlineconstexpr |
Default backend for dense vector/matrix ops (matmul, matvec, dot, axpy, etc.). Automatically selected at configure time: blas > omp > simd > blocked.
Definition at line 92 of file policy.hpp.
|
constexpr |
Definition at line 43 of file math.hpp.
Referenced by autocorr_time(), brent(), cg(), cg_matfree(), num::backends::lapack::eig_sym(), num::backends::omp::eig_sym(), num::backends::seq::eig_sym(), expv(), gauss_seidel(), gmres(), inverse_iteration(), jacobi(), lanczos(), num::backends::seq::lu(), lu_inv(), newton(), num::detail::normalise(), num::viz::phase_hsv_color(), power_iteration(), num::backends::seq::qr(), secant(), num::FieldSolver::solve_var_poisson(), num::backends::lapack::svd(), and num::backends::seq::svd().
|
inlineconstexpr |
Definition at line 36 of file policy.hpp.
|
constexpr |
|
inlineconstexpr |
True when a BLAS/cblas library was found at configure time.
Definition at line 41 of file policy.hpp.
|
inlineconstexpr |
True when LAPACKE was found at configure time.
Definition at line 49 of file policy.hpp.
|
inlineconstexpr |
True when OpenMP was found at configure time.
Definition at line 57 of file policy.hpp.
|
inlineconstexpr |
True when a SIMD ISA was detected (AVX2 on x86-64, NEON on AArch64).
Definition at line 65 of file policy.hpp.
|
constexpr |
|
inlineconstexpr |
Definition at line 37 of file policy.hpp.
|
inlineconstexpr |
Best backend for factorizations, SVD, and eigensolvers. Prefers LAPACKE (industry-standard), then omp, then seq.
Definition at line 124 of file policy.hpp.
|
inlineconstexpr |
Definition at line 35 of file policy.hpp.
|
constexpr |
Golden ratio.
Definition at line 44 of file math.hpp.
Referenced by num::MagneticSolver::current_density(), ellint_Ei(), ellint_F(), ellint_Pi_inc(), num::FieldSolver::gradient(), gradient_3d(), num::FieldSolver::solve_poisson(), and num::FieldSolver::solve_var_poisson().
|
constexpr |
Definition at line 42 of file math.hpp.
Referenced by num::pde::poisson2d(), and num::pde::poisson2d_fd().
|
inlineconstexpr |
Definition at line 31 of file policy.hpp.
|
inlineconstexpr |
Definition at line 33 of file policy.hpp.
|
constexpr |
|
constexpr |