|
numerics
|
Namespaces | |
| namespace | backends |
| namespace | cuda |
| namespace | detail |
| namespace | markov |
| namespace | mpi |
| namespace | pde |
| namespace | quantum |
| namespace | spectral |
Classes | |
| class | BandedMatrix |
| Banded matrix with efficient storage for direct solvers. More... | |
| struct | BandedSolverResult |
| Result from banded solver. More... | |
| class | BasicVector |
| Dense vector with optional GPU storage, templated over scalar type T. More... | |
| class | CellList2D |
| class | CellList3D |
| class | Circuit |
| Chainable quantum circuit builder. More... | |
| struct | ClusterResult |
| struct | ComplexTriDiag |
| Constant-coefficient complex tridiagonal solver (precomputed LU). More... | |
| struct | CrankNicolsonADI |
| struct | EigenResult |
| Full eigendecomposition result: A = V * diag(values) * V^T. More... | |
| class | FieldSolver |
| struct | GateView |
| Compact gate description used by visualisation code. More... | |
| struct | GivensRotation |
| Givens plane rotation: G = [c, s; -s, c]. More... | |
| class | Grid3D |
| struct | Histogram |
| struct | IntRange |
| Lightweight read-only range over a contiguous int array (C++17-safe span). More... | |
| struct | LanczosResult |
| Result of a Lanczos eigensolver. More... | |
| struct | LUResult |
| Result of an LU factorization with partial pivoting (PA = LU) More... | |
| class | MagneticSolver |
| class | Matrix |
| Dense row-major matrix with optional GPU storage. More... | |
| struct | ODEResult |
| Result returned by general ODE integrators. More... | |
| struct | PBCLattice2D |
| 4-neighbor periodic-boundary index arrays for an N×N lattice. More... | |
| struct | PowerResult |
| Result of a single-eigenvalue iteration. More... | |
| struct | QRResult |
| Result of a QR factorization: A = Q * R. More... | |
| struct | Result |
| Measurement outcome from Circuit::run() More... | |
| struct | Rng |
| Seeded pseudo-random number generator (Mersenne Twister). Pass a pointer to rng_* functions to draw samples. More... | |
| struct | RootResult |
| struct | RunningStats |
| class | ScalarField3D |
| struct | SmallMatrix |
| Constexpr fixed-size row-major matrix (stack-allocated). More... | |
| struct | SmallVec |
| Constexpr fixed-size dense vector (stack-allocated). More... | |
| struct | SolverResult |
| class | SparseMatrix |
| Sparse matrix in Compressed Sparse Row (CSR) format. More... | |
| struct | SPHKernel |
| Dimension-generic SPH smoothing kernels. Dim = 2 or 3. More... | |
| struct | SVDResult |
| Result of a Singular Value Decomposition: A = U * diag(S) * Vᵀ More... | |
| struct | SymplecticResult |
| Result returned by symplectic integrators. More... | |
| struct | VectorField3D |
| class | VerletList2D |
Typedefs | |
| using | ScalarFn = std::function< real(real)> |
| Real-valued scalar function f(x) | |
| using | VectorFn = std::function< void(real, real *, real *)> |
| Real-valued multivariate function f(x) where x is a scalar parameter and the function returns a vector of residuals – used in nonlinear systems. | |
| using | real = double |
| using | idx = std::size_t |
| using | cplx = std::complex< real > |
| using | Vector = BasicVector< real > |
| Real-valued dense vector with full backend dispatch (CPU + GPU) | |
| using | CVector = BasicVector< cplx > |
| Complex-valued dense vector (sequential; no GPU) | |
| using | MatVecFn = std::function< void(const Vector &, Vector &)> |
| Callable type for matrix-free matvec: computes y = A*x. | |
| using | ODERhsFn = std::function< void(real t, const Vector &y, Vector &dydt)> |
| Right-hand side callable: fills dydt = f(t, y) in-place. | |
| using | StepCallback = std::function< void(real t, const Vector &y)> |
| using | AccelFn = std::function< void(const Vector &q, Vector &acc)> |
| using | SymplecticCallback = std::function< void(real t, const Vector &q, const Vector &v)> |
Enumerations | |
| enum class | Backend { seq , blocked , simd , blas , omp , gpu } |
| Selects which backend handles a linalg operation. More... | |
Functions | |
| real | trapz (ScalarFn f, real a, real b, idx n=100, Backend backend=Backend::seq) |
| Trapezoidal rule with n panels. | |
| real | simpson (ScalarFn f, real a, real b, idx n=100, Backend backend=Backend::seq) |
| Simpson's 1/3 rule with n panels (n must be even) | |
| real | gauss_legendre (ScalarFn f, real a, real b, idx p=5) |
| Gauss-Legendre quadrature (exact for polynomials up to degree 2p-1) | |
| real | adaptive_simpson (ScalarFn f, real a, real b, real tol=1e-8, idx max_depth=50) |
| Adaptive Simpson quadrature. | |
| real | romberg (ScalarFn f, real a, real b, real tol=1e-10, idx max_levels=12) |
| Romberg integration (Richardson extrapolation on trapezoidal rule) | |
| RootResult | bisection (ScalarFn f, real a, real b, real tol=1e-10, idx max_iter=1000) |
| Bisection method. | |
| RootResult | newton (ScalarFn f, ScalarFn df, real x0, real tol=1e-10, idx max_iter=1000) |
| Newton-Raphson method. | |
| RootResult | secant (ScalarFn f, real x0, real x1, real tol=1e-10, idx max_iter=1000) |
| Secant method (Newton without derivative) | |
| RootResult | brent (ScalarFn f, real a, real b, real tol=1e-10, idx max_iter=1000) |
| Brent's method (bisection + secant + inverse quadratic interpolation) | |
| void | matvec (const Matrix &A, const Vector &x, Vector &y, Backend b=default_backend) |
| y = A * x | |
| void | matmul (const Matrix &A, const Matrix &B, Matrix &C, Backend b=default_backend) |
| C = A * B. | |
| void | matadd (real alpha, const Matrix &A, real beta, const Matrix &B, Matrix &C, Backend b=default_backend) |
| C = alpha*A + beta*B. | |
| void | matmul_blocked (const Matrix &A, const Matrix &B, Matrix &C, idx block_size=64) |
| C = A * B (cache-blocked) | |
| void | matmul_register_blocked (const Matrix &A, const Matrix &B, Matrix &C, idx block_size=64, idx reg_size=4) |
| C = A * B (register-blocked) | |
| void | matmul_simd (const Matrix &A, const Matrix &B, Matrix &C, idx block_size=64) |
| C = A * B (SIMD-accelerated) | |
| void | matvec_simd (const Matrix &A, const Vector &x, Vector &y) |
| y = A * x (SIMD-accelerated) | |
| template<idx N> | |
| constexpr SmallVec< N > | operator+ (SmallVec< N > a, const SmallVec< N > &b) noexcept |
| template<idx N> | |
| constexpr SmallVec< N > | operator* (real s, SmallVec< N > v) noexcept |
| template<int N, typename T > | |
| constexpr T | ipow (T x) noexcept |
| Compute x^N at compile time via repeated squaring. | |
| real | bessel_j (real nu, real x) |
| J_nu(x) – Bessel function of the first kind. | |
| real | bessel_y (real nu, real x) |
| Y_nu(x) – Bessel function of the second kind (Neumann function) | |
| real | bessel_i (real nu, real x) |
| I_nu(x) – modified Bessel function of the first kind. | |
| real | bessel_k (real nu, real x) |
| K_nu(x) – modified Bessel function of the second kind. | |
| real | sph_bessel_j (unsigned int n, real x) |
| j_n(x) – spherical Bessel function of the first kind | |
| real | sph_bessel_y (unsigned int n, real x) |
| y_n(x) – spherical Neumann function (spherical Bessel of the second kind) | |
| real | legendre (unsigned int n, real x) |
| P_n(x) – Legendre polynomial of degree n. | |
| real | assoc_legendre (unsigned int n, unsigned int m, real x) |
| P_n^m(x) – associated Legendre polynomial. | |
| real | sph_legendre (unsigned int l, unsigned int m, real theta) |
| Y_l^m(theta) – spherical harmonic (real part, theta in radians) | |
| real | hermite (unsigned int n, real x) |
| H_n(x) – (physicists') Hermite polynomial. | |
| real | laguerre (unsigned int n, real x) |
| L_n(x) – Laguerre polynomial. | |
| real | assoc_laguerre (unsigned int n, unsigned int m, real x) |
| L_n^m(x) – associated Laguerre polynomial. | |
| real | ellint_K (real k) |
| K(k) – complete elliptic integral of the first kind. | |
| real | ellint_E (real k) |
| E(k) – complete elliptic integral of the second kind. | |
| real | ellint_Pi (real n, real k) |
| Pi(n, k) – complete elliptic integral of the third kind. | |
| real | ellint_F (real k, real phi) |
| F(k, phi) – incomplete elliptic integral of the first kind. | |
| real | ellint_Ei (real k, real phi) |
| E(k, phi) – incomplete elliptic integral of the second kind. | |
| real | ellint_Pi_inc (real n, real k, real phi) |
| Pi(n, k, phi) – incomplete elliptic integral of the third kind. | |
| real | expint (real x) |
| Ei(x) – exponential integral. | |
| real | zeta (real x) |
| zeta(x) – Riemann zeta function | |
| real | beta (real a, real b) |
| B(a, b) – beta function. | |
| std::vector< real > | linspace (real start, real stop, idx n) |
| Evenly spaced values from start to stop, inclusive. MATLAB/NumPy linspace. | |
| std::vector< int > | int_range (int start, int n) |
| Integer sequence [start, start+1, ..., start+n-1]. Wraps std::iota. | |
| real | rng_uniform (Rng *r, real lo, real hi) |
| Uniform real in [lo, hi). | |
| real | rng_normal (Rng *r, real mean, real stddev) |
| Normal (Gaussian) sample with given mean and standard deviation. | |
| int | rng_int (Rng *r, int lo, int hi) |
| Uniform integer in [lo, hi] (inclusive on both ends). | |
| void | scale (Vector &v, real alpha, Backend b=default_backend) |
| v *= alpha | |
| void | add (const Vector &x, const Vector &y, Vector &z, Backend b=default_backend) |
| z = x + y | |
| void | axpy (real alpha, const Vector &x, Vector &y, Backend b=default_backend) |
| y += alpha * x | |
| real | dot (const Vector &x, const Vector &y, Backend b=default_backend) |
| dot product | |
| real | norm (const Vector &x, Backend b=default_backend) |
| Euclidean norm. | |
| void | scale (CVector &v, cplx alpha) |
| v *= alpha | |
| void | axpy (cplx alpha, const CVector &x, CVector &y) |
| y += alpha * x | |
| cplx | dot (const CVector &x, const CVector &y) |
| Conjugate inner product <x, y> = Sigma conj(x_i) * y_i. | |
| real | norm (const CVector &x) |
| Euclidean norm sqrt(Sigma |v_i|^2) | |
| BandedSolverResult | banded_lu (BandedMatrix &A, idx *ipiv) |
| LU factorization of banded matrix with partial pivoting. | |
| void | banded_lu_solve (const BandedMatrix &A, const idx *ipiv, Vector &b) |
| Solve banded system using precomputed LU factorization. | |
| void | banded_lu_solve_multi (const BandedMatrix &A, const idx *ipiv, real *B, idx nrhs) |
| Solve multiple right-hand sides using LU factorization. | |
| BandedSolverResult | banded_solve (const BandedMatrix &A, const Vector &b, Vector &x) |
| Solve banded system Ax = b (convenience function) | |
| void | banded_matvec (const BandedMatrix &A, const Vector &x, Vector &y, Backend backend=default_backend) |
| Banded matrix-vector product y = A*x. | |
| void | banded_gemv (real alpha, const BandedMatrix &A, const Vector &x, real beta, Vector &y, Backend backend=default_backend) |
| Banded matrix-vector product y = alpha*A*x + beta*y. | |
| real | banded_rcond (const BandedMatrix &A, const idx *ipiv, real anorm) |
| Estimate reciprocal condition number of banded matrix. | |
| real | banded_norm1 (const BandedMatrix &A) |
| Compute 1-norm of banded matrix. | |
| EigenResult | eig_sym (const Matrix &A, real tol=1e-12, idx max_sweeps=100, Backend backend=Backend::seq) |
| Full eigendecomposition of a real symmetric matrix. | |
| LanczosResult | lanczos (MatVecFn matvec, idx n, idx k, real tol=1e-10, idx max_steps=0, Backend backend=Backend::seq) |
| Lanczos eigensolver for large sparse symmetric matrices. | |
| PowerResult | power_iteration (const Matrix &A, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Power iteration – finds the eigenvalue largest in absolute value. | |
| PowerResult | inverse_iteration (const Matrix &A, real sigma, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Inverse iteration – finds the eigenvalue closest to a shift sigma. | |
| PowerResult | rayleigh_iteration (const Matrix &A, const Vector &x0, real tol=1e-10, idx max_iter=50, Backend backend=default_backend) |
| Rayleigh quotient iteration – cubically convergent. | |
| Vector | expv (real t, const MatVecFn &matvec, idx n, const Vector &v, int m_max=30, real tol=1e-8) |
| Compute exp(t*A)*v via Krylov-Padé approximation (matrix-free) | |
| Vector | expv (real t, const SparseMatrix &A, const Vector &v, int m_max=30, real tol=1e-8) |
| Compute exp(t*A)*v via Krylov-Padé approximation (sparse matrix) | |
| LUResult | lu (const Matrix &A) |
| LU factorization of a square matrix A with partial pivoting. | |
| void | lu_solve (const LUResult &f, const Vector &b, Vector &x) |
| Solve A*x = b using a precomputed LU factorization. | |
| void | lu_solve (const LUResult &f, const Matrix &B, Matrix &X) |
| Solve A*X = B for multiple right-hand sides. | |
| real | lu_det (const LUResult &f) |
| Determinant of A from its LU factorization. | |
| Matrix | lu_inv (const LUResult &f) |
| Inverse of A from its LU factorization. | |
| QRResult | qr (const Matrix &A) |
| QR factorization of an mxn matrix A (m >= n) via Householder reflections. | |
| void | qr_solve (const QRResult &f, const Vector &b, Vector &x) |
| Solve the least-squares problem min ||A*x - b||_2. | |
| void | thomas (const Vector &a, const Vector &b, const Vector &c, const Vector &d, Vector &x, Backend backend=Backend::seq) |
| Thomas algorithm (LU for tridiagonal systems), O(n). | |
| SolverResult | cg (const Matrix &A, const Vector &b, Vector &x, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Conjugate gradient solver for Ax = b. | |
| SolverResult | cg_matfree (MatVecFn matvec, const Vector &b, Vector &x, real tol=1e-6, idx max_iter=1000) |
| Matrix-free conjugate gradient for Ax = b where A is SPD. | |
| SolverResult | gauss_seidel (const Matrix &A, const Vector &b, Vector &x, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Gauss-Seidel iterative solver for Ax = b. | |
| SolverResult | jacobi (const Matrix &A, const Vector &b, Vector &x, real tol=1e-10, idx max_iter=1000, Backend backend=default_backend) |
| Jacobi iterative solver for Ax = b. | |
| SolverResult | gmres (MatVecFn matvec, idx n, const Vector &b, Vector &x, real tol=1e-6, idx max_iter=1000, idx restart=30) |
| Restarted GMRES(restart) – matrix-free interface. | |
| SolverResult | gmres (const SparseMatrix &A, const Vector &b, Vector &x, real tol=1e-6, idx max_iter=1000, idx restart=30) |
| Restarted GMRES with a sparse (CSR) matrix. | |
| SolverResult | gmres (const Matrix &A, const Vector &b, Vector &x, real tol=1e-6, idx max_iter=1000, idx restart=30, Backend backend=default_backend) |
| Restarted GMRES with a dense matrix. | |
| void | sparse_matvec (const SparseMatrix &A, const Vector &x, Vector &y) |
| y = A * x | |
| SVDResult | svd (const Matrix &A, Backend backend=Backend::seq, real tol=1e-12, idx max_sweeps=100) |
| Full SVD of an mxn matrix via one-sided Jacobi. | |
| SVDResult | svd_truncated (const Matrix &A, idx k, Backend backend=default_backend, idx oversampling=10, Rng *rng=nullptr) |
| Randomized truncated SVD – top-k singular triplets. | |
| ODEResult | ode_euler (ODERhsFn f, Vector y0, real t0, real t1, real h, StepCallback on_step=nullptr) |
| ODEResult | ode_rk4 (ODERhsFn f, Vector y0, real t0, real t1, real h, StepCallback on_step=nullptr) |
| Classic 4th-order Runge-Kutta (fixed step). | |
| ODEResult | ode_rk45 (ODERhsFn f, Vector y0, real t0, real t1, real rtol=1e-6, real atol=1e-9, real h0=1e-3, idx max_steps=1000000, StepCallback on_step=nullptr) |
| SymplecticResult | ode_verlet (AccelFn accel, Vector q0, Vector v0, real t0, real t1, real h, SymplecticCallback on_step=nullptr) |
| SymplecticResult | ode_yoshida4 (AccelFn accel, Vector q0, Vector v0, real t0, real t1, real h, SymplecticCallback on_step=nullptr) |
| template<typename T > | |
| void | laplacian_stencil_2d (const BasicVector< T > &x, BasicVector< T > &y, int N) |
| template<typename T > | |
| void | laplacian_stencil_2d_periodic (const BasicVector< T > &x, BasicVector< T > &y, int N) |
| template<typename T , typename F > | |
| void | col_fiber_sweep (BasicVector< T > &data, int N, F &&f) |
| template<typename T , typename F > | |
| void | row_fiber_sweep (BasicVector< T > &data, int N, F &&f) |
| void | neg_laplacian_3d (const Vector &x, Vector &y, int nx, int ny, int nz, double inv_dx2) |
| void | gradient_3d (const Grid3D &phi, Grid3D &gx, Grid3D &gy, Grid3D &gz) |
| void | divergence_3d (const Grid3D &fx, const Grid3D &fy, const Grid3D &fz, Grid3D &out) |
| void | curl_3d (const Grid3D &ax, const Grid3D &ay, const Grid3D &az, Grid3D &bx, Grid3D &by, Grid3D &bz) |
| template<typename InCluster , typename Neighbors > | |
| ClusterResult | connected_components (int n_sites, InCluster &&in_cluster, Neighbors &&neighbors) |
| real | autocorr_time (const real *data, idx n, real c=6.0) |
| EigenResult | eig_sym (const Matrix &A_in, real tol, idx max_sweeps) |
| Circuit | bell_pair (int q0=0, int q1=1) |
| Bell state |Phi+> = (|00> + |11>)/sqrt2 on qubits q0, q1. | |
| Circuit | ghz_state (int n_qubits) |
| n-qubit GHZ state (|00...0> + |11...1>)/sqrt2 | |
| Circuit | qft_circuit (int n_qubits) |
| n-qubit Quantum Fourier Transform circuit Applied to |0...0> unless you prepend state-preparation gates. | |
| void | print_state (const quantum::Statevector &sv, int n_qubits, real threshold=1e-6) |
| Pretty-print a statevector, hiding amplitudes below threshold. | |
Variables | |
| constexpr Backend | seq = Backend::seq |
| constexpr Backend | blocked = Backend::blocked |
| constexpr Backend | simd = Backend::simd |
| constexpr Backend | blas = Backend::blas |
| constexpr Backend | omp = Backend::omp |
| constexpr Backend | gpu = Backend::gpu |
| constexpr bool | has_blas |
| True when a BLAS/cblas library was found at configure time. | |
| constexpr bool | has_omp |
| True when OpenMP was found at configure time. | |
| constexpr Backend | default_backend |
| constexpr Backend | best_backend |
| Best backend for memory-bound vector ops: blas > omp > blocked. | |
| constexpr real | pi = 3.14159265358979323846 |
| constexpr real | e = 2.71828182845904523536 |
| constexpr real | phi = 1.61803398874989484820 |
| Golden ratio. | |
| constexpr real | sqrt2 = 1.41421356237309504880 |
| constexpr real | sqrt3 = 1.73205080756887729353 |
| constexpr real | ln2 = 0.69314718055994530942 |
| constexpr real | inv_pi = 0.31830988618379067154 |
| 1/pi | |
| constexpr real | two_pi = 6.28318530717958647692 |
| 2pi | |
| constexpr real | half_pi = 1.57079632679489661923 |
| pi/2 | |
Complex-valued dense vector (sequential; no GPU)
Definition at line 125 of file vector.hpp.
| using num::ScalarFn = typedef std::function<real(real)> |
Real-valued dense vector with full backend dispatch (CPU + GPU)
Definition at line 122 of file vector.hpp.
|
strong |
Selects which backend handles a linalg operation.
Definition at line 19 of file policy.hpp.
Adaptive Simpson quadrature.
| f | Integrand |
| a | Lower bound |
| b | Upper bound |
| tol | Absolute error tolerance |
| max_depth | Maximum recursion depth |
Definition at line 92 of file quadrature.cpp.
References ipow().
z = x + y
Definition at line 38 of file vector.cpp.
References num::cuda::add(), num::backends::seq::add(), gpu, num::BasicVector< T >::gpu_data(), ipow(), and num::BasicVector< T >::size().
Integrated autocorrelation time tau_int from a stored time series.
Uses the automatic windowing criterion (Madras & Sokal 1988): accumulate C(t)/C(0) until W > c*tau_int (c = 6 default). Returns tau_int >= 0.5; returns 0.5 on failure.
| data | Pointer to time series of length n |
| n | Length of the series |
| c | Window parameter (default 6) |
y += alpha * x
Definition at line 87 of file vector.cpp.
References ipow(), and num::BasicVector< T >::size().
y += alpha * x
Definition at line 46 of file vector.cpp.
References num::backends::blas::axpy(), num::backends::gpu::axpy(), num::backends::omp::axpy(), num::backends::seq::axpy(), blas, blocked, gpu, ipow(), omp, seq, and simd.
Referenced by cg(), cg_matfree(), num::pde::diffusion_step_2d(), num::pde::diffusion_step_2d_dirichlet(), and expv().
| void num::banded_gemv | ( | real | alpha, |
| const BandedMatrix & | A, | ||
| const Vector & | x, | ||
| real | beta, | ||
| Vector & | y, | ||
| Backend | backend = default_backend |
||
| ) |
Banded matrix-vector product y = alpha*A*x + beta*y.
Generalized matrix-vector multiply with scaling factors.
| alpha | Scalar multiplier for A*x |
| A | Banded matrix |
| x | Input vector |
| beta | Scalar multiplier for y |
| y | Input/output vector |
| backend | Execution backend (Backend::gpu uses CUDA path when available) |
Definition at line 268 of file banded.cpp.
References beta(), num::BasicVector< T >::data(), gpu, ipow(), and num::BasicVector< T >::size().
Referenced by banded_matvec().
| BandedSolverResult num::banded_lu | ( | BandedMatrix & | A, |
| idx * | ipiv | ||
| ) |
LU factorization of banded matrix with partial pivoting.
Computes A = P * L * U where:
The factorization is performed in-place. After factorization:
| A | Banded matrix (modified in place with LU factors) |
| ipiv | Pivot indices (size n, allocated by caller) |
Complexity: O(n * kl * (kl + ku)) operations
Definition at line 111 of file banded.cpp.
References ipow().
Referenced by banded_solve().
Solve banded system using precomputed LU factorization.
Solves A*x = b where A has been factored by banded_lu(). The solution overwrites the right-hand side vector.
| A | LU-factored banded matrix from banded_lu() |
| ipiv | Pivot indices from banded_lu() |
| b | Right-hand side (overwritten with solution) |
Complexity: O(n * (kl + ku)) operations
Definition at line 175 of file banded.cpp.
References ipow().
Referenced by banded_rcond(), and banded_solve().
Solve multiple right-hand sides using LU factorization.
Solves A*X = B where B has nrhs columns stored contiguously.
| A | LU-factored banded matrix |
| ipiv | Pivot indices |
| B | Right-hand sides (n x nrhs, column-major, overwritten with solutions) |
| nrhs | Number of right-hand sides |
Definition at line 213 of file banded.cpp.
References ipow().
| void num::banded_matvec | ( | const BandedMatrix & | A, |
| const Vector & | x, | ||
| Vector & | y, | ||
| Backend | backend = default_backend |
||
| ) |
Banded matrix-vector product y = A*x.
Optimized for cache efficiency with loop ordering that accesses the band storage in column-major order.
| A | Banded matrix |
| x | Input vector |
| y | Output vector (overwritten) |
| backend | Execution backend (Backend::gpu uses CUDA path when available) |
Definition at line 264 of file banded.cpp.
References banded_gemv(), and ipow().
| real num::banded_norm1 | ( | const BandedMatrix & | A | ) |
Compute 1-norm of banded matrix.
| A | Banded matrix |
Definition at line 305 of file banded.cpp.
References ipow().
Estimate reciprocal condition number of banded matrix.
Uses 1-norm condition number estimation after LU factorization.
| A | LU-factored banded matrix |
| ipiv | Pivot indices |
| anorm | 1-norm of original matrix (before factorization) |
Definition at line 321 of file banded.cpp.
References banded_lu_solve(), and ipow().
| BandedSolverResult num::banded_solve | ( | const BandedMatrix & | A, |
| const Vector & | b, | ||
| Vector & | x | ||
| ) |
Solve banded system Ax = b (convenience function)
Performs LU factorization and solve in one call. For solving multiple systems with the same matrix, use banded_lu() and banded_lu_solve() separately to avoid redundant factorization.
| A | Banded matrix (NOT modified - internal copy made) |
| b | Right-hand side |
| x | Solution vector (output) |
Definition at line 247 of file banded.cpp.
References banded_lu(), banded_lu_solve(), ipow(), and num::BasicVector< T >::size().
Bell state |Phi+> = (|00> + |11>)/sqrt2 on qubits q0, q1.
Definition at line 389 of file circuit.cpp.
References num::Circuit::cx(), and num::Circuit::h().
J_nu(x) – Bessel function of the first kind.
Definition at line 53 of file math.hpp.
Referenced by tdse::TDSESolver::compute_modes().
B(a, b) – beta function.
Definition at line 242 of file math.hpp.
References ipow().
Referenced by banded_gemv(), num::markov::boltzmann_accept(), cg(), cg_matfree(), expv(), gmres(), lanczos(), num::markov::make_boltzmann_table(), num::backends::blas::matadd(), num::backends::omp::matadd(), num::backends::seq::matadd(), matadd(), num::markov::metropolis_sweep(), svd(), and num::markov::umbrella_sweep().
Brent's method (bisection + secant + inverse quadratic interpolation)
Guaranteed to converge if f(a) and f(b) have opposite signs. Superlinear convergence near smooth roots. Preferred method when a reliable bracket is available.
| f | Function |
| a,b | Bracket: f(a) and f(b) must have opposite signs |
| tol | Convergence tolerance |
| max_iter | Maximum iterations |
Definition at line 54 of file roots.cpp.
Referenced by tdse::TDSESolver::compute_modes().
| SolverResult num::cg | ( | const Matrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Conjugate gradient solver for Ax = b.
| A | Symmetric positive definite matrix |
| b | Right-hand side vector |
| x | Solution vector (initial guess on input, solution on output) |
| tol | Convergence tolerance on residual norm (default 1e-10) |
| max_iter | Maximum iterations (default 1000) |
| backend | Backend for internal matvec/dot/axpy/scale (default: default_backend) |
Definition at line 8 of file cg.cpp.
References axpy(), beta(), dot(), e, gpu, ipow(), num::SolverResult::iterations, matvec(), scale(), num::BasicVector< T >::size(), num::BasicVector< T >::to_cpu(), num::cuda::to_device(), and num::BasicVector< T >::to_gpu().
Referenced by thomas().
| SolverResult num::cg_matfree | ( | MatVecFn | matvec, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-6, |
||
| idx | max_iter = 1000 |
||
| ) |
Matrix-free conjugate gradient for Ax = b where A is SPD.
| matvec | Callable computing y = A*x |
| b | Right-hand side |
| x | Initial guess (modified in-place -> solution) |
| tol | Convergence tolerance on residual norm |
| max_iter | Maximum CG iterations |
Definition at line 66 of file cg.cpp.
References axpy(), beta(), dot(), e, ipow(), num::SolverResult::iterations, scale(), and seq.
Referenced by num::FieldSolver::solve_poisson(), and physics::ElectricSolver::solve_potential().
| void num::col_fiber_sweep | ( | BasicVector< T > & | data, |
| int | N, | ||
| F && | f | ||
| ) |
For each column j in [0,N), extract the x-direction fiber data[0..N-1, j] into a std::vector<T>, call f(fiber), then write back.
f must have signature: void(std::vector<T>&) and modifies in-place.
Use for ADI / Crank-Nicolson sweeps along x.
Definition at line 77 of file stencil.hpp.
References ipow().
Referenced by num::CrankNicolsonADI::sweep().
| ClusterResult num::connected_components | ( | int | n_sites, |
| InCluster && | in_cluster, | ||
| Neighbors && | neighbors | ||
| ) |
BFS connected-component labelling with pre-allocated flat queue (no heap per call).
| n_sites | Total number of sites |
| in_cluster | bool(int i) – include site i? |
| neighbors | void(int i, auto&& visit) – call visit(nb) per neighbor of i |
Definition at line 34 of file connected_components.hpp.
References num::ClusterResult::id, and ipow().
Referenced by IsingLattice::sweep_umbrella().
|
inline |
Compute the central-difference curl ∇×A of a vector field. One-sided differences at domain boundaries. bx, by, bz must already be allocated with the same dimensions as ax, ay, az.
Definition at line 170 of file stencil.hpp.
References num::Grid3D::dx(), ipow(), num::Grid3D::nx(), num::Grid3D::ny(), and num::Grid3D::nz().
Referenced by num::FieldSolver::curl().
|
inline |
Compute the central-difference divergence of a vector field (fx, fy, fz). One-sided differences at domain boundaries. out must already be allocated with the same dimensions.
Definition at line 151 of file stencil.hpp.
References ipow(), and num::Grid3D::nx().
Referenced by num::FieldSolver::divergence().
Conjugate inner product <x, y> = Sigma conj(x_i) * y_i.
Definition at line 91 of file vector.cpp.
References ipow(), and num::BasicVector< T >::size().
dot product
Definition at line 57 of file vector.cpp.
References blas, blocked, num::backends::blas::dot(), num::backends::gpu::dot(), num::backends::omp::dot(), num::backends::seq::dot(), gpu, ipow(), omp, seq, and simd.
Referenced by cg(), cg_matfree(), num::mpi::dot(), IsingLattice::energy_per_spin(), expv(), inverse_iteration(), IsingLattice::magnetization(), num::mpi::norm(), power_iteration(), and rayleigh_iteration().
|
inline |
Full eigendecomposition of a real symmetric matrix.
The rotation accumulation loop is parallelised when backend == Backend::omp.
| A | Real symmetric nxn matrix (upper/lower triangle used) |
| backend | Execution backend (Backend::omp parallelises the rotation loop) |
| tol | Convergence: stop when Frobenius norm of off-diagonal < tol |
| max_sweeps | Safety cap on Jacobi sweeps (typically < 20 for n < 500) |
Definition at line 50 of file jacobi_eig.hpp.
Referenced by lanczos().
| EigenResult num::eig_sym | ( | const Matrix & | A_in, |
| real | tol, | ||
| idx | max_sweeps | ||
| ) |
| Vector num::expv | ( | real | t, |
| const MatVecFn & | matvec, | ||
| idx | n, | ||
| const Vector & | v, | ||
| int | m_max = 30, |
||
| real | tol = 1e-8 |
||
| ) |
Compute exp(t*A)*v via Krylov-Padé approximation (matrix-free)
| t | Scalar multiplier on A |
| matvec | Callable y = A*x |
| n | Dimension of the state space (size of v) |
| v | Input vector |
| m_max | Maximum Krylov dimension (default 30) |
| tol | Breakdown tolerance for Arnoldi (default 1e-8) |
Definition at line 110 of file expv.cpp.
References axpy(), beta(), dot(), e, ipow(), matvec(), and norm().
Referenced by expv().
| Vector num::expv | ( | real | t, |
| const SparseMatrix & | A, | ||
| const Vector & | v, | ||
| int | m_max = 30, |
||
| real | tol = 1e-8 |
||
| ) |
Compute exp(t*A)*v via Krylov-Padé approximation (sparse matrix)
| t | Scalar multiplier on A |
| A | Sparse matrix in CSR format |
| v | Input vector |
| m_max | Maximum Krylov dimension (default 30) |
| tol | Breakdown tolerance for Arnoldi (default 1e-8) |
Definition at line 180 of file expv.cpp.
References expv(), ipow(), and sparse_matvec().
Gauss-Legendre quadrature (exact for polynomials up to degree 2p-1)
| f | Integrand |
| a | Lower bound |
| b | Upper bound |
| p | Number of quadrature points (1 to 5 supported) |
Definition at line 59 of file quadrature.cpp.
References ipow().
| SolverResult num::gauss_seidel | ( | const Matrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Gauss-Seidel iterative solver for Ax = b.
Updates each component x[i] in-place using the latest values of all other components. Converges for strictly diagonally dominant or symmetric positive definite A.
With Backend::omp the residual computation is parallelised; the update sweep remains sequential to preserve convergence properties.
| A | Square matrix |
| b | Right-hand side vector |
| x | Solution vector (initial guess on input, solution on output) |
| tol | Convergence tolerance on residual norm (default 1e-10) |
| max_iter | Maximum iterations (default 1000) |
| backend | Execution backend (default: default_backend) |
Definition at line 13 of file gauss_seidel.cpp.
References e, ipow(), and num::BasicVector< T >::size().
n-qubit GHZ state (|00...0> + |11...1>)/sqrt2
Definition at line 395 of file circuit.cpp.
References num::Circuit::cx(), and num::Circuit::h().
| SolverResult num::gmres | ( | const Matrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-6, |
||
| idx | max_iter = 1000, |
||
| idx | restart = 30, |
||
| Backend | backend = default_backend |
||
| ) |
Restarted GMRES with a dense matrix.
| backend | Backend for the internal matvec at each Arnoldi step |
Definition at line 126 of file krylov.cpp.
| SolverResult num::gmres | ( | const SparseMatrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-6, |
||
| idx | max_iter = 1000, |
||
| idx | restart = 30 |
||
| ) |
Restarted GMRES with a sparse (CSR) matrix.
Definition at line 113 of file krylov.cpp.
References gmres(), ipow(), and sparse_matvec().
| SolverResult num::gmres | ( | MatVecFn | matvec, |
| idx | n, | ||
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-6, |
||
| idx | max_iter = 1000, |
||
| idx | restart = 30 |
||
| ) |
Restarted GMRES(restart) – matrix-free interface.
Works for any invertible A (symmetric or non-symmetric, indefinite). The Krylov subspace is restarted every restart steps to bound memory.
| matvec | Callable computing y = A*x |
| n | System dimension |
| b | Right-hand side |
| x | Initial guess (modified in-place -> solution) |
| tol | Convergence tolerance on residual norm |
| max_iter | Maximum total matrix-vector products |
| restart | Krylov subspace size before restart (default 30) |
Definition at line 11 of file krylov.cpp.
References beta(), e, ipow(), and num::BasicVector< T >::size().
Compute the central-difference gradient of a scalar field. One-sided differences at domain boundaries. gx, gy, gz must already be allocated with the same dimensions as phi.
Definition at line 132 of file stencil.hpp.
Referenced by num::FieldSolver::gradient().
| PowerResult num::inverse_iteration | ( | const Matrix & | A, |
| real | sigma, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Inverse iteration – finds the eigenvalue closest to a shift sigma.
Factorizes (A - sigmaI) once then solves repeatedly.
| A | Square matrix (symmetric recommended) |
| sigma | Shift – should be near the target eigenvalue |
| tol | Tolerance on eigenvalue change between iterations |
| max_iter | Maximum iterations |
| backend | Backend forwarded to matvec and dot |
Definition at line 48 of file power.cpp.
References dot(), e, ipow(), lu(), lu_solve(), matvec(), and num::detail::normalise().
|
constexprnoexcept |
Compute x^N at compile time via repeated squaring.
| N | Non-negative integer exponent (must be a compile-time constant). |
| T | Arithmetic type (float, double, int, ...). |
Definition at line 34 of file integer_pow.hpp.
References ipow().
Referenced by adaptive_simpson(), num::backends::seq::add(), add(), num::mpi::allreduce_sum(), num::GivensRotation::apply(), num::quantum::apply_1q(), num::quantum::apply_2q(), num::quantum::apply_cnot(), num::quantum::apply_cp(), num::quantum::apply_cswap(), num::quantum::apply_cy(), num::quantum::apply_cz(), num::quantum::apply_swap(), num::GivensRotation::apply_t(), num::quantum::apply_toffoli(), assoc_laguerre(), assoc_legendre(), autocorr_time(), axpy(), num::backends::blas::axpy(), num::backends::gpu::axpy(), num::backends::omp::axpy(), num::backends::seq::axpy(), axpy(), num::BandedMatrix::band(), num::BandedMatrix::band(), banded_gemv(), banded_lu(), banded_lu_solve(), banded_lu_solve_multi(), banded_matvec(), banded_norm1(), banded_rcond(), banded_solve(), num::BandedMatrix::BandedMatrix(), num::BandedMatrix::BandedMatrix(), num::BandedMatrix::BandedMatrix(), num::BasicVector< T >::BasicVector(), num::BasicVector< T >::BasicVector(), num::BasicVector< T >::BasicVector(), num::quantum::basis_state(), beta(), num::Histogram::bin(), num::Histogram::bin_centre(), bisection(), num::markov::boltzmann_accept(), brent(), num::mpi::broadcast(), num::CellList2D< Scalar >::build(), num::CellList3D< Scalar >::build(), num::VerletList2D< Scalar >::build(), num::Circuit::ccx(), num::CellList2D< Scalar >::cell_particles(), cg(), cg_matfree(), col_fiber_sweep(), num::detail::SpikyDW< 2 >::compute(), num::detail::SpikyDW< 3 >::compute(), physics::backends::omp::compute_density_pressure(), physics::backends::seq::compute_density_pressure(), physics::backends::omp::compute_density_pressure(), physics::backends::seq::compute_density_pressure(), connected_components(), num::Circuit::cp(), num::CrankNicolsonADI::CrankNicolsonADI(), num::Circuit::cswap(), num::FieldSolver::curl(), curl_3d(), num::MagneticSolver::current_density(), num::Circuit::cx(), num::Circuit::cy(), num::Circuit::cz(), num::Circuit::diagram(), num::pde::diffusion_step_2d(), num::pde::diffusion_step_2d_dirichlet(), num::FieldSolver::divergence(), divergence_3d(), dot(), num::SmallVec< N >::dot(), num::backends::blas::dot(), num::backends::omp::dot(), num::backends::seq::dot(), dot(), num::mpi::dot(), num::SPHKernel< Dim >::dW_dr(), eig_sym(), eig_sym(), ellint_E(), ellint_Ei(), ellint_F(), ellint_K(), ellint_Pi(), ellint_Pi_inc(), num::quantum::entanglement_entropy(), num::spectral::FFTPlan::execute(), num::quantum::expectation(), num::quantum::expectation_x(), num::quantum::expectation_y(), num::quantum::expectation_z(), expv(), expv(), num::ComplexTriDiag::factor(), num::spectral::fft(), num::spectral::FFTPlan::FFTPlan(), num::Histogram::fill(), num::mpi::finalize(), num::GivensRotation::from(), num::SparseMatrix::from_triplets(), num::Grid3D::from_vector(), num::Circuit::gate(), num::Circuit::gate(), num::quantum::gate_u(), gauss_legendre(), gauss_seidel(), gmres(), gmres(), gmres(), num::FieldSolver::gradient(), gradient_3d(), num::SmallMatrix< M, N >::identity(), num::spectral::ifft(), num::BandedMatrix::in_band(), num::mpi::init(), int_range(), inverse_iteration(), ipow(), num::spectral::irfft(), num::CellList2D< Scalar >::iterate_pairs(), num::CellList3D< Scalar >::iterate_pairs(), jacobi(), lanczos(), laplacian_stencil_2d(), laplacian_stencil_2d_periodic(), linspace(), lu(), lu_det(), lu_inv(), lu_solve(), lu_solve(), num::markov::make_boltzmann_table(), num::markov::make_rng(), num::markov::make_seeded_rng(), num::backends::blas::matadd(), num::backends::omp::matadd(), num::backends::seq::matadd(), matadd(), num::backends::blas::matmul(), num::backends::gpu::matmul(), num::backends::omp::matmul(), num::backends::seq::matmul(), matmul(), num::backends::simd::matmul(), num::backends::seq::matmul_blocked(), matmul_blocked(), num::backends::seq::matmul_register_blocked(), matmul_register_blocked(), matmul_simd(), num::Matrix::Matrix(), num::Matrix::Matrix(), num::Matrix::Matrix(), num::backends::blas::matvec(), num::backends::gpu::matvec(), num::backends::omp::matvec(), num::backends::simd::matvec(), num::backends::seq::matvec(), matvec(), matvec_simd(), num::markov::metropolis_sweep(), num::markov::metropolis_sweep_prob(), num::Result::most_likely(), num::VerletList2D< Scalar >::needs_rebuild(), neg_laplacian_3d(), num::VerletList2D< Scalar >::neighbors(), newton(), norm(), num::backends::blas::norm(), num::backends::gpu::norm(), num::backends::seq::norm(), norm(), num::mpi::norm(), num::detail::normalise(), num::quantum::normalize(), ode_euler(), ode_rk4(), ode_rk45(), ode_verlet(), ode_yoshida4(), num::Matrix::operator()(), num::BandedMatrix::operator()(), num::Matrix::operator()(), num::BandedMatrix::operator()(), num::SparseMatrix::operator()(), num::SmallMatrix< M, N >::operator()(), num::SmallMatrix< M, N >::operator()(), num::Grid3D::operator()(), num::Grid3D::operator()(), num::SmallMatrix< M, N >::operator*(), num::SmallMatrix< M, N >::operator*(), num::SmallVec< N >::operator*=(), num::SmallMatrix< M, N >::operator*=(), operator+(), num::SmallMatrix< M, N >::operator+=(), num::SmallVec< N >::operator+=(), num::SmallVec< N >::operator-=(), num::BandedMatrix::operator=(), num::BasicVector< T >::operator=(), num::BandedMatrix::operator=(), num::BasicVector< T >::operator=(), num::Matrix::operator=(), num::Matrix::operator=(), num::BasicVector< T >::operator[](), num::BasicVector< T >::operator[](), num::SmallVec< N >::operator[](), num::SmallVec< N >::operator[](), num::Circuit::p(), num::PBCLattice2D::PBCLattice2D(), num::Histogram::pdf(), power_iteration(), num::Result::print(), print_state(), num::quantum::probabilities(), num::Result::probability(), qft_circuit(), qr(), qr_solve(), num::CellList2D< Scalar >::query(), num::CellList3D< Scalar >::query(), num::mpi::rank(), rayleigh_iteration(), num::quantum::reduced_density_matrix(), num::spectral::rfft(), rng_int(), rng_normal(), rng_uniform(), romberg(), row_fiber_sweep(), num::SmallMatrix< M, N >::rows(), num::Circuit::run(), num::Circuit::rx(), num::Circuit::ry(), num::Circuit::rz(), num::VectorField3D::sample(), num::ScalarField3D::sample(), scale(), num::VectorField3D::scale(), num::backends::blas::scale(), num::backends::gpu::scale(), num::backends::omp::scale(), num::backends::seq::scale(), scale(), secant(), num::ScalarField3D::set(), num::Grid3D::set(), simpson(), num::mpi::size(), num::ComplexTriDiag::solve(), num::MagneticSolver::solve_magnetic_field(), num::FieldSolver::solve_poisson(), sparse_matvec(), sph_legendre(), num::SPHKernel< Dim >::Spiky_dW_dr(), num::SPHKernel< Dim >::Spiky_gradW(), num::Circuit::statevector_at(), svd(), svd_truncated(), num::Circuit::swap(), num::CrankNicolsonADI::sweep(), thomas(), num::Grid3D::to_vector(), num::SmallMatrix< M, N >::transposed(), trapz(), num::Circuit::u(), num::markov::umbrella_sweep(), num::markov::umbrella_sweep_prob(), num::RunningStats::update(), num::Circuit::views(), and num::SPHKernel< Dim >::W().
| SolverResult num::jacobi | ( | const Matrix & | A, |
| const Vector & | b, | ||
| Vector & | x, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Jacobi iterative solver for Ax = b.
Updates all components simultaneously using only values from the previous iteration. Converges for strictly diagonally dominant A. Trivially parallelisable with Backend::omp.
| A | Square matrix |
| b | Right-hand side vector |
| x | Solution vector (initial guess on input, solution on output) |
| tol | Convergence tolerance on residual norm (default 1e-10) |
| max_iter | Maximum iterations (default 1000) |
| backend | Execution backend (default: default_backend) |
Definition at line 7 of file jacobi.cpp.
References e, ipow(), and num::BasicVector< T >::size().
| LanczosResult num::lanczos | ( | MatVecFn | matvec, |
| idx | n, | ||
| idx | k, | ||
| real | tol = 1e-10, |
||
| idx | max_steps = 0, |
||
| Backend | backend = Backend::seq |
||
| ) |
Lanczos eigensolver for large sparse symmetric matrices.
| matvec | Callable computing w = A*v (matrix-free) |
| n | Dimension of the problem |
| k | Number of eigenvalues requested (k << n) |
| tol | Convergence on ||A*u - lambda*u|| for each Ritz pair |
| max_steps | Maximum Lanczos steps (default: min(3k, n)) |
| backend | Backend for reorthogonalisation inner products (default: seq) |
Definition at line 30 of file lanczos.cpp.
References beta(), e, eig_sym(), ipow(), and matvec().
Referenced by tdse::TDSESolver::compute_modes().
| void num::laplacian_stencil_2d | ( | const BasicVector< T > & | x, |
| BasicVector< T > & | y, | ||
| int | N | ||
| ) |
Compute y[i,j] = x[i+1,j] + x[i-1,j] + x[i,j+1] + x[i,j-1] - 4*x[i,j] Dirichlet: out-of-bounds neighbors contribute 0. Works for T = real or cplx.
Definition at line 32 of file stencil.hpp.
References ipow().
Referenced by tdse::TDSESolver::compute_energy(), tdse::TDSESolver::compute_modes(), and num::pde::diffusion_step_2d_dirichlet().
| void num::laplacian_stencil_2d_periodic | ( | const BasicVector< T > & | x, |
| BasicVector< T > & | y, | ||
| int | N | ||
| ) |
Same as laplacian_stencil_2d but with periodic (wrap-around) boundaries.
The j boundaries (j=0 and j=N-1) are peeled out of the inner loop so the interior range j=1..N-2 contains no modulo arithmetic and is auto-vectorizable. Row wrapping (i±1) uses precomputed indices.
Definition at line 52 of file stencil.hpp.
References num::BasicVector< T >::data(), and ipow().
Referenced by num::pde::diffusion_step_2d().
LU factorization of a square matrix A with partial pivoting.
Computes P, L, U such that P*A = L*U where: P = permutation matrix encoded in piv L = unit lower triangular (diagonal = 1, stored below diag of LU) U = upper triangular (stored on and above diag of LU)
Partial pivoting (swap largest element to pivot position) ensures |L(i,j)| <= 1 for all i > j, which bounds round-off growth.
Definition at line 21 of file lu.cpp.
References e, ipow(), and num::LUResult::LU.
Referenced by inverse_iteration(), and rayleigh_iteration().
Determinant of A from its LU factorization.
det(A) = det(P)^{-1} * prod(U[i,i]) = (-1)^{swaps} * prod(U[i,i])
Overflow is possible for large n – use log-determinant for stability.
Definition at line 109 of file lu.cpp.
References ipow(), and num::Matrix::rows().
Inverse of A from its LU factorization.
Solves A * X = I column by column. O(n^3) – only use when the full inverse is genuinely needed (prefer lu_solve for specific right-hand sides).
Definition at line 122 of file lu.cpp.
References e, ipow(), lu_solve(), and num::Matrix::rows().
Solve A*X = B for multiple right-hand sides.
Each column of B is solved independently using the same factorization. B and X are nxnrhs matrices (column-major access pattern).
Definition at line 90 of file lu.cpp.
References ipow(), and lu_solve().
Solve A*x = b using a precomputed LU factorization.
Three steps:
Definition at line 67 of file lu.cpp.
References ipow().
Referenced by inverse_iteration(), lu_inv(), lu_solve(), and rayleigh_iteration().
| void num::matadd | ( | real | alpha, |
| const Matrix & | A, | ||
| real | beta, | ||
| const Matrix & | B, | ||
| Matrix & | C, | ||
| Backend | b = default_backend |
||
| ) |
C = alpha*A + beta*B.
Definition at line 105 of file matrix.cpp.
References beta(), blas, blocked, gpu, ipow(), num::backends::blas::matadd(), num::backends::omp::matadd(), num::backends::seq::matadd(), omp, seq, and simd.
C = A * B.
Definition at line 83 of file matrix.cpp.
References blas, blocked, gpu, ipow(), num::backends::blas::matmul(), num::backends::gpu::matmul(), num::backends::omp::matmul(), num::backends::seq::matmul(), num::backends::simd::matmul(), num::backends::seq::matmul_blocked(), omp, seq, and simd.
Referenced by svd_truncated().
C = A * B (cache-blocked)
Divides A, B, C into BLOCKxBLOCK tiles so the working set fits in L2 cache.
| block_size | Tile edge length (default 64). |
Definition at line 117 of file matrix.cpp.
References ipow(), and num::backends::seq::matmul_blocked().
| void num::matmul_register_blocked | ( | const Matrix & | A, |
| const Matrix & | B, | ||
| Matrix & | C, | ||
| idx | block_size = 64, |
||
| idx | reg_size = 4 |
||
| ) |
C = A * B (register-blocked)
Extends cache blocking with a REGxREG register tile inside each cache tile.
| block_size | Cache tile edge (default 64). |
| reg_size | Register tile edge (default 4). |
Definition at line 121 of file matrix.cpp.
References ipow(), and num::backends::seq::matmul_register_blocked().
C = A * B (SIMD-accelerated)
Dispatches at compile time: AVX-256 + FMA on x86, NEON on AArch64, falls back to matmul_blocked if neither is available.
Definition at line 126 of file matrix.cpp.
References ipow(), and num::backends::simd::matmul().
y = A * x
Definition at line 94 of file matrix.cpp.
References blas, blocked, gpu, ipow(), num::backends::blas::matvec(), num::backends::gpu::matvec(), num::backends::omp::matvec(), num::backends::simd::matvec(), num::backends::seq::matvec(), omp, seq, and simd.
Referenced by cg(), expv(), gmres(), inverse_iteration(), lanczos(), power_iteration(), rayleigh_iteration(), and num::FieldSolver::solve_poisson().
y = A * x (SIMD-accelerated)
Definition at line 130 of file matrix.cpp.
References ipow(), and num::backends::simd::matvec().
|
inline |
Compute the negative Laplacian: y = -∆x = (6x[i,j,k] - Σ6 nbrs) / dx² Dirichlet BC: boundary nodes satisfy y[bdry] = x[bdry] (identity row, so that the system is SPD when used with a b=0 boundary RHS).
x and y are flat vectors in Grid3D layout: idx = k*ny*nx + j*nx + i.
Definition at line 109 of file stencil.hpp.
References ipow().
Referenced by num::FieldSolver::solve_poisson().
Newton-Raphson method.
| f | Function |
| df | Derivative of f |
| x0 | Initial guess |
| tol | Convergence tolerance |
| max_iter | Maximum iterations |
Definition at line 25 of file roots.cpp.
Referenced by IsingLattice::mean_field_m().
Euclidean norm sqrt(Sigma |v_i|^2)
Definition at line 97 of file vector.cpp.
References ipow(), and num::BasicVector< T >::size().
| real num::norm | ( | const Vector & | x, |
| Backend | b = default_backend |
||
| ) |
Euclidean norm.
Definition at line 69 of file vector.cpp.
References blas, blocked, gpu, ipow(), num::backends::blas::norm(), num::backends::gpu::norm(), num::backends::seq::norm(), omp, seq, and simd.
Referenced by tdse::TDSESolver::compute_norm(), expv(), num::quantum::norm(), num::quantum::normalize(), num::Histogram::pdf(), and tdse::TDSESolver::renormalize().
| ODEResult num::ode_rk4 | ( | ODERhsFn | f, |
| Vector | y0, | ||
| real | t0, | ||
| real | t1, | ||
| real | h, | ||
| StepCallback | on_step = nullptr |
||
| ) |
Classic 4th-order Runge-Kutta (fixed step).
Definition at line 33 of file ode.cpp.
Referenced by nbody::NBodySim::step().
| ODEResult num::ode_rk45 | ( | ODERhsFn | f, |
| Vector | y0, | ||
| real | t0, | ||
| real | t1, | ||
| real | rtol = 1e-6, |
||
| real | atol = 1e-9, |
||
| real | h0 = 1e-3, |
||
| idx | max_steps = 1000000, |
||
| StepCallback | on_step = nullptr |
||
| ) |
Adaptive Dormand-Prince RK45 with FSAL and PI step-size control.
Advances from t0 to t1, adjusting h so that the mixed absolute/relative error estimate stays below tolerance: err / (atol + |y|·rtol) ≤ 1
| rtol | Relative tolerance (default 1e-6) |
| atol | Absolute tolerance (default 1e-9) |
| h0 | Initial step-size hint (default 1e-3) |
| max_steps | Hard limit on accepted steps (default 1e6) |
| on_step | Optional callback after each accepted step |
| SymplecticResult num::ode_verlet | ( | AccelFn | accel, |
| Vector | q0, | ||
| Vector | v0, | ||
| real | t0, | ||
| real | t1, | ||
| real | h, | ||
| SymplecticCallback | on_step = nullptr |
||
| ) |
Velocity Verlet — 2nd-order symplectic, 1 force evaluation per step.
Algorithm (kick-drift-kick): a_n = accel(q_n) q_{n+1} = q_n + h·v_n + h²/2 · a_n a_{n+1} = accel(q_{n+1}) v_{n+1} = v_n + h/2 · (a_n + a_{n+1})
Acceleration is cached between steps (FSAL — first same as last).
Definition at line 154 of file ode.cpp.
References e, ipow(), and num::BasicVector< T >::size().
Referenced by nbody::NBodySim::step().
| SymplecticResult num::ode_yoshida4 | ( | AccelFn | accel, |
| Vector | q0, | ||
| Vector | v0, | ||
| real | t0, | ||
| real | t1, | ||
| real | h, | ||
| SymplecticCallback | on_step = nullptr |
||
| ) |
Yoshida 4th-order symplectic — 3 force evaluations per step.
Composes three leapfrog sub-steps with Yoshida (1990) coefficients: w₁ = 1 / (2 − ∛2), w₀ = 1 − 2w₁ c₁=c₄=w₁/2, c₂=c₃=(w₀+w₁)/2, d₁=d₃=w₁, d₂=w₀
Drift-kick sequence per step: q += c₁h·v → v += d₁h·a(q) → q += c₂h·v → v += d₂h·a(q) → q += c₃h·v → v += d₃h·a(q) → q += c₄h·v
Preferred over Verlet when accuracy matters and force is cheap.
Definition at line 195 of file ode.cpp.
References e, ipow(), and num::BasicVector< T >::size().
|
constexprnoexcept |
Definition at line 68 of file small_matrix.hpp.
|
constexprnoexcept |
Definition at line 64 of file small_matrix.hpp.
References ipow().
| PowerResult num::power_iteration | ( | const Matrix & | A, |
| real | tol = 1e-10, |
||
| idx | max_iter = 1000, |
||
| Backend | backend = default_backend |
||
| ) |
Power iteration – finds the eigenvalue largest in absolute value.
| A | Square matrix (need not be symmetric) |
| tol | Tolerance on eigenvalue change between iterations |
| max_iter | Maximum iterations |
| backend | Backend forwarded to matvec and dot |
Definition at line 11 of file power.cpp.
References dot(), e, ipow(), matvec(), and num::detail::normalise().
Pretty-print a statevector, hiding amplitudes below threshold.
Definition at line 416 of file circuit.cpp.
References ipow().
n-qubit Quantum Fourier Transform circuit Applied to |0...0> unless you prepend state-preparation gates.
Definition at line 403 of file circuit.cpp.
References num::Circuit::cp(), num::Circuit::h(), ipow(), and num::Circuit::swap().
QR factorization of an mxn matrix A (m >= n) via Householder reflections.
Each Householder step k:
After min(m-1, n) steps, the matrix is upper triangular. Q is reconstructed by accumulating the reflectors in reverse order.
Householder QR is backward-stable (||deltaA||/||A|| = O(u) for machine unit u), unlike classical Gram-Schmidt which is only forward-stable.
Definition at line 41 of file qr.cpp.
Referenced by svd_truncated().
Solve the least-squares problem min ||A*x - b||_2.
Algorithm:
For square non-singular A this returns the exact solution. For overdetermined A (m > n) this returns the minimum-residual solution.
Definition at line 119 of file qr.cpp.
References ipow().
| PowerResult num::rayleigh_iteration | ( | const Matrix & | A, |
| const Vector & | x0, | ||
| real | tol = 1e-10, |
||
| idx | max_iter = 50, |
||
| Backend | backend = default_backend |
||
| ) |
Rayleigh quotient iteration – cubically convergent.
Updates the shift sigma = vᵀAv at every step -> fresh LU each iteration.
| A | Symmetric matrix |
| x0 | Starting vector (determines which eigenvalue is found) |
| tol | Tolerance on residual ||A*v - lambda*v|| |
| max_iter | Maximum iterations |
| backend | Backend forwarded to matvec, dot, axpy, norm |
Definition at line 93 of file power.cpp.
References dot(), ipow(), lu(), lu_solve(), matvec(), and num::detail::normalise().
Normal (Gaussian) sample with given mean and standard deviation.
Definition at line 289 of file math.hpp.
References ipow().
Referenced by svd_truncated().
Romberg integration (Richardson extrapolation on trapezoidal rule)
| f | Integrand |
| a | Lower bound |
| b | Upper bound |
| tol | Convergence tolerance |
| max_levels | Maximum refinement levels |
Definition at line 98 of file quadrature.cpp.
References ipow().
| void num::row_fiber_sweep | ( | BasicVector< T > & | data, |
| int | N, | ||
| F && | f | ||
| ) |
For each row i in [0,N), extract the y-direction fiber data[i, 0..N-1] into a std::vector<T>, call f(fiber), then write back.
Use for ADI / Crank-Nicolson sweeps along y.
Definition at line 91 of file stencil.hpp.
References ipow().
Referenced by num::CrankNicolsonADI::sweep().
v *= alpha
Definition at line 83 of file vector.cpp.
References ipow(), and num::BasicVector< T >::size().
| void num::scale | ( | Vector & | v, |
| real | alpha, | ||
| Backend | b = default_backend |
||
| ) |
v *= alpha
Definition at line 27 of file vector.cpp.
References blas, blocked, gpu, ipow(), omp, num::backends::blas::scale(), num::backends::gpu::scale(), num::backends::omp::scale(), num::backends::seq::scale(), seq, and simd.
Referenced by cg(), cg_matfree(), tdse::TDSESolver::renormalize(), num::VectorField3D::scale(), and tdse::TDSESolver::set_potential().
Simpson's 1/3 rule with n panels (n must be even)
| backend | Backend::omp parallelises the panel sum |
Definition at line 25 of file quadrature.cpp.
References ipow().
y = A * x
Definition at line 110 of file sparse.cpp.
References ipow(), and num::BasicVector< T >::size().
Referenced by IsingLattice::energy_per_spin(), expv(), and gmres().
| SVDResult num::svd | ( | const Matrix & | A, |
| Backend | backend = Backend::seq, |
||
| real | tol = 1e-12, |
||
| idx | max_sweeps = 100 |
||
| ) |
Full SVD of an mxn matrix via one-sided Jacobi.
Returns the economy SVD: r = min(m,n), U is mxr, S has r elements, Vt is rxn.
| A | Input matrix (not modified) |
| backend | Backend for column-update inner loops (Backend::omp parallelises) |
| tol | Convergence: stop when max |col_p * col_q| / (||col_p|| ||col_q||) < tol |
| max_sweeps | Safety cap on Jacobi sweeps |
Definition at line 39 of file svd.cpp.
References beta(), e, ipow(), and zeta().
Referenced by svd_truncated().
| SVDResult num::svd_truncated | ( | const Matrix & | A, |
| idx | k, | ||
| Backend | backend = default_backend, |
||
| idx | oversampling = 10, |
||
| Rng * | rng = nullptr |
||
| ) |
Randomized truncated SVD – top-k singular triplets.
Efficient when k << min(m,n). The two dominant costs – Y = A*Omega and B = Qᵀ*A – are dispatched via the given backend.
| A | Input matrix |
| k | Number of singular values/vectors to compute |
| backend | Backend for internal matmul calls (default: default_backend) |
| oversampling | Extra random vectors for accuracy (default 10) |
| rng | Random number generator (default: seeded from hardware) |
Definition at line 143 of file svd.cpp.
References ipow(), matmul(), qr(), rng_normal(), and svd().
| void num::thomas | ( | const Vector & | a, |
| const Vector & | b, | ||
| const Vector & | c, | ||
| const Vector & | d, | ||
| Vector & | x, | ||
| Backend | backend = Backend::seq |
||
| ) |
Thomas algorithm (LU for tridiagonal systems), O(n).
| a | Sub-diagonal, size n-1 |
| b | Main diagonal, size n |
| c | Super-diagonal, size n-1 |
| d | Right-hand side vector, size n |
| x | Solution vector (output), size n |
| backend | Backend::seq (CPU) or Backend::gpu (CUDA batched, batch=1) |
Definition at line 7 of file thomas.cpp.
References cg(), gpu, num::BasicVector< T >::gpu_data(), ipow(), num::BasicVector< T >::size(), num::cuda::thomas_batched(), num::BasicVector< T >::to_cpu(), and num::BasicVector< T >::to_gpu().
Trapezoidal rule with n panels.
| backend | Backend::omp parallelises the panel sum |
Definition at line 14 of file quadrature.cpp.
References ipow().
Best backend for memory-bound vector ops: blas > omp > blocked.
Definition at line 65 of file policy.hpp.
Definition at line 33 of file policy.hpp.
Definition at line 31 of file policy.hpp.
Automatically selected at configure time: blas > blocked. Used as the default parameter for all operations.
Definition at line 57 of file policy.hpp.
Definition at line 41 of file math.hpp.
Referenced by autocorr_time(), brent(), cg(), cg_matfree(), eig_sym(), eig_sym(), num::quantum::entanglement_entropy(), expv(), gauss_seidel(), gmres(), inverse_iteration(), jacobi(), lanczos(), lu(), lu_inv(), newton(), num::detail::normalise(), ode_euler(), ode_rk4(), ode_rk45(), ode_verlet(), ode_yoshida4(), power_iteration(), qr(), secant(), IsingLattice::set_temperature(), svd(), and IsingLattice::temperature().
Definition at line 35 of file policy.hpp.
True when a BLAS/cblas library was found at configure time.
Definition at line 39 of file policy.hpp.
True when OpenMP was found at configure time.
Definition at line 47 of file policy.hpp.
Definition at line 34 of file policy.hpp.
Golden ratio.
Definition at line 42 of file math.hpp.
Referenced by num::MagneticSolver::current_density(), ellint_Ei(), ellint_F(), ellint_Pi_inc(), num::quantum::gate_u(), num::FieldSolver::gradient(), gradient_3d(), num::FieldSolver::solve_poisson(), and num::Circuit::u().
Definition at line 40 of file math.hpp.
Referenced by num::quantum::gate_t(), and num::quantum::gate_tdg().
Definition at line 30 of file policy.hpp.
Definition at line 32 of file policy.hpp.