numerics

C++ library for numerical linear algebra, ODE/PDE integration, spectral methods, and statistical simulation.

Iterative solvers (CG, GMRES, Lanczos, expv) share a common kernel layer: BLAS/SIMD primitives, tag-dispatched array operations, and Modified Gram-Schmidt / Arnoldi routines.

Install

Drop this in your CMake configuration:

include(FetchContent)
FetchContent_Declare(
    numerics
    GIT_REPOSITORY https://github.com/AdityaDendukuri/numerics
    GIT_TAG        main
)
FetchContent_MakeAvailable(numerics)
target_link_libraries(my_app PRIVATE numerics)

Optional system packages: libopenblas-dev, libfftw3-dev, OpenMP, CUDA. See README: Dependencies.

Examples

#include "numerics.hpp"

int main() {
    const double sigma = 10.0, rho = 28.0, beta = 8.0 / 3.0;

    auto lorenz = [&](double, const num::Vector& s, num::Vector& ds) {
        ds[0] = sigma * (s[1] - s[0]);
        ds[1] = s[0] * (rho - s[2]) - s[1];
        ds[2] = s[0] * s[1] - beta * s[2];
    };

    // simulation observer function to store (x, z) each step
    num::Series xz;
    auto observer = [&](double, const num::Vector& y) {
        xz.store(y[0], y[2]);
    };

    num::solve(
        num::ODEProblem{lorenz, {1.0, 0.0, 0.0}, 0.0, 50.0},
        num::RK45{.rtol = 1e-8, .atol = 1e-10, .max_steps = 2000000},
        observer);

    num::plt::plot(xz);
    num::plt::title("Lorenz attractor (sigma=10, rho=28, beta=8/3)");
    num::plt::xlabel("x");
    num::plt::ylabel("z");
    num::plt::savefig("lorenz.png");
}
Lorenz attractor phase portrait (x vs z)
Lorenz attractor — adaptive RK45 via num::solve + num::RK45
#include "plot/plot.hpp"
#include "spectral/fft.hpp"

int main() {
    const int N = 1024; const double fs = 1000.0;  // N samples at fs Hz

    num::Vector x(N);  // real-valued time-domain signal
    for (int i = 0; i < N; ++i)
        x[i] = std::sin(2*M_PI*50.0*i/fs)
             + 0.5*std::sin(2*M_PI*120.0*i/fs);

    // rfft output has N/2+1 bins: real input has conjugate symmetry,
    // so only the non-redundant half [0 .. N/2] is stored
    num::CVector X(N/2 + 1);  // complex frequency-domain coefficients
    num::spectral::rfft(x, X);

    num::Series signal, spectrum;  // lists of (x, y) pairs for plotting
    for (int i = 0; i < 100; ++i)
        signal.emplace_back(i/fs*1000.0, x[i]);  // first 100 ms
    for (int k = 0; k < (int)X.size(); ++k)
        spectrum.emplace_back(k*fs/N, std::abs(X[k]));  // bin k maps to freq k*fs/N Hz

    num::plt::subplot(2, 1);
    num::plt::plot(signal);
    num::plt::title("Signal: 50 Hz + 0.5 * 120 Hz");
    num::plt::xlabel("time (ms)");  num::plt::ylabel("amplitude");

    num::plt::next();
    num::plt::plot(spectrum);
    num::plt::title("FFT power spectrum");
    num::plt::xlabel("frequency (Hz)");  num::plt::ylabel("|X[k]|");

    num::plt::savefig("fft_demo.png");
}
FFT power spectrum
FFT power spectrumnum::spectral::rfft; peaks at 50 Hz and 120 Hz
#include "numerics.hpp"
#include "spatial/pbc_lattice.hpp"
#include "stochastic/boltzmann_table.hpp"

int main() {
    const int N = 64, NN = N*N;
    num::PBCLattice2D   nbr(N);
    std::vector<double> spins(NN);
    std::mt19937        rng(42);

    num::Series curve;
    for (double T = 0.5; T <= 4.01; T += 0.1) {
        std::fill(spins.begin(), spins.end(), 1.0);
        double beta = 1.0 / T;

        auto accept = [&](int i) {
            double ns = spins[nbr.up[i]] + spins[nbr.dn[i]]
                      + spins[nbr.lt[i]] + spins[nbr.rt[i]];
            return num::markov::boltzmann_accept(2.0*spins[i]*ns, beta);
        };
        auto flip    = [&](int i) { spins[i] = -spins[i]; };
        auto measure = [&]() {
            return std::abs(std::accumulate(spins.begin(), spins.end(), 0.0) / NN);
        };

        double m = num::solve(
            num::MCMCProblem{accept, flip, NN},
            num::Metropolis{.equilibration=2000, .measurements=500},
            measure, rng);

        curve.store(T, m);
    }
    num::plt::plot(curve);
    num::plt::xlabel("T");  num::plt::ylabel("|<m>|");
    num::plt::savefig("ising_demo.png");
}
Ising magnetization vs temperature
Ising phase transitionnum::solve + num::Metropolis; |m| drops at Tc ≈ 2.27
#include "numerics.hpp"
#include <string>

using namespace num;

int main() {
  const int N = 64;
  const double kappa = 1.0, sigma = 0.06, T_end = 0.012;
  const double h = 1.0 / (N + 1);
  const double dt = 4.0 * h * h;   // 16x larger than explicit limit
  const int nstep = static_cast<int>(T_end / dt) + 1;
  const double coeff = kappa * dt / (h * h);

  // A = I - coeff*L  (sparse SPD -- CG converges in O(N) iterations)
  Grid2D       grid{N, h};
  SparseMatrix A      = pde::backward_euler_matrix(grid, coeff);
  LinearSolver solver = make_cg_solver(A);

  ScalarField2D u0(grid, [=](double x, double y) {
    return gaussian2d(x, y, 0.5, 0.5, sigma);
  });
  ScalarField2D u = u0;
  solve(u, BackwardEuler{.solver=solver, .dt=dt, .nstep=nstep});

  plt::subplot(1, 2);
  plt::heatmap(u0);  plt::title("t = 0");
  plt::next();
  plt::heatmap(u);   plt::title("t = " + std::to_string(T_end).substr(0, 6));
  plt::savefig("heat_demo.png");
}
2D heat equation — Gaussian spreading
2D heat equation — implicit backward Euler via sparse CG; num::solve + num::BackwardEuler

Some Simulations Built with numerics and Raylib

2D SPH Fluid — dam-break, hot/cold spheres, heat transfer, falling rigid body
3D SPH Fluid — dual temperature hoses, free-orbit camera
Navier-Stokes — Kelvin-Helmholtz instability, particle tracers
Electromagnetism — FEM potential solve, B-field, orbiting dipole
Schrödinger Equation — double-slit diffraction, Strang splitting
Quantum Circuits — Bell, GHZ, Grover, teleportation, QFT₃
Ising Nucleation — free-energy barrier crossing (F=0.1, T<Tc)
Ising Ordering — spontaneous symmetry breaking from random init
Galaxy Collapse — 50 random bodies under gravity; bodies merge on contact, growing in mass and radius until a dominant body with moons remains. Integrated with symplectic velocity Verlet.

Library Modules

kernel

Performance substrate. Tier 1: raw BLAS/SIMD dot/axpy. Tier 2: tag-dispatched array/reduce (seq_t/par_t). Tier 3: LinearOp, MGS orthogonalization, Arnoldi step — shared by GMRES, Lanczos, and expv.

core

Vector, Matrix, SparseMatrix, BandedMatrix. High-level backend dispatch (seq, SIMD, BLAS, OpenMP, CUDA) for matmul and matvec.

fields

Grid2D, Grid3D — geometry only. ScalarField2D/3D, VectorField3D — field storage over a grid.

solve

num::solve(problem, algorithm) dispatcher. ODEProblem, MCMCProblem, algorithm tags (RK45, BackwardEuler, Metropolis…).

factorization

LU, QR, Thomas tridiagonal. Direct solvers for dense and structured systems.

solvers

Conjugate gradient, matrix-free CG, restarted GMRES, Jacobi, Gauss-Seidel. LinearSolver universal callable.

eigen

Power iteration, inverse iteration, Rayleigh quotient, Lanczos, dense Jacobi sweeps.

svd

Full SVD and randomized truncated SVD for large matrices.

spectral

Radix-2 FFT/IFFT, real FFT, precomputed plans. Optional FFTW3 backend.

analysis

Quadrature (trapz, Simpson, Gauss-Legendre, Romberg) and root-finding (bisection, Newton, Brent).

markov

Metropolis sweep, precomputed-probability variant, umbrella sampling. Boltzmann acceptance table for discrete ΔE.

stats

Online Welford mean/variance, histogram with WHAM reweighting, autocorrelation time.

meshfree

CellList2D/3D, Verlet list, SPHKernel<Dim> (2D/3D). Particle-based spatial structures for Lagrangian methods.

pde

Stencils (2D/3D Laplacians, periodic/Dirichlet), backward-Euler matrix builder, CrankNicolsonADI, explicit diffusion steps.

ode

Euler, RK4, adaptive RK45. Velocity Verlet and Yoshida4 symplectic integrators. Implicit ode::advance for backward Euler steps.

quantum

Statevector simulation, quantum circuit builder, EDMD for unitary evolution.

physics apps

SPH 2D/3D, Navier-Stokes, TDSE, EM fields, Ising MC, N-body gravity. Interactive simulations using the per-step low-level API.

Implementation Notes

cache blocking

L1/L2 tile sizes, blocked matmul, 6× speedup over naive on large matrices without any parallelism.

register blocking

Micro-kernel design, register reuse, and the limits of scalar code before SIMD is required.

SIMD vectorization

Explicit AVX-256 and ARM NEON intrinsics applied to the matmul micro-kernel; additional 2× over register blocking.

NS performance

Roofline analysis of the Navier-Stokes projection solver: advect, pressure CG, project.

stencil / HOF

Higher-order grid functions: 2D/3D Laplacians, fiber sweeps for ADI splitting, gradient, divergence, curl on Grid3D.

3D fields & solvers

ScalarField3D, VectorField3D, FieldSolver (Poisson/gradient/curl), MagneticSolver (vector Poisson → B = ∇×A).

SPH kernels

SPHKernel<Dim> unified 2D/3D template: cubic spline density, Morris Laplacian, spiky pressure gradient.

PBC lattice

PBCLattice2D: precomputed periodic-boundary neighbor arrays for 2D square lattices; eliminates modulo in hot paths.

connected components

Template BFS labelling with pre-allocated flat queue. ClusterResult tracks largest component for Ising nucleation.

Boltzmann table

boltzmann_accept and make_boltzmann_table: precompute min(1,exp(-βΔE)) for discrete energy sets.

FFT

Cooley-Tukey radix-2, twiddle factors, real FFT, FFTW3 backend, and plan reuse.

PDE module

CrankNicolsonADI Strang splitting, explicit diffusion steps, 3D field solvers, stencil operators.

For algorithm theory (LU/QR, CG, GMRES, Lanczos, eigenvalue methods, quadrature, ODE integrators) see the companion notes: Scientific Computing ↗

Benchmarks

Generated by cmake --build build --target report. Build environment, test results, and throughput tables per module.

View Report →