Brownian motion is almost surely continuous but nowhere differentiable. This simple fact creates a deep problem for anyone who wants to write differential equations driven by random noise. Classical Riemann–Stieltjes integration relies on the integrator having bounded variation, but Brownian paths have infinite variation on every interval. To make sense of an equation like dX_t = μ(X_t) dt + σ(X_t) dB_t, where B_t is Brownian motion, mathematicians had to invent entirely new ways of defining integrals and differentials. The result is not one monolithic theory but a family of frameworks, each built around a different compromise between analytic power, probabilistic structure, and geometric naturalness.
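The variation claim can be checked on a simulated path. The sketch below is illustrative only: the helper names `brownian_path` and `variations`, the seed, and the grid sizes are assumptions made for this example. It samples one discretized Brownian path on [0, 1] and measures its total variation and quadratic variation over successively finer partitions.

```python
import math
import random

def brownian_path(n_steps, horizon=1.0, seed=0):
    """Sample B at n_steps + 1 equally spaced times on [0, horizon]."""
    rng = random.Random(seed)
    sd = math.sqrt(horizon / n_steps)  # standard deviation of one increment
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path

def variations(path, stride):
    """Total and quadratic variation of the path sampled every `stride` points."""
    coarse = path[::stride]
    increments = [b - a for a, b in zip(coarse, coarse[1:])]
    total = sum(abs(dx) for dx in increments)
    quadratic = sum(dx * dx for dx in increments)
    return total, quadratic

fine_path = brownian_path(2 ** 16)
results = {}
for k in (6, 10, 14, 16):  # partitions with 2**k subintervals
    results[k] = variations(fine_path, 2 ** (16 - k))
    print(k, results[k])
```

On refinement the total variation keeps growing, roughly like the square root of the number of subintervals, while the quadratic variation settles near the length of the time interval. That is the numerical face of the obstruction to Riemann–Stieltjes integration.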
Kiyoshi Itô’s breakthrough in the 1940s was to define the integral ∫ f(B_s) dB_s as a limit of non-anticipating Riemann sums. Each term evaluates the integrand at the left endpoint of its subinterval and multiplies it by the Brownian increment over that subinterval, so the integrand never uses information from the future. This choice preserves a crucial probabilistic property: if the integrand is adapted (depends only on information available up to the current time) and square-integrable, the integral is a martingale with expectation zero. Under weaker integrability conditions it is still a local martingale. This martingale property made Itô calculus the natural language for stochastic differential equations (SDEs) in finance, filtering, and physics.
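The effect of the evaluation point shows up in a small Monte Carlo experiment. This is a sketch under stated assumptions: the integrand is B itself, and the function name `endpoint_sums`, the seed, and the sample sizes are choices made here for illustration. It compares left-endpoint sums with right-endpoint sums for ∫ B dB on [0, 1].

```python
import math
import random

def endpoint_sums(n_steps, rng, horizon=1.0):
    """Left- and right-endpoint Riemann sums for the integral of B against B."""
    sd = math.sqrt(horizon / n_steps)
    b = 0.0
    left = 0.0
    right = 0.0
    for _ in range(n_steps):
        db = rng.gauss(0.0, sd)
        left += b * db           # integrand at the left endpoint (non-anticipating)
        right += (b + db) * db   # integrand at the right endpoint (uses the future)
        b += db
    return left, right

rng = random.Random(1)
n_paths = 2000
samples = [endpoint_sums(200, rng) for _ in range(n_paths)]
mean_left = sum(s[0] for s in samples) / n_paths
mean_right = sum(s[1] for s in samples) / n_paths
print(mean_left, mean_right)
```

The left-endpoint average stays near zero, as the martingale property requires. The right-endpoint average drifts toward the length of the interval, because each term picks up the squared increment.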
The cost of the left-endpoint choice is that the chain rule of ordinary calculus no longer holds. Instead, the Itô formula includes an extra second-order term: for a twice continuously differentiable function f, df(B_t) = f'(B_t) dB_t + (1/2) f''(B_t) dt. The (1/2) f''(B_t) dt term, often called the Itô correction, reflects the quadratic variation of Brownian motion, which accumulates at rate dt. This modified calculus is powerful but can be cumbersome in geometric settings where coordinate changes are frequent.
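As a worked instance of the formula, take f(x) = x^2 (a choice made here purely for illustration). The correction term turns into a deterministic drift, and the integral of B against itself can be read off directly:

```latex
\begin{aligned}
d\bigl(B_t^2\bigr) &= 2 B_t \, dB_t + \tfrac{1}{2}\cdot 2 \, dt = 2 B_t \, dB_t + dt, \\
\int_0^T B_s \, dB_s &= \tfrac{1}{2}\bigl(B_T^2 - T\bigr), \\
\mathbb{E}\Bigl[\int_0^T B_s \, dB_s\Bigr] &= \tfrac{1}{2}\bigl(\mathbb{E}[B_T^2] - T\bigr) = 0 .
\end{aligned}
```

The ordinary chain rule would have given B_T^2 / 2. The missing T/2 is exactly the accumulated Itô correction, and it is what keeps the expectation at zero.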
Ruslan Stratonovich proposed an alternative integral in the 1960s that evaluates the integrand at the midpoint of each subinterval. The Stratonovich integral obeys the ordinary chain rule: df(B_t) = f'(B_t) ∘ dB_t, with no correction term. This makes Stratonovich calculus coordinate-invariant, a decisive advantage for stochastic analysis on manifolds and for problems in geometric mechanics.
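The chain-rule behaviour can be tested on a single simulated path. The sketch below takes f(x) = x^3, so f'(x) = 3x^2 and f''(x) = 6x. It approximates the midpoint rule by averaging the integrand over the two endpoints of each subinterval, a common discretization that is assumed here to stand in for the midpoint evaluation. The variable names, seed, and step count are illustrative.

```python
import math
import random

rng = random.Random(2)
n_steps = 20000
dt = 1.0 / n_steps

b = 0.0
ito_sum = 0.0        # left-endpoint sums of f'(B) dB
strat_sum = 0.0      # endpoint-averaged sums of f'(B) dB
time_integral = 0.0  # left Riemann sum for the time integral of B
for _ in range(n_steps):
    db = rng.gauss(0.0, math.sqrt(dt))
    b_next = b + db
    ito_sum += 3.0 * b * b * db
    strat_sum += 0.5 * (3.0 * b * b + 3.0 * b_next * b_next) * db
    time_integral += b * dt
    b = b_next

f_end = b ** 3  # f(B_T) - f(B_0), since B_0 = 0
print(strat_sum - f_end, ito_sum + 3.0 * time_integral - f_end)
```

The averaged sums reproduce f(B_T) on their own, as the ordinary chain rule predicts. The left-endpoint sums match only after adding (1/2) ∫ f''(B_s) ds, which here equals 3 ∫ B_s ds.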
The two calculi are not in conflict. They are related by a simple conversion formula: the Stratonovich integral equals the Itô integral plus half the quadratic covariation of the integrand and the integrator. An SDE written in one form can be rewritten in the other by adjusting the drift coefficient. The choice between them is a matter of convenience: Itô calculus preserves martingales and is standard in finance; Stratonovich calculus preserves the chain rule and is standard in geometry and physics. Both frameworks remain active, and practitioners routinely translate between them.
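For a scalar equation the drift adjustment is explicit: the Stratonovich equation dX = a(X) dt + σ(X) ∘ dB corresponds to the Itô equation dX = (a(X) + (1/2) σ(X) σ'(X)) dt + σ(X) dB. The sketch below tries this on the example dX = σ X ∘ dB with X_0 = 1, whose solution is exp(σ B_T) by the ordinary chain rule. The example, the helper `euler_maruyama`, and all parameter values are assumptions chosen for illustration.

```python
import math
import random

def euler_maruyama(drift, diffusion, x0, increments, dt):
    """Euler-Maruyama scheme, which discretizes the Ito form of an SDE."""
    x = x0
    for db in increments:
        x = x + drift(x) * dt + diffusion(x) * db
    return x

sigma = 0.5
n_steps = 20000
dt = 1.0 / n_steps
rng = random.Random(3)
increments = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]
b_end = sum(increments)

# Closed-form Stratonovich solution on this Brownian path.
exact_strat = math.exp(sigma * b_end)

# Ito form with the converted drift (1/2) * sigma * x * sigma.
converted = euler_maruyama(lambda x: 0.5 * sigma ** 2 * x,
                           lambda x: sigma * x, 1.0, increments, dt)

# Same coefficients read naively as an Ito equation, with no drift adjustment.
unconverted = euler_maruyama(lambda x: 0.0,
                             lambda x: sigma * x, 1.0, increments, dt)
print(converted / exact_strat, unconverted / exact_strat)
```

With the adjusted drift the Itô scheme tracks the Stratonovich solution. Without it the result is off by a factor close to exp(-σ²T/2), which shows that the two readings of the same coefficients describe different processes.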
While Itô and Stratonovich focused on Brownian motion, martingale theory, developed by Joseph Doob and others in the 1950s, provided a far more general class of integrators. A martingale is a stochastic process whose conditional expectation at any future time, given the information available now, equals its current value. The Doob–Meyer decomposition, which splits a submartingale into a martingale and a predictable increasing process, was the key step in extending Itô's construction to general martingale integrators, and from there to semimartingales (sums of a local martingale and a process of bounded variation). This absorbed Itô calculus into a broader theory: the Itô integral could now be defined against any semimartingale, not just Brownian motion.
Martingale theory also introduced powerful tools such as optional stopping, the martingale representation theorem, and inequalities like Doob’s maximal inequality. These became the backbone of stochastic analysis, providing the probabilistic infrastructure that later frameworks would rely on. The semimartingale framework remains the most widely used setting for stochastic integration today.
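Optional stopping can be seen at work in a discrete toy case. A symmetric random walk started at 0 is a martingale, so stopping it when it first reaches -a or +b leaves its expectation at 0. Solving p·b - (1 - p)·a = 0 gives the probability p = a / (a + b) of reaching +b first. The sketch below checks this by simulation. The function name `hits_upper_first`, the barriers, and the sample size are illustrative assumptions.

```python
import random

def hits_upper_first(lower, upper, rng):
    """Run a symmetric +/-1 walk from 0 until it hits -lower or +upper."""
    position = 0
    while -lower < position < upper:
        position += 1 if rng.random() < 0.5 else -1
    return position == upper

rng = random.Random(4)
lower, upper, n_paths = 3, 7, 20000
hit_frequency = sum(hits_upper_first(lower, upper, rng)
                    for _ in range(n_paths)) / n_paths
predicted = lower / (lower + upper)  # value implied by optional stopping
print(hit_frequency, predicted)
```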
By the 1970s, the need for a rigorous measure-theoretic foundation for stochastic integration had become pressing. The general theory of processes, developed by Claude Dellacherie, Paul-André Meyer, and others, systematized the concepts of stopping times, filtrations, and predictable and optional sigma-algebras. It provided the precise definitions needed to know which processes are integrable and when an integral is well-defined.
This framework did not compete with Itô calculus or martingale theory; it provided the infrastructure that made them fully rigorous. The predictable sigma-algebra, generated by the adapted left-continuous processes, is the natural domain for integrands in the stochastic integral, ensuring that the integrand cannot anticipate future randomness. The optional sigma-algebra is generated by the adapted processes whose paths are right-continuous with left limits. The general theory also gave a clean treatment of local martingales and semimartingales, clarifying the conditions under which the Doob–Meyer decomposition exists. Today, every serious treatment of stochastic integration builds on this measure-theoretic foundation.
Paul Malliavin introduced a calculus of variations on Wiener space in the 1970s, opening a new direction. While Itô calculus works only with adapted integrands (those that do not peek into the future), Malliavin calculus defines derivatives of functionals of the whole Brownian path, such as the running maximum of a path or the solution of an SDE at a fixed time. These derivatives are random objects that measure how a functional changes when the underlying path is perturbed.
The key tool is an integration-by-parts formula on path space, which allows one to prove the existence of smooth densities for solutions of SDEs under the Hörmander condition. This result had previously been accessible only through analytic (PDE) methods, and Malliavin's approach gave it a probabilistic proof. Malliavin calculus also found applications in mathematical finance for computing Greeks (sensitivities of option prices) and in stochastic analysis for studying the regularity of laws. It complements martingale theory by handling non-adapted functionals and by providing analytic tools that go beyond the semimartingale framework.
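In its simplest form, the integration-by-parts formula can be written out for functionals of a single Wiener integral. The notation below (D for the Malliavin derivative, δ for its adjoint, the divergence or Skorokhod integral, and h, u for deterministic square-integrable functions) is the conventional one and is introduced here as an assumption, since the text above does not fix symbols:

```latex
\begin{aligned}
F &= f\bigl(B(h)\bigr), \quad B(h) = \int_0^T h(s)\, dB_s
  &&\Longrightarrow\quad D_t F = f'\bigl(B(h)\bigr)\, h(t), \\
\mathbb{E}\bigl[F\, \delta(u)\bigr] &= \mathbb{E}\Bigl[\int_0^T D_t F \; u_t \, dt\Bigr]
  && \text{(duality between } D \text{ and } \delta\text{)}, \\
\mathbb{E}\bigl[f(B_T)\, B_T\bigr] &= T\, \mathbb{E}\bigl[f'(B_T)\bigr]
  && \text{(the case } h = u = \mathbf{1}_{[0,T]}\text{)} .
\end{aligned}
```

The last line is the familiar Gaussian integration-by-parts identity. The sensitivity computations mentioned above rest on the same mechanism: a derivative is moved off the payoff function and onto a random weight.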
Terry Lyons’s rough path theory, developed in the 1990s, represents a more radical departure. Instead of relying on probabilistic structure (filtrations, martingales, expectations), rough path theory takes a purely pathwise approach. The idea is to lift an irregular path, such as a Brownian path, to a higher-dimensional object that includes not only the path itself but also its iterated integrals. Once the path is lifted, integration and differential equations become deterministic operations. With the path and its second-order iterated integrals, the theory covers paths of finite p-variation for p < 3, which includes Brownian paths (finite p-variation for every p > 2); rougher paths require iterated integrals of higher order.
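For a smooth or piecewise-linear path the lift can be computed exactly, which gives a concrete picture of what the extra data looks like. The sketch below builds the first two levels, the increment and the matrix of second-order iterated integrals, one segment at a time using Chen's relation for concatenated paths. The function names `lift_level2` and `levy_area` and the test path are illustrative assumptions, and this toy computation says nothing about how the lift of a genuinely rough path is constructed.

```python
def lift_level2(points):
    """Increment and second-order iterated integrals of a piecewise-linear path."""
    d = len(points[0])
    x1 = [0.0] * d                       # level 1: total increment so far
    x2 = [[0.0] * d for _ in range(d)]   # level 2: iterated integrals so far
    for p, q in zip(points, points[1:]):
        delta = [q[i] - p[i] for i in range(d)]
        # Chen's relation: new level 2 = old level 2 + (old level 1) x delta
        # + level 2 of the straight segment, which is (1/2) delta x delta.
        for i in range(d):
            for j in range(d):
                x2[i][j] += x1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            x1[i] += delta[i]
    return x1, x2

def levy_area(x2, i=0, j=1):
    """Antisymmetric part of level 2: the signed area swept by components i, j."""
    return 0.5 * (x2[i][j] - x2[j][i])

# Unit square traversed counterclockwise, returning to the start.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
increment, second_level = lift_level2(square)
area = levy_area(second_level)
print(increment, second_level, area)
```

The closed loop has zero increment, so level one cannot distinguish it from a constant path. The second level records the enclosed signed area, which is the kind of information that differential equations driven by the path are sensitive to.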
This framework extends stochastic analysis beyond semimartingales to signals like fractional Brownian motion with Hurst parameter H > 1/4. Except in the Brownian case H = 1/2, fractional Brownian motion is not a semimartingale and cannot be handled by Itô calculus. Rough path theory also provides robust numerical methods and a natural setting for studying the stability of SDEs under perturbations of the driving signal. It coexists with the probabilistic frameworks rather than replacing them: for semimartingales, the rough path lift can be constructed using the Itô or Stratonovich integral, and the resulting theory reproduces the classical results.
Today, all six frameworks remain active, and their division of labor reflects their different strengths. Itô calculus and the semimartingale framework dominate finance, filtering, and most applied SDE work. Stratonovich calculus is standard in stochastic geometry and physics. The general theory of processes provides the measure-theoretic foundation that underlies all of these. Malliavin calculus is the tool of choice for regularity of laws and sensitivity analysis. Rough path theory has become essential for problems involving non-semimartingale signals and for numerical analysis.
What do these frameworks agree on? They all accept that integration against irregular paths requires a departure from classical Riemann–Stieltjes theory, and they all rely on some form of quadratic variation or p-variation to control the roughness. They also agree that the Itô formula (or its Stratonovich variant) is the correct replacement for the chain rule.
The deepest disagreement concerns the role of probability. The Itô–martingale tradition treats probability as essential: filtrations, adaptedness, and expectations are built into the definition of the integral. Rough path theory, by contrast, treats probability as optional: the integral is defined pathwise, and probabilistic structure enters only when one wants to average over paths. This divide is not a weakness but a source of richness. Each framework illuminates a different facet of the same phenomenon, and the choice between them depends on the problem at hand.