paulfharrison 2 hours ago

A further step is Langevin Dynamics, where the system has damped momentum, and the noise is inserted into the momentum. This can be used in molecular dynamics simulations, and it can also be used for Bayesian MCMC sampling.

Oddly, most mentions of Langevin Dynamics in relation to AI that I've seen omit the use of momentum, even though gradient descent with momentum is widely used in AI. To confuse matters further, "stochastic" is used to refer to approximating the gradient using a sub-sample of the data at each step. You can apply both forms of stochasticity at once if you want to!
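
As a rough illustration of the difference (a minimal sketch, not taken from any particular library), the code below samples a standard normal target, U(x) = x^2/2, once with overdamped Langevin (noise added to the position) and once with underdamped Langevin (noise added to a damped momentum). The function names, the step size, the friction `gamma`, and the simple Euler-type discretization are all my own assumptions; better discretization schemes exist.

```python
import numpy as np

def grad_U(x):
    # Gradient of the potential U(x) = x^2 / 2 (standard normal target).
    return x

def overdamped_langevin(n_steps, step=0.05, seed=0):
    # Euler-Maruyama for dx = -grad U(x) dt + sqrt(2) dW.
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_steps)
    x = 0.0
    xs = np.empty(n_steps)
    for i in range(n_steps):
        x += -grad_U(x) * step + np.sqrt(2.0 * step) * noise[i]
        xs[i] = x
    return xs

def underdamped_langevin(n_steps, step=0.05, gamma=1.0, seed=0):
    # Euler-type scheme for
    #   dx = v dt
    #   dv = (-grad U(x) - gamma * v) dt + sqrt(2 * gamma) dW
    # The noise enters only through the momentum v.
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_steps)
    x, v = 0.0, 0.0
    xs = np.empty(n_steps)
    for i in range(n_steps):
        v += (-grad_U(x) - gamma * v) * step + np.sqrt(2.0 * gamma * step) * noise[i]
        x += v * step
        xs[i] = x
    return xs

if __name__ == "__main__":
    # Both chains should have position variance close to 1.
    print(overdamped_langevin(200_000).var())
    print(underdamped_langevin(200_000).var())
```

In an MCMC setting `grad_U` would be the gradient of the negative log posterior, and it could itself be estimated from a sub-sample of the data, which is the second kind of "stochastic" mentioned above.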

  • zzazzdsa an hour ago

    The momentum analogue for Langevin is known as underdamped Langevin, which, if you optimize the discretization scheme hard enough, converges faster than ordinary Langevin. As for your question, your guess is as good as mine, but I would guess that the nonconvexity of AI applications causes problems. Sampling is a hard enough problem already in the log-concave setting…

markisus 5 hours ago

Here is a corresponding introduction I found very useful, for readers with advanced undergraduate / graduate level math knowledge.

https://almostsuremath.com/stochastic-calculus/

  • hrududuu 2 hours ago

    Great resource. This was my area of graduate study, and I would say this material is quite hard, in the beginner to advanced PhD range.

    And this inspiring textbook I think has high overlap with these topics: https://www.amazon.com/Stochastic-Integration-Differential-E...

    • 3abiton 14 minutes ago

      I was traumatised by a fluid dynamics course back in the day, before YouTube tutorials were a thing and we had to rely on a good teacher to explain some concepts.

    • markisus an hour ago

      Yes, by advanced undergraduate, I meant very advanced undergraduate. But when I was in undergrad I always heard about some students like this who were off in the graduate classes. And then in grad school, there was even a high school student in my Algebra course who managed to correct the professor on some technical issue of group theory. So I don't assume you have to be a PhD to work through this material.

Daniel_Van_Zant 7 hours ago

Is stochastic calculus something that requires a computer to simulate many possible unfolding of events, or is there a more elegant mathematical way to solve for some of the important final outputs and probability distributions if you know the distribution of dW? This is an awesome article. I've seen stochastic calculus before but this is the first time I really felt like I started to grok it.

  • sfpotter 4 hours ago

    In case the other responses to your question are a little difficult to parse, and to answer your question a little more directly:

    - Usually, you will only get analytic answers for simple questions about simple distributions.

    - For more complicated problems (either because the question is complicated, or the distribution is complicated, or both), you will need to use numerical methods.

    - This doesn't necessarily mean you'll need to do many simulations, as in a Monte Carlo method, although that can be a very reasonable (albeit expensive) approach.

    More direct questions about certain probabilities can be answered without using a Monte Carlo method. The Fokker-Planck equation is a partial differential equation which can be solved using a variety of non-Monte Carlo approaches. The quasipotential and committor functions are interesting objects which come up in the simulation of rare events that can also be computed "directly" (i.e., without using a Monte Carlo approach). The crux of the problem is that applying standard numerical methods to the computation of these objects faces the curse of dimensionality. Finding good ways to compute these things in the high-dimensional case (or even the infinite-dimensional case) is a very hot area of research in applied mathematics. Personally, I think unless you have a very clear physical application where the mathematics map cleanly onto what you're doing, all this stuff is probably a bit of a waste of time...

    • Daniel_Van_Zant an hour ago

      Thanks for the explanation, this was very helpful. You've given me a whole new list of stuff to Google. The quasipotential/committor functions especially seem quite interesting although I'm having a bit of trouble finding good resources on them.

  • kkylin 6 hours ago

    It depends a bit on exactly what you want to calculate, but in general, quantities like the probability density function of the solution of a stochastic differential equation (SDE) at time t satisfy a partial differential equation (PDE) that is first order in time and second order in space [0]. (This PDE is known to physicists as the Fokker-Planck equation and to mathematicians as the Kolmogorov forward equation.) Except in special examples, the PDE will not have exact analytical solutions, and a numerical solution is needed. Such a numerical solution will be very expensive in high dimensions, however, so in high-dimensional problems it is cheaper to solve the SDE and do Monte Carlo sampling, rather than try to solve the PDE.
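
    For concreteness, here is the one-dimensional form of that PDE, as a sketch in notation I am introducing for illustration (drift μ, diffusion coefficient σ, density p). For the SDE dX_t = μ(X_t, t) dt + σ(X_t, t) dW_t:

    ```latex
    \frac{\partial p}{\partial t}(x,t)
      = -\frac{\partial}{\partial x}\bigl[\mu(x,t)\,p(x,t)\bigr]
        + \frac{1}{2}\frac{\partial^2}{\partial x^2}\bigl[\sigma^2(x,t)\,p(x,t)\bigr]
    ```

    With μ = 0 and σ = 1 (plain Brownian motion) this reduces to the heat equation, which is one of the special cases with an exact solution.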

    Edit: sometimes people are interested in other types of questions, for example the solution when certain random events occur. Analogous comments apply. Also, while stochastic calculus is very useful for working with SDEs, if your interest is other types of Markov (or even non-Markov) processes you may need other tools.

    Edit again: as another commenter mentioned, in special cases the SDE itself may also have exact solutions, but in general not.

    [0] This statement is specific to stochastic differential equations, i.e., differential equations with (Gaussian) white noise forcing. For other types of stochastic processes, e.g., Markov jump processes, the evolution equations for distributions have a different form (but some general principles apply to both, e.g., forms of the Chapman-Kolmogorov equation, etc).

  • FabHK 6 hours ago

    Certain simple stochastic differential equations can be solved explicitly analytically (like some integrals and simple ordinary differential equations can be solved explicitly), for example geometric Brownian motion, the SDE behind the classic Black-Scholes model. More complicated ones typically can't be solved in that way.

    What one often wishes to have is the expectation of a function of a stochastic process at some point, and what can be shown is that this expectation obeys a certain (deterministic) partial differential equation. This then can be solved using numerical PDE solvers.
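
    In symbols (a sketch under simplifying assumptions of my own: one dimension, dX_t = μ(X_t) dt + σ(X_t) dW_t, no discounting), the function u(x, t) = E[g(X_T) | X_t = x] satisfies the backward equation

    ```latex
    \frac{\partial u}{\partial t}
      + \mu(x)\,\frac{\partial u}{\partial x}
      + \frac{1}{2}\sigma^2(x)\,\frac{\partial^2 u}{\partial x^2} = 0,
    \qquad u(x,T) = g(x)
    ```

    which is solved backward in time from the terminal condition g.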

    In higher dimensions, though, or if the process is highly path-dependent (not Markovian), one resorts to Monte Carlo simulation, which does indeed simulate "many possible unfolding of events".

  • LeonardoTolstoy 7 hours ago

    It has been a while since I studied along these lines (stochastic chemical reaction simulations in my case) but I think the answer is often yes, but not always (I don't think). A random walk for example will be a normal distribution (and you know the mean, and you know the variance is going to infinity), so I do think in that case you end up with an elegant analytical solution if I'm understanding correctly as the inputs can determine the function the variance follows through time.

    But often no, you need to run a stochastic algorithm (e.g. Gillespie's algorithm in the case of simple stochastic chemical kinetics) as there will be no analytical solution.

    Again it has been a while though.

    • yoyoma1234 7 hours ago

      For normal distributions I think you do: Black-Scholes is an analytical solution to option pricing. It's been a while since I studied stochastic calculus.

      I question why this is the second highest article on Hacker News currently. I can’t imagine many people reading this website are REALLY in this field or a related one; maybe it’s just signaling, like saying you have a copy of Knuth's books or that famous Lisp one.

      • PhilipRoman 7 hours ago

        This is one of those archetypal submissions on HN: mathematics (preferably pure, using the word "calculus" outside of integrals/derivatives gives additional points), moderately high number of upvotes, very few comments. Pretty much the opposite of political posts, where everyone can "contribute" to the discussion.

      • magicalhippo 6 hours ago

        I upvote so it sticks around longer, so it has a better chance of generating interesting comments.

        I also upvote because I find it interesting to learn about stuff I didn't know about. I might not understand it, but I do like the exposure regardless.

      • nh23423fefe 6 hours ago

        I upvote good things even if I don't read them, because I don't want to spend all my energy reacting to trash politics posts. Cut away bad, promote good.

  • anvuong 6 hours ago

    Depends on what you want to know. If you want to get some trajectories then simulation of the stochastic differential equation is required. But if you just want to know the statistics of the paths, then in many cases you can write and try to solve the Fokker-Planck equation, which is a partial differential equation, to get the path density.

janalsncm 4 hours ago

Here’s an example where I ran into this recently.

Let’s say we play a “game”. Draw a random number A between 0 and 1 (uniform distribution). Now draw a second number B from the same distribution. If A > B, draw B again (A remains). What is the average number of draws required? (In other words, what is the average “win streak” for A?)

The answer is infinity. The reason is, some portion of the time A will be extremely high and take millions of draws to beat.
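
A quick way to see this numerically (a sketch, with function names of my own choosing): simulate the game for a fixed A and compare the average number of draws of B with 1/(1 - A), the mean of the corresponding geometric distribution. Averaging over a random A then diverges because 1/(1 - A) is not integrable on [0, 1].

```python
import numpy as np

def draws_until_beaten(a, rng):
    # Count draws of B until the first B that is not below a.
    n = 1
    while rng.random() < a:
        n += 1
    return n

def mean_draws_fixed_a(a, n_games=20_000, seed=0):
    # Monte Carlo estimate of the expected number of draws for a fixed A = a.
    rng = np.random.default_rng(seed)
    return float(np.mean([draws_until_beaten(a, rng) for _ in range(n_games)]))

if __name__ == "__main__":
    for a in (0.5, 0.9):
        # Simulated mean next to the geometric-distribution mean 1 / (1 - a).
        print(a, mean_draws_fixed_a(a), 1.0 / (1.0 - a))
```

The closer the fixed A is to 1, the longer the simulation takes, which is the same effect that makes the overall average infinite.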

  • drdeca 3 hours ago

    Showing the calculation you described:

    If p is the value drawn for A, then each time B is drawn, the probability that B>A is (1-p). So the chance that B is drawn exactly n times, with the n-th draw being the first that is not less than A, is p^(n-1) (1-p) (a geometric distribution). The expected number of draws is then 1/(1-p). Then, E[draws] = E[E[draws|A=p]] = \int_0^1 E[draws|A=p] dp = \int_0^1 1/(1-p) dp, which diverges to infinity (as you said).

    (I wasn’t doubting you, I just wanted to see the calculation.)

  • RandomBK 3 hours ago

    The way the question was framed, it was ambiguous whether "draw again" only applied to B, or whether A would draw again as well. I'm assuming the 'infinity' answer applies only to the former case?

    • janalsncm 3 hours ago

      Sorry, we only draw B again.

  • zzazzdsa 4 hours ago

    Does this really require stochastic calculus to prove? This should just be a standard integration, based on the fact that the expected number of samples required for fixed A is 1/(1-A).

robwwilliams 6 hours ago

Question for HN readers: We have defined about 50 spots (loci) in the mouse genome that contain DNA differences that modulate mortality rates. Most of them have complex age-dependent “actuarial” effects. We would like to predict age at death.

Would stochastic calculus be a useful approach in actuarial prediction of life expectancies of mice?

(And this is why I am pleased to see this high on HN.)

  • whatshisface 5 hours ago

    Stochastic calculus is like ordinary calculus in that it is most useful when one time is like another except for a few variables that describe a state, and least useful when one time is unlike another.

    Because you have as many questions (loci) as you have segments that you can reasonably expect to divide time into (changing the time of death by 1/50th of a mouse lifespan would be impossible to detect unless I am wrong?), and because the time intervals are not that numerous, and also because you wouldn't really have a model for the interaction of the state variables and would be using model-free statistical methods, I think you would get all of the value there is to get out of noncontinuous methods.

  • etiam 5 hours ago

    I'm not prepared to say "no", and as has been noted already, it depends on the application, but from your description it seems to me more like a task for Bayesian statistics organized on graphs (the nodes & edges kind).

    • btown 4 hours ago

      And going beyond this: my layman's understanding of biology is that the way in which genes are expressed can be highly nonlinear and modulated by all sorts of different pathways. If you have some clarity on how these pathways work, probabilistic programming might be a helpful tool here in a Bayesian context.

      It's been a number of years since I've looked at these things, but https://www.theactuary.com/2024/04/04/bayesian-revolution and https://arxiv.org/abs/2310.14888 are recent articles that may be relevant.

  • nextos 4 hours ago

    I was coming here to say this is a survival analysis problem, and thus a different branch of probability and statistics. However, you can also frame it as a stochastic process if you have extra epigenetic data that is associated with those 50 DNA loci or some genes they regulate.

    For example, your DNA loci of interest could have a state (methylated or unmethylated). And you could come up with a stochastic process where death occurs when a function of methylation changes at those loci (e.g. a linear model) crosses a threshold (first passage in stochastic process jargon).

    Omer Karin & Uri Alon have published a similar concept to explain how the decreased capacity of immune cells to remove senescent cells leads to a Gompertz-like law of longevity, something that originates from actuarial studies! Their model is simpler as they deal with a univariate problem [1].

    [1] https://www.nature.com/articles/s41467-019-13192-4

  • bbminner 5 hours ago

    Just in case you missed it, https://en.m.wikipedia.org/wiki/Survival_analysis exists to answer specifically this question.

    In more practical terms, if I were to approach this problem, I'd discretize it in time and apply classical ml to predict "chance to die during month X assuming you survived that long" and fit it to data - that'd be much easier to spot errors and potential issues with your data.
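
    Written out (my notation, assuming time is split into months k = 1, 2, ...): if h_k is the predicted chance of dying during month k given survival to its start, then

    ```latex
    S(k) = \prod_{j=1}^{k} (1 - h_j),
    \qquad
    P(T = k) = h_k \prod_{j=1}^{k-1} (1 - h_j)
    ```

    where S(k) is the probability of surviving past month k and T is the month of death, so a model for h_k gives the full distribution of age at death.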

    I'd go for the stochastic calculus or actual survival analysis only if you wanted to prove/draw a connection between some pre-existing mathematical property such as memory-less-ness and a physical/biological property of a system such as behavior of certain proteins (that'd be insanely cool, but rather hard, esp if data is limited). In my (very vague) understanding, that's what finance papers that use stochastic analysis do - they make a mathematical assumption about some universal mathematical property of a system (if markets were always near optimal with probability of deviation decaying as XYZ, the world economy would react this way to these things), and then prove that it actually fits the data.

    Happy to chat more, sounds like a fun project :)

  • joe_the_user 5 hours ago

    (Just spitballing)

    I think stochastic calculus looks at a system whose output value is a smooth/real value. Basically, it is for modeling systems like random walks where there is a little bit of random up-and-down jumping in each interval. However, if you are basically looking at time versus dead-or-alive, your output is binary and time-of-death is really all the info you get and you wouldn't need/want a random walk model, just a more ordinary statistical model. Maybe if there was some other variable besides dead-or-alive you were measuring or aware of, a stochastic model could help then (which is a bit like saying "if we had bacon, we could have bacon-and-eggs, if we had eggs").

    Also, if what you're saying is you have 50*X bytes of information that all influence life expectancy, it sounds like a challenging problem. But also it's kind of tailor-made for neural networks; many discrete inputs versus a single smooth output. You might try a neural network and linear model and see how much better the neural network is - then you could determine if more complex-than-linear interactions were occurring.

  • seanhunter 5 hours ago

    Can’t speak about mice, but stochastic calculus is used in modelling for life insurance for humans I believe.

    eg https://www.soa.org/globalassets/assets/Files/static-pages/r...

eachro 4 hours ago

For those in quant finance, how much of this is useful in your day to day?

  • mamonster 4 hours ago

    Day to day not so much unless you are in structured products/exotics as a structurer, at which point yeah it's pretty important.

    That said, already at masters level internships you could get asked much harder questions than what this article touches on. I got asked to prove the Cameron-Martin theorem once, I found that to be extremely difficult in a job interview setting.

dmvdoug 6 hours ago

Can someone please help me parse this sentence?

> Brownian motion and Itô calculare a notable example of fairly high-level mathematics that are applied to model the real world

What is “Itô calculare” supposed to have been? I am stumped. “Its calculation”?

whatshisface 6 hours ago

Here's my understanding of Ito calculus if it helps anyone:

1. The only random process we understand initially is Brownian motion.

2. Luckily, we can change coordinates.

  • max_ 3 hours ago

    Thanks, could you expand more on 2?

    • hrududuu 2 hours ago

      Ito's formula/lemma is like the chain rule from calculus. It is a generalization, in that it uses a second order Taylor series expansion, whereas the chain rule only needs a first order expansion. Anyway, I think (2) is a reflection of this fact, and how the chain rule lets us compute dynamics of a derived process.
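
      To spell that out in the simplest case (a sketch, with symbols of my own choosing): for dX_t = μ dt + σ dW_t and a twice-differentiable f, keep the second-order Taylor term and use the rule (dW_t)^2 = dt:

      ```latex
      df(X_t) = f'(X_t)\,dX_t + \frac{1}{2} f''(X_t)\,(dX_t)^2
              = \Bigl(\mu f'(X_t) + \frac{1}{2}\sigma^2 f''(X_t)\Bigr)\,dt
                + \sigma f'(X_t)\,dW_t
      ```

      Setting σ = 0 removes the extra f'' term and gives back the ordinary chain rule.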

      I sort of disagree with (1), since Ito's lemma is most naturally applied to ~martingales, of which Brownian Motion is an important special case.

EGreg 6 hours ago

I remember studying stochastic calculus.

And I remember noting that “quadratic variation” was calculated slightly differently from how variance (and hence standard deviation) is calculated in regular statistics. Off by one or squared or whatever. I made a note to eventually investigate why. Probably due to some stochastic volatility.
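
In case it helps pin down the difference, here is a small sketch under my own assumptions (a standard Brownian path on [0, T] simulated on a uniform grid; variable names are illustrative). The realized quadratic variation is a plain sum of squared increments, with no mean subtracted and no division by the number of points, while the two variance estimates divide by n or by n - 1.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 100_000
dt = T / n

# Increments of a simulated standard Brownian motion on a uniform grid.
dW = rng.normal(0.0, np.sqrt(dt), size=n)

# Realized quadratic variation: sum of squared increments (close to T).
quadratic_variation = np.sum(dW ** 2)

# Variance of the increments: mean subtracted, divided by n or by n - 1.
population_style_var = np.var(dW, ddof=0)
sample_var = np.var(dW, ddof=1)

print(quadratic_variation, population_style_var, sample_var, dt)
```

The quadratic variation should come out near T, while both variance estimates should come out near dt, differing from each other only by the factor n / (n - 1).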

  • FabHK 6 hours ago

    There is the fact that the variance of the entire population is defined [0] as

      sum i=1..N (x_i - mu)^2 / N
    
    while, given a sample of n iid [1] samples from a distribution, the best [2] estimate of the distribution variance is

      sum i=1..n (x_i - a )^2 / (n-1)
    
    Note that we replaced the mean mu by the sample average a, [3] and divided by (n-1) instead of N.

    [0] with the mean mu := sum x_i / N being the actual mean of the population

    [1] independent and identically distributed

    [2] best in the sense of being unbiased. It's a tedious, but not very difficult calculation to confirm that the expectation of that second expression (with n-1) is the population variance.

    [3] with the sample average a := sum x_i / n being an estimate of the population mean

  • SeaGully 6 hours ago

    The other guy gives a solid explanation so don't use mine as a replacement or to assume the other is wrong.

    To me there are two ways to approach the problem I think you are thinking of (sample variance I think).

    (1) The sample variance depends on the sample mean which is sum(x_i) / n. Given the sample mean and the first n-1 of n samples, you would then know the final value (x_n = n * sample_mean - sum(x_i)_(n-1)) so at the very least n-1 could be understood as a "degrees of freedom". There are only n-1 degrees of freedom. Other higher sample moments can be roughly understood with the same degrees of freedom argument. This could be wrong though, it was just something I remember from somewhere.

    (2) The more mathematically inclined way is that biased_sample_variance = sum((x_i - sum(x_i) / n)^2) / n. The mean of the biased_sample_variance (across many iterations of a set of samples N), is not the population variance, but (n - 1) / n * population_variance (i.e. it is biased). So you multiply the biased_sample_variance by (n / (n - 1)) which gives the unbiased sample_variance equation: sum((x_i - sum(x_i) / n)^2) / (n - 1). The math is rather fun in my opinion, once you get into the swing of things.

    I sure do hope I understood your question correctly.

graycat 3 hours ago

Own favorite source on stochastic calculus:

     Eugene Wong,
     Stochastic Processes in Information and
     Dynamical Systems,
     McGraw-Hill,
     New York,
     1971.

tsunego 6 hours ago

still wild to me that diffusion models are fast becoming the secret sauce behind ai image generation, but their roots are buried deep in stochastic calculus

who knew brownian motion would eventually help create cat memes?

bowsamic 6 hours ago

I had to study quantum stochastic calculus for my PhD. Really crazy because you get totally different results for the same mathematical expression compared to normal calculus
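
A classical (non-quantum) illustration of this, as a sketch with variable names of my own: the integral of W dW over [0, T] for a simulated Brownian path. Evaluating the integrand at the left endpoint of each step (the Itô convention) gives roughly (W_T^2 - T)/2, while averaging the two endpoints (the Stratonovich convention) gives W_T^2/2, which is what ordinary calculus would predict.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 200_000

# Brownian increments and the path W[0] = 0, ..., W[n] = W_T.
dW = rng.normal(0.0, np.sqrt(T / n), size=n)
W = np.concatenate(([0.0], np.cumsum(dW)))

# Ito sum: integrand evaluated at the left endpoint of each step.
ito = np.sum(W[:-1] * dW)

# Stratonovich sum: integrand averaged over the two endpoints.
stratonovich = np.sum(0.5 * (W[:-1] + W[1:]) * dW)

print(ito, (W[-1] ** 2 - T) / 2)
print(stratonovich, W[-1] ** 2 / 2)
```

The two sums differ by about T/2 no matter how fine the grid is, so the same expression really does have two different values depending on the convention.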

  • ta8645 5 hours ago

    Doesn't this mean that at least one of the results is wrong?

    • antognini 5 hours ago

      No, I think one of the fundamental insights of stochastic calculus is that the addition of noise to a process changes the trajectory in a non-trivial way.

      In finance, for instance, it leads to the concept of a "volatility tax." Naively, you might think that adding noise to the process shouldn't change the expected return, it would just add some noise to the overall return. But in fact adding volatility to the process has the effect of reducing the expected compound (log) return compared to what you would have in the absence of volatility. (This is one of the applications of the result that the original article talks about in the Geometric Brownian Motion section.)
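
      As a sketch of where this comes from, assuming the standard geometric Brownian motion dS_t = μ S_t dt + σ S_t dW_t (the symbols are my choice): applying Itô's formula to log S_t gives

      ```latex
      d(\log S_t) = \Bigl(\mu - \tfrac{1}{2}\sigma^2\Bigr)\,dt + \sigma\,dW_t,
      \qquad
      S_t = S_0 \exp\Bigl(\bigl(\mu - \tfrac{1}{2}\sigma^2\bigr)t + \sigma W_t\Bigr)
      ```

      so the expected log-return per unit time is μ - σ²/2 rather than μ, even though E[S_t] = S_0 e^{μt}.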

      • crdrost 3 hours ago

        Just to add to this, the reason that the things are different is, stochastics as a subject is trying to do calculus in the presence of noise, and what noise does is, it makes your function nondifferentiable. You would think that you cannot do calculus without smooth curves! But you can; we just have to modify the chain rule and define exactly what we mean by integration etc.

        So the idea is: smooth curves do X, but non-smooth noisy curves do Y(χ), where χ in some sense is the noise input into the system, and they aren't contradictory because Y(0) = X. (At least usually... I think chaos theory has some counterexamples where, like, the time t that you can predict a system’s results for is t=∞ in the presence of exactly 0 noise, but in the limit of nonzero noise going to zero it's some finite t=T.)

    • bowsamic 5 hours ago

      Kinda. The differential operator in quantum Ito calculus can be applied to mathematical objects that the normal differentials aren’t properly defined on, such as stochastic variables.

ForceBru 6 hours ago

Seems like a great article. Having some prior experience with stochastic calculus, I think I understand almost everything here. Any other good introductory materials?

  • seanhunter 5 hours ago

    I’ve been planning to study this in a bit although I have some background to cover first so haven’t got on to it. From what I’ve found, the youtube channel “Mathematical Toolbox” has some videos which are quite introductory but seem good. Some people also recommend the book “An Informal Introduction to Stochastic Calculus with Applications” by Calin as a good place to start. Then Klebaner “Introduction to Stochastic Calculus with Applications” and also Evans “An Introduction to Stochastic Differential Equations” are apparently very good but harder and more formal texts, but you need some analysis and measure theoretic probability background first. The Evans is the same Evans who wrote the definitive book about PDEs fwiw. Klebaner and Evans are apparently a lot harder than Calin though even though they are all called introductions.