Johnson Curves

This tutorial was written by Peter Ponzo and is reproduced with his permission.

I’ve never been enthusiastic about the common assumptions that stock returns are distributed normally or lognormally or … whatever. For example, the normal and lognormal distributions look like Figure 1a. The normal density distribution (with mean m and standard deviation S) is described by:

$$f(x) = \frac{1}{S\sqrt{2\pi}}\, e^{-(x-m)^2/(2S^2)}$$




and the lognormal by:

$$f(x) = \frac{1}{x\,S\sqrt{2\pi}}\, e^{-(\log(x/G))^2/(2S^2)}$$




Figure 1b gives a fit to S&P 500 returns.

Looks pretty good to me!

Just wait. Note that, in the lognormal distribution, writing m = log(G),

log(x/G) = log(x) - log(G)
= log(x) - m

so it involves a variable of the form u = (something - m)/S.

So does that normal guy, right?

Yes. Pay attention.


Figure 1a


Figure 1b


The cumulative distribution (being the area beneath the density function f(x)), is always an increasing curve which varies from 0 to 1 (or 0% to 100%). So here’s an interesting idea:

Let K(u) be an increasing function of u.
Define z = A + B K(u).
Pick A and B so z varies between 0 and 1.
Set u = (x – m)/S
and you’ve got yourself a cumulative probability.
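
The recipe above can be sketched in a few lines of Python. This is only an illustration, assuming K(u) = tanh(u), for which A = B = 1/2 makes z run from 0 to 1. The function name `tanh_cdf` and the sample values of m and S are made up for the example.

```python
import math

def tanh_cdf(x, m=0.0, S=1.0):
    """Cumulative probability built from K(u) = tanh(u).

    tanh runs from -1 to 1, so A = B = 1/2 maps it onto 0..1.
    """
    u = (x - m) / S
    return 0.5 + 0.5 * math.tanh(u)

# Hypothetical location and scale, chosen only for illustration.
xs = [-3.0, -1.0, 0.0, 1.0, 3.0]
zs = [tanh_cdf(x, m=0.0, S=1.0) for x in xs]
for x, z in zip(xs, zs):
    print(f"x = {x:5.1f}   z = {z:.4f}")
```

With this particular choice the result happens to be the logistic distribution, since 1/2 + (1/2) tanh(u) = 1/(1 + e^(-2u)).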


Here are a few increasing functions for z:

[Figures: plots of several increasing functions, among them the hyperbolic tangent, hyperbolic sine, inverse hyperbolic tangent, inverse hyperbolic sine, inverse tangent and, once more, the inverse hyperbolic sine.]


… and a few curves of the form


[Formula image not recovered]


That inverse hyperbolic sine … you’ve got it twice.

So I have. It’s a good one. Anyway, here are the curves:


[Figure: curve using the hyperbolic tangent]

[Figure: curve using the inverse hyperbolic sine]

[Figure: curve using the inverse tangent]


Are these useful?

I have no idea, but if you don’t like using a normal or lognormal you might like to use your own invention. In general, you have a bunch of parameters to select (A, B, m and s, where s is the scale written S above), and you pick these to fit historical data.

What’s this Johnson stuff?

N.L. Johnson, in 1949, generated a bunch of probability distributions with several parameters (like our A, B, m and s) and …

And you want to generate a few yourself, eh?

Well, I’d like to describe the Johnson curves, but first we note the following:

If z is a normally distributed random variable, then it’s described by the function $e^{-z^2/2}$ (up to a constant factor).
If z = (x-m)/s, this describes a normal distribution for the random variable x, with mean m and standard deviation s.
If z = (log(x) - m)/s, this would give a lognormal distribution for x.
We note that (x-m)/s and (log(x) - m)/s are both increasing functions of x.
If we introduce any increasing function we might expect to get some sort of distribution.

Johnson Distributions

We fiddle with a normal distribution to generate various other distributions


Yes, we stretch and slide and generally distort a normal distribution, like so:

First, consider a normal distribution, N, described by:

$$N(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$


Then we introduce a magic function of our choosing (which we’ll call J):
z = A + B J(u) where u = (x-m)/s.

If we plot N(z) versus z, we’d get the familiar normal distribution, but if we plot N versus u we’d get something else … depending upon our choice of J. Now consider N versus x (where u is just the x-variable, moved left or right and scaled).
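
To turn N(z) into a proper probability density in x, the usual change-of-variables rule multiplies by dz/dx. That factor is not spelled out in the text, so treat the following as a sketch under the assumptions that J is differentiable and increasing and that B and s are positive:

```latex
f(x) \;=\; N(z)\,\frac{dz}{dx}
     \;=\; \frac{1}{\sqrt{2\pi}}\,e^{-z^{2}/2}\;\frac{B}{s}\,J'(u),
\qquad z = A + B\,J(u), \quad u = \frac{x-m}{s}.
```

As a check, taking J(u) = u with A = 0 and B = 1 gives the normal density with mean m and standard deviation s.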


Here’s a few where we choose the increasing functions $\sinh^{-1}(u)$ (the inverse hyperbolic sine), log(u/(1-u)) and log(u).


[Figures: plots of the three increasing functions]


They’re increasing?

Yes. I showed them to you earlier. The Johnson-modified normal curves then look like this:


[Figure: J(u) = inverse hyperbolic sine]

[Figure: J(u) = log(u/(1-u)) … which requires m < x < m + s]

[Figure: J(u) = log(u) … which requires x > m]


Okay, guess what the charts would look like for J(u) = u?

Uh … I give up.


[Figure: J(u) = u … that’s your standard, garden-variety normal distribution]
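
For readers who want to reproduce curves like these numerically, here is a rough Python sketch. It assumes the density is N(z) multiplied by the change-of-variables factor dz/dx = (B/s) J'(u), which the text does not state explicitly. The function names and parameter values are invented for the example and are not fitted to any data.

```python
import math

def normal_density(z):
    """Standard normal density N(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def johnson_density(x, A, B, m, s, J, dJ):
    """Density of x when z = A + B*J(u), u = (x - m)/s, is standard normal.

    dJ is the derivative of J; (B/s)*dJ(u) is the change-of-variables factor.
    """
    u = (x - m) / s
    z = A + B * J(u)
    return normal_density(z) * (B / s) * dJ(u)

def asinh_slope(u):
    """Derivative of the inverse hyperbolic sine."""
    return 1.0 / math.sqrt(1.0 + u * u)

def area(density, lo, hi, n=20000):
    """Midpoint-rule integral of density over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(density(lo + (i + 0.5) * h) for i in range(n))

# Illustrative parameter values (assumptions, not fitted to data).
A, B, m, s = 0.5, 1.5, 0.0, 1.0

def su(x):
    """Curve for J(u) = inverse hyperbolic sine."""
    return johnson_density(x, A, B, m, s, math.asinh, asinh_slope)

def plain(x):
    """Curve for J(u) = u with A = 0, B = 1: the ordinary normal."""
    return johnson_density(x, 0.0, 1.0, m, s, lambda u: u, lambda u: 1.0)

su_area = area(su, -200.0, 200.0)
print("area under the asinh curve:", round(su_area, 4))
print("J(u) = u at x = 0:", plain(0.0), "vs N(0):", normal_density(0.0))
```

The area check is there because a curve is only a probability density if it integrates to 1. The J(u) = u case should match the standard normal exactly.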

Note that, if

$z = A + B\,J((x-m)/s)$

then, solving for x, we get:

$x = m + s\,J^{-1}((z-A)/B)$

where $J^{-1}$ is the inverse of the J-function. For example, if J(u) = sinh(u), then we’d get:

$z = A + B\,\sinh((x-m)/s)$ and
$x = m + s\,\sinh^{-1}((z-A)/B)$
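
The inversion can be checked with a short Python sketch. It uses the J(u) = sinh(u) example, maps a few standard normal draws of z to x, and then maps them back. The parameter values and the helper names `z_from_x` and `x_from_z` are made up for illustration.

```python
import math
import random

# Illustrative parameter values (assumptions, not fitted to data).
A, B, m, s = 0.2, 1.3, 0.01, 0.04

def z_from_x(x):
    """Forward map z = A + B*sinh((x - m)/s)."""
    return A + B * math.sinh((x - m) / s)

def x_from_z(z):
    """Inverse map x = m + s*asinh((z - A)/B)."""
    return m + s * math.asinh((z - A) / B)

random.seed(0)  # fixed seed so the run is repeatable
zs = [random.gauss(0.0, 1.0) for _ in range(5)]
xs = [x_from_z(z) for z in zs]          # normal draws mapped to x
round_trip = [z_from_x(x) for x in xs]  # should reproduce the z values

for z, x, z_back in zip(zs, xs, round_trip):
    print(f"z = {z:+.4f}  ->  x = {x:+.5f}  ->  z = {z_back:+.4f}")
```

Feeding standard normal draws of z through `x_from_z` is also one simple way to simulate values of x from the chosen curve.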

The problem, of course, is to identify the parameters A, B, m and s so that whatever curve we choose, it matches historical data as closely as possible. That usually means matching the first four moments.


We note that, if z is a random variable (running, say, from -infinity to infinity) with density distribution f(z), then the first four moments are given by:


[Formulas: the four moment definitions]
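
On the data side, the matching targets are sample moments. One common reading of "the first four moments" is mean, variance, skewness and kurtosis. That reading is an assumption here, and the author may have had raw moments in mind. A minimal Python sketch under that assumption, with a hypothetical helper `four_moments` and made-up returns:

```python
import math

def four_moments(data):
    """Return mean, variance, skewness and kurtosis of a sample.

    Uses population (divide-by-n) formulas; skewness and kurtosis are the
    third and fourth central moments divided by sd**3 and sd**4.
    """
    n = len(data)
    mean = sum(data) / n

    def central(k):
        """k-th central moment of the sample."""
        return sum((x - mean) ** k for x in data) / n

    var = central(2)
    sd = math.sqrt(var)
    return mean, var, central(3) / sd ** 3, central(4) / sd ** 4

# Made-up monthly returns, for illustration only.
returns = [0.02, -0.01, 0.03, -0.04, 0.01, 0.00, 0.05, -0.02]
mean, var, skew, kurt = four_moments(returns)
print(mean, var, skew, kurt)
```

A normal distribution has skewness 0 and kurtosis 3 under this convention, so sample values far from those are a hint that a four-parameter curve may fit better.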


Is that hard?

We’ll see, but in the meantime, here’s a fit to 30 years of S&P 500 monthly returns:


[Figure: Johnson curve fit to S&P 500 monthly returns]


What’s so good about that?

Compare to a normal distribution fit:


[Figure: normal distribution fit]


See? The Johnson curve has them.