
Johnson Curves

This tutorial was written by, and is reproduced with permission from, Peter Ponzo

I’ve never been enthusiastic about the common assumptions that stock returns are distributed normally or lognormally or … whatever. For example, the normal and lognormal distributions look like Figure 1a. The normal density distribution is described by:

 

[1]   f(x) = \frac{1}{s\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-m}{s}\right)^2}

 

and the lognormal by:

 

[2]   f(x) = \frac{1}{x\,S\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{\log(x/G)}{S}\right)^2}
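A quick numerical check of [1] and [2] (a minimal Python sketch; the parameter values m, s, G and S are made up for illustration):

```python
import numpy as np

def normal_density(x, m=0.0, s=1.0):
    # equation [1]: normal density with mean m, standard deviation s
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

def lognormal_density(x, G=1.0, S=0.5):
    # equation [2]: lognormal density; G is the geometric mean,
    # so log(x/G) = log(x) - m with m = log(G)
    return np.exp(-0.5 * (np.log(x / G) / S) ** 2) / (x * S * np.sqrt(2 * np.pi))

x = np.linspace(0.1, 3.0, 6)
print(normal_density(x))
print(lognormal_density(x))
```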

 

Figure 1b gives a fit to S&P 500 returns.

Looks pretty good to me!

Just wait. Note that, in the lognormal distribution,

log(x/G) = log(x) – log(G)
= log(x) – m

hence involves a variable of the form: u = (something – m)/S.

So does that normal guy, right?

Yes. Pay attention.

 

[Figure 1a: the normal and lognormal densities]

 

[Figure 1b: a lognormal fit to S&P 500 returns]

 

The cumulative distribution (being the area beneath the density function f(x)) is always an increasing curve which varies from 0 to 1 (or 0% to 100%). So here’s an interesting idea:

Let K(u) be an increasing function of u.
Define z = A + B K(u).
Pick A and B so z varies between 0 and 1.
Set u = (x – m)/S
and you’ve got yourself a cumulative probability (see the sketch below).
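Here’s a minimal sketch of that recipe in Python, assuming K = tanh (which runs from −1 to 1, so A = B = 0.5 squeezes z into the range 0 to 1); the function name is mine, not standard:

```python
import numpy as np

def homemade_cdf(x, m=0.0, S=1.0, A=0.5, B=0.5, K=np.tanh):
    # u = (x - m)/S puts x on a standard scale; K is any increasing
    # function; A and B are chosen so z = A + B*K(u) runs from 0 to 1
    u = (x - m) / S
    return A + B * K(u)

x = np.linspace(-5, 5, 11)
print(homemade_cdf(x))  # climbs from about 0 to about 1
```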

Example?

Here are a few increasing functions for z:

z = tanh(u) … hyperbolic tangent

z = sinh(u) … hyperbolic sine

z = tan(u) … tangent

z = tanh⁻¹(u) … inverse hyperbolic tangent

z = sinh⁻¹(u) … inverse hyperbolic sine

z = tan⁻¹(u) … inverse tangent

z = log(u) … logarithm

z = log(u/(1−u)) … log-ratio

z = sinh⁻¹(u) … inverse hyperbolic sine

 

… and a few curves of the form

 

z = A + B K(u), where u = (x − m)/S
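A quick numerical check that each of the functions listed above really is increasing (a Python sketch; the test intervals are chosen to avoid trouble spots like u = ±1 for tanh⁻¹(u) and u ≤ 0 for log(u)):

```python
import numpy as np

# the increasing functions listed above, each with a safe test interval
functions = [
    ("tanh(u)",      np.tanh,                       np.linspace(-3, 3, 200)),
    ("sinh(u)",      np.sinh,                       np.linspace(-3, 3, 200)),
    ("tan(u)",       np.tan,                        np.linspace(-1.5, 1.5, 200)),
    ("arctanh(u)",   np.arctanh,                    np.linspace(-0.99, 0.99, 200)),
    ("arcsinh(u)",   np.arcsinh,                    np.linspace(-3, 3, 200)),
    ("arctan(u)",    np.arctan,                     np.linspace(-3, 3, 200)),
    ("log(u)",       np.log,                        np.linspace(0.01, 3, 200)),
    ("log(u/(1-u))", lambda u: np.log(u / (1 - u)), np.linspace(0.01, 0.99, 200)),
]

for name, f, u in functions:
    assert np.all(np.diff(f(u)) > 0), name  # strictly increasing on the grid
print("all increasing")
```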

 

That inverse hyperbolic sine … you’ve got it twice.

So I have. It’s a good one. Anyway, here are the curves:

 

[Figure: using the hyperbolic tangent]

[Figure: using the inverse hyperbolic sine]

[Figure: using the inverse tangent]

 

Are these useful?

I have no idea, but if you don’t like using a normal or lognormal, you might like to use your own invention. In general, you have a bunch of parameters to select: A, B, m and s. You pick these to fit historical data.

What’s this Johnson stuff?

N.L. Johnson, in 1949, generated a bunch of probability distributions with several parameters (like our A, B, m and s) and …

And you want to generate a few yourself, eh?

Well, I’d like to describe the Johnson curves, but first we note the following:

If z is a normally distributed random variable, then it’s described by the function e^(−z²/2) (apart from a normalizing constant).
If z = (x − m)/s, this describes a normal distribution for the random variable x, with mean m and standard deviation s.
If z = (log(x) − m)/s, this would give a lognormal distribution for x.
We note that (x − m)/s and (log(x) − m)/s are both increasing functions of x.
If we introduce any increasing function, we might expect to get some sort of distribution.

Johnson Distributions

We fiddle with a normal distribution to generate various other distributions

Fiddle?

Yes, we stretch and slide and generally distort a normal distribution, like so:

First, consider a normal distribution, N, described by:

N(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}

 

Then we introduce a magic function of our choosing (which we’ll call J):
z = A + B J(u) where u = (x-m)/s.

If we plot N(z) versus z, we’d get the familiar normal distribution, but if we plot N versus u we’d get something else … depending upon our choice of J. Now consider N versus x (where u is just the x-variable, shifted left or right and scaled).
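In code, that N-versus-u picture is just the composition below (a sketch; the A and B values are arbitrary choices to show the distortion):

```python
import numpy as np

def N(z):
    # standard normal density
    return np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)

def distorted(u, A, B, J):
    # N plotted against u, where z = A + B*J(u)
    return N(A + B * J(u))

u = np.linspace(-3, 3, 7)
print(N(u))                                   # J(u) = u: the plain bell curve
print(distorted(u, A=0.5, B=1.0, J=np.sinh))  # A shifts the peak, sinh reshapes the tails
```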

Examples?

Here are a few where we choose the increasing functions sinh(u), log(u/(1−u)) and log(u).

 

[Figure: sinh(u)]

[Figure: log(u/(1−u))]

[Figure: log(u)]

 

They’re increasing?

Yes. I showed them to you earlier. The Johnson-modified normal curves then look like this:

 

[Figure: J(u) = inverse hyperbolic sine]

[Figure: J(u) = log(u/(1−u)) … which requires m < x < m + s]

[Figure: J(u) = log(u) … which requires x > m]

 

Okay, guess what the charts would look like for J(u) = u?

Uh … I give up.

 

[Figure: J(u) = u … that’s your standard, garden variety normal distribution]

Note that, if

z = A + B J((x-m)/s) 

then, solving for x, we get:

x = m + s J⁻¹((z − A)/B)

where J⁻¹ is the inverse of the J-function. For example, if J(u) = sinh(u), then we’d get:

z = A + B sinh((x − m)/s) and
x = m + s sinh⁻¹((z − A)/B)
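A nice by-product (a sketch, not something the tutorial itself does): since z is a standard normal variable, you can generate x-samples by drawing z values and applying the inverse formula. The parameter values here are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, m, s = 0.5, 1.2, 0.01, 0.04   # invented, not fitted to anything

z = rng.standard_normal(100_000)     # z is standard normal
x = m + s * np.arcsinh((z - A) / B)  # inverts z = A + B*sinh((x - m)/s)

print(x.mean(), x.std())             # sample statistics of the transformed x
```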

The problem, of course, is to identify the parameters A, B, m and s so that, whatever curve we choose, it matches historical data as closely as possible. That usually means matching the first four moments.

Huh?

We note that, if z is a random variable (running, say, from -infinity to infinity) with density distribution f(z), then the first four moments are given by:

 

M_k = \int_{-\infty}^{\infty} z^k\, f(z)\, dz \quad \text{for } k = 1, 2, 3, 4
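For historical data, the integrals become sample averages; a minimal sketch (plain sample moments, nothing here is from the original tutorial):

```python
import numpy as np

def first_four_moments(z):
    # sample versions of M_k = integral of z^k f(z) dz, for k = 1..4
    return [float(np.mean(z ** k)) for k in (1, 2, 3, 4)]

z = np.random.default_rng(1).standard_normal(1_000_000)
print(first_four_moments(z))  # roughly [0, 1, 0, 3] for a standard normal
```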

 

Is that hard?

We’ll see, but in the meantime, here’s a fit to 30 years of S&P 500 monthly returns:

 

[Figure: a Johnson curve fit to 30 years of S&P 500 monthly returns]

 

What’s so good about that?

Compare to a normal distribution fit:

 

[Figure: a normal distribution fit to the same returns]

 

See? The Johnson curve captures the features of the data that the normal fit misses.

Complicated!