
Archive for June, 2012

For anyone following this blog, you will probably be pleased that I have finished the chapters on Quant stuff for the CFA exam. I have mainly been banging out these posts as a way to solidify the content in my brain and to show the level of stuff I have to learn for the course.

It is worthwhile remembering here that the reason for taking the course is not totally about passing the exam. It is more to do with learning and understanding the content to make me a better investor and trader.

Now I have gone through the quant course, I am no longer mystified by the quant world and I have seen (almost daily) references to the material I am working on, in the news, in investment material I view and all over my broker account. I now understand so much more of what everyone is going on about.

It is like the cover has been taken off the investment world and now it is all falling into place and becoming commonplace.

The next section of the course is Economics and I am looking forward to this as I am mostly a global macro person and hopefully this section will enlighten me further in this respect.

 

Read Full Post »

The final chapter of the quant stuff in the CFA curriculum is finally here. This chapter is all about the method for testing hypotheses.

It builds from the previous chapter on sampling and uses most of the same formulas.

There is a rather laborious approach to creating and testing hypotheses defined in this chapter, and it is one of those places where defining stuff down to the nth degree is both tedious and unnecessary in real life.

Nevertheless, here are the seven steps in creating and proving (or disproving) a hypothesis with a degree of confidence.

Step one: Stating the null hypothesis.

The null hypothesis is the thing you are trying to prove or disprove. I find the word null here confusing and annoying. It is one of those things you just have to accept when working with someone else’s theorem and vocabulary.

The null hypothesis is denoted H0 and is a statement about a population parameter. It is assumed to be true unless the sample gives us enough evidence against it, so we either reject the hypothesis or fail to reject it.

Alternative Hypothesis

If we have a hypothesis, we must have an alternative opposite hypothesis, so that between the two hypotheses they cover all possible values of the variable.

This is generally defined as Ha. We don’t need to test this as it is just the opposite of the H0.

Step two: Identifying the test statistic and type of distribution

The test statistic is a value calculated from the sample, and it is the basis for deciding whether or not to reject the null hypothesis.

The test statistic is defined as

Test statistic = (Sample statistic – value given in H0) / Standard error of the sample statistic

(Recall standard error = $\sigma / \sqrt{n}$)

which gives us:

$\text{Test statistic} = \dfrac{\bar{X}_{RP} - \mu_0}{\sigma / \sqrt{n}}$

where $\bar{X}_{RP}$ = the sample statistic and $\mu_0$ = the value given in H0
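As a minimal sketch of the formula above, the calculation looks like this in Python. The function name and the numbers are made up for illustration; they are not from the curriculum.

```python
import math

def z_test_statistic(sample_mean, hypothesized_mean, sigma, n):
    """(sample statistic - value given in H0) / standard error."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Hypothetical inputs: sample mean 5.5, H0 value 5.0, sigma 2.0, n = 64
z = z_test_statistic(5.5, 5.0, 2.0, 64)
print(z)  # standard error is 2.0 / 8 = 0.25, so z = 0.5 / 0.25
```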

We now need to choose the distribution. The CFA curriculum gives four distributions, plus nonparametric tests for when none of them fit:

  • t-distribution (t-test)
    Used when the population variance is not known
  • standard normal or z-distribution (z-test)
    The preferred method of testing, when the variance is known
  • chi-squared distribution (χ² test)
    Used when testing a single variance. It is a family of asymmetric distributions bounded by zero on the left.
  • F-distribution (F-test)
    Used when comparing variances from two independent populations. The statistic is the ratio of the sample variances and is used for testing the equality (or inequality) of two variances. As with the chi-squared distribution, it is asymmetric.
  • Nonparametric tests
    These are tests used when the assumptions behind the tests above do not hold, for example when the underlying distribution is not normal

Step three: Specifying the significance level

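The test of the hypothesis comes down to comparing the test statistic calculated in step 2 against the significance level we are looking for in the result. The significance level is the probability of rejecting the null hypothesis when it is actually true, and is usually quoted as a percentage such as 10%, 5% or 1%. A 5% significance level corresponds to a 95% confidence level: 95% of the area under the distribution curve lies inside the rejection points.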

Step four: Stating the decision rule

This is fairly simple: if the test statistic falls outside the range of values allowed by the specified significance level (i.e. beyond the rejection points), we reject the null hypothesis. Otherwise we fail to reject it.
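A decision rule of this kind could be sketched as follows. This assumes a two-tailed z-test, which is only one of the cases the curriculum covers, and the function name is hypothetical.

```python
from statistics import NormalDist

def reject_null_two_tailed(test_statistic, significance):
    """Reject H0 when the statistic lies beyond either rejection point."""
    # The rejection point leaves significance / 2 in each tail of the standard normal
    rejection_point = NormalDist().inv_cdf(1 - significance / 2)
    return abs(test_statistic) > rejection_point

# The same test statistic can be rejected at 5% but not at 1%
print(reject_null_two_tailed(2.0, 0.05))
print(reject_null_two_tailed(2.0, 0.01))
```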

Step five: collecting the data and calculating the test statistic

This step is essentially making the actual calculations given above

Step six: making decision of whether to reject the null hypothesis or not

This is just determining if the decision rule is breached as defined in step 4. It is worth noting here that the curriculum defines two errors that can happen. These are called Type 1 and Type 2 errors. Type 1 is where we reject the null hypothesis when it is actually true, and Type 2 is where we do not reject the null hypothesis when it is actually false.

As we decrease the likelihood of rejecting a true hypothesis, we are making it more likely that we might not reject a false hypothesis. Thus the probability is increased of a type 2 error, when decreasing a type 1 error. The only way to reduce the probability of both types of error is to increase the sample size.

Step seven: Making the economic or investment decision

This step just highlights that other considerations outside of the scope of the maths here should be taken into consideration, such as prevailing economic market conditions etc.

Comment

In my opinion, that lot should be 3 steps at most and I question the need to define it all so thoroughly. It is mostly common sense.

Nonparametric test – Spearman Rank Correlation Coefficient

This nonparametric ranking test is used to test the correlation of two variables when the assumptions required for using the correlation co-efficient described previously are not met.

These assumptions are mainly that the two variables are completely described by the normal distribution.

The Spearman Rank method is as follows:

  1. Rank the observations of X from largest to smallest, assigning the numbers 1 to n. Where values are tied, take the mean of the ranks they would have occupied and assign that number to each of them. Do the same for Y.
  2. For each pair of observations, calculate the difference d between the rank of X and the rank of Y.
  3. Use the Spearman Rank formula to calculate the correlation:

$r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$

The value $r_s$ is then compared against the rejection point to decide whether or not to reject H0.
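The three steps above can be sketched in Python as below. The helper names are my own, and note that the simple formula is only approximate when there are tied values.

```python
def average_ranks(values):
    """Rank from largest (rank 1) to smallest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value is tied with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def spearman_rank_correlation(x, y):
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

print(average_ranks([10, 20, 20, 30]))
print(spearman_rank_correlation([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))
```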

 

 

 

Read Full Post »

This section is all about choosing and working with different types of samples. It begins with a simple definition of a sample and then goes on to talk about stratified random sampling to produce a sample set which is more representative than one taken purely by simple sampling.

Stratified random sampling first divides the population into groups (strata) according to certain characteristics of interest. A simple random sample is then taken from each group, and these are combined to produce a single sample to work with.

A key point in this chapter is the sampling error, which is the difference between a statistic measured in the sample and the value that would have been measured if the total population had been used.

Central Limit Theorem

The central limit theorem lets us make probability statements about the sample mean even when the underlying population is not normally distributed, provided the sample is large enough (the usual rule of thumb is n of 30 or more). It has three characteristics:

  1. The distribution of the sample mean will be approximately normal.
  2. The mean of that distribution will be equal to the population mean (μ).
  3. The variance of that distribution will be equal to the population variance (σ²) divided by the size of the sample (n).
    Variance of the sample mean = population variance / n
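A quick simulation can illustrate the last two points. This is only a sketch: the exponential population (mean 1, variance 1), the sample size of 30, the number of samples and the seed are all arbitrary choices of mine.

```python
import random
from statistics import mean, pvariance

def simulate_sample_means(sample_size, num_samples, seed=1):
    """Draw many samples from a skewed population and return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        draws = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(draws) / sample_size)
    return means

sample_means = simulate_sample_means(sample_size=30, num_samples=5000)
print(mean(sample_means))       # should be close to the population mean of 1
print(pvariance(sample_means))  # should be close to 1 / 30
```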

Standard Error

The standard error is the standard deviation of the sample statistic. For the sample mean, this is the square root of the variance of the sample mean given above:

Standard error = $\sigma / \sqrt{n}$

The standard error therefore becomes smaller as the sample size increases.

Estimating or Inferring values

When calculating values from a sample, there may be a sampling error due to the sample not perfectly representing the underlying population. Therefore, with sampling, a confidence level should be given which gives the likelihood of the true value lying within a stated range of the estimate. This percentage is based on the proportion of outcomes in a normal distribution that fall within that distance from the mean.

The properties of the estimator are:

  • Unbiased – The expected value of the estimator’s sampling distribution is equal to the underlying population parameter.
  • Efficiency – Among unbiased estimators, the one with the smallest variance is the most efficient.
  • Consistency – Larger sample sizes tend to give more accurate estimates, i.e. the estimate tends towards the population value.

Confidence intervals

The confidence interval is constructed by the following formula:

Confidence interval = Point estimate +/- reliability factor * standard error

Reliability factors

Known variance

The reliability factors are looked up in a z factor table. These are derived from a normal distribution curve.

[Table of z reliability factors for the common confidence levels. Source: Investopedia]
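The reliability factors for a known variance can also be computed rather than looked up. The sketch below uses Python's standard library normal distribution; the function names and the example inputs are my own.

```python
import math
from statistics import NormalDist

def z_reliability_factor(confidence):
    """Two-sided z-value that leaves (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def confidence_interval(point_estimate, sigma, n, confidence):
    """Point estimate +/- reliability factor * standard error."""
    margin = z_reliability_factor(confidence) * sigma / math.sqrt(n)
    return point_estimate - margin, point_estimate + margin

for level in (0.90, 0.95, 0.99):
    print(level, round(z_reliability_factor(level), 3))

# Hypothetical sample: mean 10, known sigma 2, n = 100
print(confidence_interval(10.0, 2.0, 100, 0.95))
```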

Unknown Variance

For distributions with unknown variance, the t-value can be used together with a concept called degrees of freedom. The degrees of freedom are n – 1, where n is the sample size.

The reliability factor is then looked up in the t-table using the degrees of freedom and the desired confidence level.

When to use z or t values

Distribution   Population Variance   Sample Size   Appropriate Statistic
Normal         Known                 Small         z
Normal         Known                 Large         z
Normal         Unknown               Small         t
Normal         Unknown               Large         t or z
Non-Normal     Known                 Small         unavailable
Non-Normal     Known                 Large         z
Non-Normal     Unknown               Small         unavailable
Non-Normal     Unknown               Large         t or z

Source: Investopedia
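The table boils down to three yes/no questions, which can be sketched as a small function. The function and parameter names are hypothetical, and what counts as a "large" sample is left to the caller.

```python
def appropriate_statistic(is_normal, variance_known, large_sample):
    """Return the statistic suggested by the table above."""
    if not is_normal and not large_sample:
        return "unavailable"  # small sample from a non-normal population
    if variance_known:
        return "z"
    return "t or z" if large_sample else "t"

print(appropriate_statistic(is_normal=True, variance_known=False, large_sample=False))
print(appropriate_statistic(is_normal=False, variance_known=True, large_sample=True))
```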

 

Read Full Post »

Everyone loves a good story. From being a baby to adults we are read, or read, stories of fiction, fact, news and gossip. Our fascination for a story is possibly more than meets the eye (or ear). This is something I have been thinking about over the last few weeks.

I am beginning to think the story is so important to us that we have assumed the story as part of our egos: the story has become the person and the person the story. The reality we believe we are in is nothing but a story. A damn intricate one at that. As Morpheus says in The Matrix:

“Have you ever had a dream, Neo, that you were so sure was real?

What if you were unable to wake from that dream? How would you know the difference between the dream world and the real world? …

You’ve been living in a dream world, Neo”.

My analysis started with reading The Black Swan by Taleb. He talks about what he calls the narrative fallacy. This is where we have a tendency to construct a story and then retrofit new facts into our story. He specifically talks about analysts retrofitting data into a normal distribution curve when in actuality it is not normally distributed, thus drawing incorrect conclusions. This is done because the tools available are biased towards the normal distribution, encouraging analysts to use them.

The concept of a story however, goes much deeper and is way more interesting than normal distribution.

The story of the person

When you meet a new person, how do you know you can trust them?

We find out about their story. Who else do they know? What is their history?

We construct a story around them and weave it into our own experiences to see if their story is compatible with our own. This is how we integrate new experiences and people into our lives. We take a part of the experience and weave it into our own story.

But what is the fabric of a story or even our own personal story?

An obvious example of a story is the CV (resume). A CV is a way of presenting a story. A good CV introduces the person and backs up this intro with a story with ‘the facts’ of where they have worked and what duties they performed. A CV provides the storyline and should invite the next employer to continue the story, to make the existing story part of their own.

Another example of a living story is Facebook. I am continually amazed at the amount of time people spend (waste) meticulously reconstructing their life story, presenting their life on a web page profile to others. The projection of the story into the ether, to intertwine with other people’s stories, is the heart of Facebook. This intertwining of stories is where the fun and social interaction occurs.

This blog is a story. It has been deliberately constructed as one, with ‘the journey’ category being the entire blog listed in one category so the blog can be read like a story. It is my financial journey story.

Imagine not having a story. Imagine your friends not having a story. How would you plan your next action? Why be friends with one person over another? What purpose would you find for anything?

The story of the person weaves these things into a construct enabling us to function, make meaningful relationships, gives a sense of the passing of time and gives us a sense of purpose.

I therefore define the individual’s story as:

The story of the person is one of picking out seemingly pertinent items from the perceived past and weaving them together to form a short narrative to sum up the person for a given purpose.

The story of the world

How can we understand the world we live in, when there is so much going on at any one moment? The sheer volume of data is so overwhelming; no one person could possibly interpret it all. We need to pull out certain facts and construct a narrative to make some kind of sense. We can then weave this world story into our own to see how we fit into the bigger picture.

So the story of the world is similar to the story of the person. It is a selection of facts, sewn together to construct a simplified version of the whole.

The story of Data

I have just finished the chapter on constructing samples of population data to represent the whole, on the CFA course I am working on. Collecting samples of data is used when analysing the entire population of data is not possible, perhaps due to the data not being available or being too costly or time consuming to complete.

There are scientific methods used to construct samples so that they closely resemble the entire population of data they are trying to represent. One such method is to take many different small samples and to find their mean average. This hopefully reduces the sample error of a single result.

The sampling process is analogous to constructing a story. The story is not the actual data but an approximation of it.

How do we know if the story is sound? We can test our story with data not used in our samples to see if it holds and we can find supporting reasons for the story to be true. We can submit our findings for peer review and compare our results with other people’s findings. This is forward testing.

This holds true for the person and the world story as much as the data. We discuss the world and other people, comparing our views, drawing conclusions and always weaving the findings back into our own story, our view of the world.

The elements of a story

It seems to me then, given the above, that there are two key factors that play the most impactful part of our stories. These are the items that we choose to put into our story and the way in which we weave them together.

The way in which stories are woven together I believe comes from our underlying psyche. A simple example is whether you are feeling in a good mood or feeling down. If you are feeling good today then the elements of the story are chosen in a positive fashion and the way in which they are woven together is to boost the positive feeling. It is an upwards cycle. The converse is also true; if you are in a negative state of mind the story elements chosen may have a tendency to be negative also. This is a fairly gross example of the weaving process, but hopefully illustrates the way in which our psyche chooses the way we weave.

The elements are constructed partly by the filters our psyche has used and partly by what information is available.

It is therefore beneficial to make sure that a good and varied source of information is available to use in life to make the best possible stories.

The source of information

The source of information is important. By mixing with negative people, the source of information for our own stories becomes negative and this is what we weave into our lives. Likewise, if we read negative news, then our world view will be negative.

Sites like the BBC news continuously pump out extremely negative stories of murder, rape, torture and other crimes that go to make up our view of the world. How many other things are going on right now, that are not reported, that are amazing? How many acts of kindness go unreported? (Check it out: go to the BBC news site and count the number of negative stories they report on their home page compared to positive events.)

If we read the news we should make sure we get it from as many diverse sources as possible.

Retrofitting the facts

The stories we construct into our lives are developed over years and years. We have a lot invested in our stories. Most often, we don’t want to have to rewrite our whole story. It takes time and effort and is much simpler to ignore small elements which may on closer inspection threaten the story. And so as Simon and Garfunkel write:

“All lies and jests
Still a man hears what he wants to hear
And disregards the rest”

So we retrofit the facts, choose what supports our stories and it is not until the story drifts so far from underlying facts we get a ‘correction’ back to reality.

And as Bill Hicks says:

“Shut him up! We have a lot invested in this ride. SHUT HIM UP! Look at my furrows of worry. Look at my big bank account and family. This just has to be real.”

Aligning our stories with the underlying population (reality)

In financial analysis, as with the stories of our lives, these corrections can be painful. This is the black swan in Taleb’s book. This is the relationship that went wrong and caused pain and heartache, this is the European (and American and Asian and Chinese) debt crisis, and this is all the places where the story didn’t match the underlying reality.

Why have stories?

Even if stories are the cause of corrections, there are obvious uses for the story. According to Taleb, stories are a form of compression: a way of storing large complex ideas or volumes in smaller, more digestible forms. I think this is true.

In NLP there is a saying ‘Don’t confuse the map for the terrain’. The digestible chunks we use to weave our world are all that our thinking brains can process. We must remember this is not the world. Only our story. Our story may contradict other people’s stories, but that is OK as they are just the stories, not the people themselves.

Protecting against the story

The story is a tool. A weapon. A useful construct. It is all these things but it is not real.

How do we penetrate the story and see the entire picture? Is this even possible?

For the past two years I have been practicing the art of Yoga meditation and intuition. By practicing methods of self-realisation whereby the stories fall away and a feeling of transcendental bliss is found, intuition springs forth and provides a new view of the world.

This view goes beyond the story. The view is not a mental one and has no need for compression as it does not need to fit into the mind to be comprehended by thought. The intuitive faculty of man is undeveloped and untapped. It is this we must turn to, to rise above the mundane.

Through meditative practice and the growth of intuition, we will find the stories drop away and the underlying reality will show through, it is this that guides us and focuses us and allows us to make the decisions we need to make and also gives us purpose beyond the world of the day to day.

The story has its place, but it is the domain of the mundane. To remove the story is my primary goal, both in my financial interpretation of the world and in my personal life.

Read Full Post »

Binomial Random Variables and Binomial Trees

Binomial variables are simply variables that can be true or false, i.e. binary. Their role in probability however is far more interesting. They can be used to determine simple outcomes such as success / failure, price up / price down, true / false.

When there are a large number of these outcomes, for example a stock price moving up or down over time, the binomial probability distribution formula can be used to work out the probability of a given number of true and false occurrences in a set of data.

If we have n occurrences and x is the number of times true occurred in those n, we can work out the probability using the following formula:

$P(x) = \dfrac{n!}{(n-x)!\,x!} \, p^x (1-p)^{n-x}$

where p is the probability of getting a true value.

For a fair coin flip, p is 50%.
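The binomial formula translates directly into Python. This is a minimal sketch with a function name of my own choosing; it relies on math.comb, which needs Python 3.8 or later.

```python
import math

def binomial_probability(x, n, p):
    """Probability of exactly x true outcomes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Probability of exactly 2 heads in 4 flips of a fair coin
print(binomial_probability(2, 4, 0.5))  # 6 * 0.25 * 0.25 = 0.375

# The probabilities over every possible x add up to 1
print(sum(binomial_probability(x, 4, 0.5) for x in range(5)))
```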

This can be mapped onto a binomial tree with time t at each node.

Binomial Tree, Source: Investopedia

Application

These trees are interesting to me, because they play a part in the pricing of American style options.

I am trying to build an option pricing application that compares prices using Black Scholes, Binomial distribution, trinomial distribution (up, down and stays still) and the actual current price of the option chain. This is outside of the curriculum for CFA level 1, but I see this being a key part of the trading engine for my own trades.

Continuous uniform distribution

The continuous uniform distribution is where the probability density is the same at every point on a line between a lower limit a and an upper limit b. It is graphed as a horizontal straight line of density against x, at a height of 1 / (b – a). The probability of x falling within a given range is simply the area under the graph over that range.

Multivariate distributions

When multiple normal distributions are combined, e.g. in a portfolio, we can calculate the total portfolio variance. The portfolio variance is built from a matrix containing the variance of each distribution plus all the pairwise covariance values, each weighted by the portfolio weights of the assets involved.

The number of distinct covariance values is n(n-1) / 2, where n is the number of normal distributions in the portfolio.

Example:

UK Pound returns in percent

                     UK Bonds   US Bonds   German Bonds
Expected Return      0.029      0.021      0.073
Standard Deviation   0.409      0.606      0.635

Correlation Matrix

               UK Bonds   US Bonds   German Bonds
UK Bonds       1          0.09       0.10
US Bonds                  1          0.70
German Bonds                         1

The covariance matrix of the above values is given by:

Covariance matrix

               UK Bonds   US Bonds   German Bonds
UK Bonds       0.167281   0.022307   0.025972
US Bonds       0.022307   0.367236   0.269367
German Bonds   0.025972   0.269367   0.403225

Key: the values on the diagonal are variances; the values off the diagonal are covariances.

Variance values are calculated by taking the square of the standard deviation.

Covariance is calculated by re-arranging the formula for Correlation and Covariance to

Covariance = Correlation * the product of the standard deviations

Example: US Bonds covariance with UK Bonds in the above table would be calculated by

0.09 * 0.409 * 0.606 = 0.022307
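The whole covariance matrix can be rebuilt the same way from the standard deviations and correlations in the tables above. The variable names in this sketch are my own.

```python
names = ["UK Bonds", "US Bonds", "German Bonds"]
std_dev = [0.409, 0.606, 0.635]
correlation = [
    [1.00, 0.09, 0.10],
    [0.09, 1.00, 0.70],
    [0.10, 0.70, 1.00],
]

# Covariance = correlation * product of the two standard deviations.
# On the diagonal the correlation is 1, which gives the variance.
covariance = [
    [correlation[i][j] * std_dev[i] * std_dev[j] for j in range(3)]
    for i in range(3)
]

for name, row in zip(names, covariance):
    print(name, [round(value, 6) for value in row])
```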

The log normal distribution

The log normal distribution is more relevant to options trading and is used because the possible gains and losses are not symmetrical. It is not possible for an asset price to fall below zero, while there is no upper limit on how far it can rise, so a log normal distribution is a more appropriate model.

Calculating volatility

For options, the key ingredient in calculating the price is the volatility. The volatility is the standard deviation of the continuously compounded returns of the underlying.

This is done by first calculating the return for each price interval, say a day. Each day is a separate time interval t, and the holding period return between the price at time t and the price at time t+1 is given by $R_{t,t+1} = S_{t+1} / S_t - 1$.

The continuously compounded return is then given by the equation:

$r_{t,t+1} = \ln(S_{t+1} / S_t) = \ln(1 + R_{t,t+1})$

This value would be calculated for every day in the population and then the standard deviation would be calculated on these values.

The next step is to multiply the result by the square root of 250 (the approximate number of trading days in a year) to annualize the result.

This is the volatility of the stock.
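The volatility steps above can be sketched as follows. The function name and the price series are made up, and I have assumed the sample standard deviation (n – 1 in the denominator), which the text does not specify.

```python
import math
from statistics import stdev

def annualized_volatility(prices, periods_per_year=250):
    """Standard deviation of continuously compounded returns, annualized."""
    log_returns = [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]
    # stdev is the sample standard deviation (an assumption, see above)
    return stdev(log_returns) * math.sqrt(periods_per_year)

# Hypothetical daily closing prices
closing_prices = [100.0, 101.5, 100.8, 102.3, 101.9, 103.0]
print(annualized_volatility(closing_prices))
```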

Value at Risk

The Value at Risk discussion in the CFA curriculum is very brief, and I did not find it very helpful or easy to understand. There is a much better description on Investopedia.

Essentially VAR estimates, for a given probability and time period, the $ or % loss that should only be exceeded that often (for example, the loss that should only be exceeded 5% of the time). In its simplest form it is based upon a normal distribution curve fitted to historical values.

Monte Carlo Simulation

Value at Risk can be calculated by a Monte Carlo simulation. The simulation is basically a computer generated aggregation of possible walkthroughs using different values derived from the probability distribution.

The number of simulations, time period to run the simulation over and different confidence levels can be varied to achieve a set of VAR estimations.
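A very rough sketch of a Monte Carlo VAR estimate is shown below. It assumes a single period, normally distributed returns, and made-up inputs for the portfolio value, mean and volatility; a real simulation would model the price path and the distribution far more carefully.

```python
import random

def monte_carlo_var(portfolio_value, mean_return, volatility,
                    confidence=0.95, simulations=10_000, seed=42):
    """Loss that is exceeded in roughly (1 - confidence) of the simulated runs."""
    rng = random.Random(seed)
    # Each run draws one period return and converts it into a loss (positive = money lost)
    losses = sorted(
        -portfolio_value * rng.gauss(mean_return, volatility)
        for _ in range(simulations)
    )
    index = min(int(confidence * simulations), simulations - 1)
    return losses[index]

# Hypothetical portfolio: 1,000,000 with zero mean return and 2% volatility per period
var_95 = monte_carlo_var(1_000_000, 0.0, 0.02)
print(round(var_95))
```

For these inputs the result should land near the analytic figure of about 1.645 * 0.02 * 1,000,000, and varying the number of simulations or the confidence level gives a feel for how stable the estimate is.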

I have to build one of these just for the hell of it. :)

Read Full Post »

The next chapter is a continuation of evaluation by working with probability to determine the likelihood of given outcomes. It covers the basics of probability mathematics.

Basics in probability

It introduces three different types of probability calculation:

  • Empirical probability – probability based on historical data
  • Subjective probability – where the analyst draws on their subjective opinion
  • A priori – where the probability is based upon logic or reasoning

Odds

If the probability is 20%, the odds for it happening are 1 to 4. The odds against it happening are 4 to 1. As Han Solo says in Star Wars: ‘Never tell me the odds’.
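The conversion from probability to odds can be sketched with exact fractions. The function names are hypothetical.

```python
from fractions import Fraction

def odds_for(probability):
    """Odds for an event: p to (1 - p), reduced to whole numbers."""
    ratio = probability / (1 - probability)
    return ratio.numerator, ratio.denominator

def odds_against(probability):
    """Odds against an event: (1 - p) to p."""
    a, b = odds_for(probability)
    return b, a

p = Fraction(20, 100)
print(odds_for(p))      # (1, 4), i.e. 1 to 4
print(odds_against(p))  # (4, 1), i.e. 4 to 1
```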

Joint Probability

Joint probability is the probability of two events both happening, written P(AB). If the two events are independent of each other, it is simply the product of their individual probabilities.

Conditional Probability

It introduces the concept of conditional probability where one event is conditional on another previous event.

This is given by the notation P(A | B) which means the probability of A given that B happens.

P(A | B) = P(AB) / P(B) where P(B) ≠ 0.

Where P(AB) is the probability of A and B occurring.

Multiplication rule

Re-arranging, we get the multiplication rule:

P(AB) = P(A | B) P(B)

Addition Rule

P(A or B) = P(A) + P(B) – P(AB)
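The conditional, multiplication and addition rules can be checked on a small example. The die and the two events below are my own illustration, not from the curriculum.

```python
from fractions import Fraction

outcomes = set(range(1, 7))                # one roll of a fair six-sided die
A = {n for n in outcomes if n % 2 == 0}    # event A: the roll is even
B = {n for n in outcomes if n > 3}         # event B: the roll is greater than 3

def prob(event):
    return Fraction(len(event), len(outcomes))

p_ab = prob(A & B)                         # joint probability P(AB)
p_a_given_b = p_ab / prob(B)               # conditional probability P(A | B)
p_a_or_b = prob(A) + prob(B) - p_ab        # addition rule

print(p_ab, p_a_given_b, p_a_or_b)
```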

Total probability rule

This is for calculating an event’s probability when you know the probability of that event conditional on a second event and conditional on the second event’s complement.

P(A) = P(A | S^C) * P(S^C) + P(A | S) * P(S)

where A is the event we want to know the probability of, S is the second event and S^C is the second event’s complement.

This is equivalent to a weighted average of the two conditional probabilities, weighted by the probabilities of the second event and its complement.
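A minimal sketch of the total probability rule, with made-up numbers (the function name is also my own):

```python
def total_probability(p_a_given_s, p_s, p_a_given_not_s):
    """P(A) as the weighted average of the two conditional probabilities."""
    p_not_s = 1 - p_s
    return p_a_given_s * p_s + p_a_given_not_s * p_not_s

# Hypothetical scenario: P(S) = 0.6, P(A | S) = 0.8, P(A | not S) = 0.3
p_a = total_probability(0.8, 0.6, 0.3)
print(p_a)  # 0.48 + 0.12, so roughly 0.6
```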

Advanced probability

The next set of definitions and equations form the ‘advanced’ part of the CFA quant curriculum at this stage

Covariance

The covariance of two variables i and j is denoted by Cov(Ri, Rj). The covariance is a measure of the relationship between two random variables and shows how they change together.

If they move together then the covariance will be a positive number. If they move in opposite directions (e.g. the large values of one correspond to small values in the other), the covariance is negative. If the covariance is zero then there is no linear relationship.

Cov(Ri, Rj) = E[(Ri – E(Ri))(Rj – E(Rj))]

The magnitude of the covariance is not that easy to interpret. The normalized version of the covariance, the correlation coefficient, however, shows by its magnitude the strength of the linear relation.

Correlation

The correlation of two variables is another way of showing the relationship between them, and is the covariance divided by the product of the two variables’ standard deviations:

Corr(Ri, Rj) = Cov(Ri, Rj) / (σ(Ri) * σ(Rj))

This equation is highly likely to feature in the exam. Given the correlation and the standard deviations, it is possible to work out the covariance.
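Covariance and correlation can be sketched for two series of observations as below. This treats every observation as equally likely (a population calculation), which is my assumption, and the data is made up.

```python
from statistics import mean, pstdev

def covariance(x, y):
    """Average product of the deviations of x and y from their means."""
    mean_x, mean_y = mean(x), mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)

def correlation(x, y):
    """Covariance divided by the product of the standard deviations."""
    return covariance(x, y) / (pstdev(x) * pstdev(y))

returns_i = [1.0, 2.0, 3.0]
returns_j = [2.0, 4.0, 6.0]
print(covariance(returns_i, returns_j))
print(correlation(returns_i, returns_j))
```

Rearranging the last function gives the covariance back as correlation times the product of the standard deviations.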

Bayes Formula

Bayes Formula is used for adjusting probabilities we already know in the light of new information, thus updating the probability to take the new information into account.

Updated probability = (Probability of new information given event / Unconditional probability of new information) * Prior probability of event

P(Event | Information) = (P(Information | Event) / P(Information)) * P(Event)
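Bayes Formula can be sketched with a small worked example. The probabilities are invented for illustration, and the unconditional probability of the information is built up with the total probability rule.

```python
def bayes_update(p_info_given_event, p_info, prior):
    """P(Event | Information) = (P(Information | Event) / P(Information)) * P(Event)."""
    return (p_info_given_event / p_info) * prior

# Hypothetical inputs
prior = 0.40                  # P(Event)
p_info_given_event = 0.75     # P(Information | Event)
p_info_given_no_event = 0.25  # P(Information | no Event)

# Unconditional probability of seeing the information at all
p_info = p_info_given_event * prior + p_info_given_no_event * (1 - prior)

posterior = bayes_update(p_info_given_event, p_info, prior)
print(posterior)  # 0.30 / 0.45, roughly 0.667
```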

Principles of counting

The principles of counting are shortcuts for finding the number of ways items can be chosen or arranged in a large set of data

Combination Notation (Binomial Formula)

Combination Notation is used when choosing r objects from a set of n objects, when the order in which the r objects are chosen does not matter

nCr = n! / ((n – r)! r!)

Permutation Notation

Permutation Notation is used when the order does matter.

nPr = n!/(n-r)!
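Both counting formulas are available in Python's standard library (math.comb and math.perm, Python 3.8 or later). The sketch below also spells them out with factorials to show they agree.

```python
import math

def combinations(n, r):
    """nCr: ways to choose r objects from n when order does not matter."""
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

def permutations(n, r):
    """nPr: ways to choose r objects from n when order matters."""
    return math.factorial(n) // math.factorial(n - r)

print(combinations(5, 2), math.comb(5, 2))  # 10 10
print(permutations(5, 2), math.perm(5, 2))  # 20 20
```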

Read Full Post »

The next section in the CFA curriculum is all about summarizing data through frequency distributions. This is fascinating stuff as I am also reading the Black Swan book by Nassim Taleb in which he slates this method of analysis for financial data.

Still, it is vital to understand this stuff, even if it is to get at the stuff which is missing in other analysts’ work.

There is also a great page on this on the Investopedia website.

The first categorization of data is into different types of measurement scales:

Nominal

Categorises the data but does not rank it, such as style of investment.

Ordinal

A scale that ranks the data, but where the differences between the ranks are not necessarily equal or meaningful

Interval

The intervals between the ranks are equal but the scale has no true zero point

Ratio

The scale is the same as interval but has a true zero point

—–

Frequency Distribution

The next section in this chapter is about establishing a definition of frequency distribution and defines a way of creating intervals as categories to divide up a large data set, and then counting the number of observations in each category.

The next step is to calculate the cumulative absolute and relative frequency in a grid.

Example (some rows deleted for brevity):

Quarterly return   Number of observations   Relative    Cumulative           Cumulative
interval           (absolute frequency)     frequency   absolute frequency   relative frequency
-15% to -10%       2                        5.0%        2                    5.0%
-10% to -5%        1                        2.5%        3                    7.5%
-5% to 0%          5                        12.5%       8                    20.0%
0% to +5%          17                       42.5%       25                   62.5%
+5% to +10%        10                       25.0%       35                   87.5%

Source: Investopedia
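The grid above can be reproduced from the counts alone. In this sketch the total of 40 observations is inferred from the relative frequencies (2 observations being 5.0%), since some rows were deleted; the function name is my own.

```python
from itertools import accumulate

def frequency_table(counts, total_observations):
    """Rows of (absolute, relative %, cumulative absolute, cumulative relative %)."""
    cumulative = list(accumulate(counts))
    return [
        (count, 100 * count / total_observations, running, 100 * running / total_observations)
        for count, running in zip(counts, cumulative)
    ]

counts = [2, 1, 5, 17, 10]  # the five intervals shown above
rows = frequency_table(counts, total_observations=40)
for row in rows:
    print(row)
```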

Averages

The next part is all about finding the central value and the curriculum gives the following methods to do this:

Arithmetic mean

The arithmetic mean is all the values added together and divided by the number of items. This has limitations if there are outliers or extreme values. The curriculum states that analysts sometimes ignore these extreme values by removing the highest and lowest values. This is crazy to me. This is exactly what Taleb talks about and how Black Swans can be created. It is that outlier that comes and bites you when you have long forgotten you removed it from the data.

Median

This is the middle value of the sorted data, or the arithmetic mean of the two middle values if there is an even number of items.

Mode

The Mode average is the value which occurs the most times.

Geometric mean

Good for time-based averages as it has a similar effect to compounding. This was described in the post on Discounted Cash Flow Applications.

Harmonic mean

The harmonic mean is used mostly in dollar cost averaging.

Harmonic mean is computed by the following steps:

1. Taking the reciprocal of each observation, or 1/X
2. Adding these terms together
3. Averaging the sum by dividing by n, or the total number of observations
4. Taking the reciprocal of this result
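The four harmonic mean steps can be sketched as below, alongside the standard library versions of the other averages. The data is made up for illustration.

```python
from statistics import harmonic_mean, mean, median, mode

def harmonic_mean_by_steps(values):
    reciprocals = [1 / x for x in values]   # step 1: reciprocal of each observation
    total = sum(reciprocals)                # step 2: add these terms together
    average = total / len(values)           # step 3: divide by n
    return 1 / average                      # step 4: reciprocal of the result

# Hypothetical purchase prices, as in dollar cost averaging
prices = [10, 20]
print(harmonic_mean_by_steps(prices), harmonic_mean(prices))

# The other averages on a small data set with one extreme value
data = [1, 2, 2, 3, 12]
print(mean(data), median(data), mode(data))  # 4 2 2
```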

Quartiles and Percentiles

These are the values below which a certain percentage of the observations lie. The location of a percentile can be calculated as:

Ly = (n + 1) * (y / 100)

where Ly is the location of the value in a sorted list, n is the number of items in the list and y is the percentile you are trying to locate.

If the value calculated by this equation is not an integer, the percentile is found by interpolating: take the decimal part of the location, multiply it by the difference between the two values either side of that position, and add the result to the lower of the two values.
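The location formula and the interpolation step can be sketched as follows. The function name is hypothetical, and clamping to the first or last value when the location falls off either end of the list is my own assumption.

```python
def percentile(sorted_values, y):
    """Value at the y-th percentile using location Ly = (n + 1) * (y / 100)."""
    n = len(sorted_values)
    location = (n + 1) * y / 100
    lower = int(location)            # position (1-based) of the value below
    fraction = location - lower      # decimal part of the location
    if lower < 1:
        return sorted_values[0]      # assumption: clamp to the smallest value
    if lower >= n:
        return sorted_values[-1]     # assumption: clamp to the largest value
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)

values = [10, 20, 30, 40, 50, 60, 70, 80, 90]
print(percentile(values, 25))  # location 2.5, halfway between 20 and 30
print(percentile(values, 50))  # location 5, the middle value
```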

Measures of dispersion

Range

The simplest measure of dispersion is the range, i.e. the difference between the smallest and largest value.

Mean Absolute Deviation

After this is the Mean Absolute Deviation (MAD).

This is calculated by subtracting the mean from each value, taking the absolute value of each difference, adding these together and dividing by the number of values:

MAD = Σ |xi – µ| / n

where n is the number of items, xi is the value of each item and µ is the mean.
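A short sketch of the range and the MAD on a made-up data set. The function names and data are illustrative assumptions.

```python
def value_range(values):
    """Difference between the largest and smallest value."""
    return max(values) - min(values)

def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

data = [2, 4, 6, 8, 10]                 # hypothetical observations, mean is 6
data_range = value_range(data)
mad = mean_absolute_deviation(data)

print(data_range, mad)
```

The absolute value matters: without it the positive and negative differences from the mean would cancel out and always sum to zero.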

Variance and Standard Deviation

Variance (σ²) is very similar except it uses the square of the difference between each value and the mean, which also eliminates the negative values. Variance is expressed in the original unit squared. The variance is really useful in as much as it is the standard deviation squared.

σ² = Σ (xi – µ)² / n

where σ² is the variance, xi is the value of each item, µ is the mean and n is the number of items.

The standard deviation is the square root of the variance.

I never really quite understood standard deviation before this and now I see it is obvious. It’s just a way of quantifying the deviation from the mean or central value. Or in other words, how far a value is likely to fall from the central value. It is a way of measuring risk.

(For the black swan readers, this is the crux of Taleb’s disgruntlement; this is great in mediocristan but doesn’t work in extremistan, i.e. in the stock market!) I’ll write about this when I have finished his book.

Sample Variance and Deviation

The sample variance and the corresponding sample standard deviation are used when you can’t use all the population data because it is too large or unavailable. Taking a smaller set of data, you use almost the same formula as the population variance, except that the sample mean takes the place of µ and the denominator is n – 1.

This correction is called Bessel’s Correction.
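A minimal sketch comparing the population formula (divide by n) with the sample formula (divide by n – 1), cross-checked against Python's standard statistics module. The data set is made up for illustration.

```python
import math
import statistics

def population_variance(values):
    """Average squared deviation from the mean, dividing by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def sample_variance(values):
    """Same sum of squares, but divided by n - 1 (Bessel's correction)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [2, 4, 6, 8, 10]                       # hypothetical observations
pop_var = population_variance(data)
samp_var = sample_variance(data)
pop_std = math.sqrt(pop_var)                  # standard deviation = sqrt(variance)
samp_std = math.sqrt(samp_var)

print(pop_var, samp_var)
print(statistics.pvariance(data), statistics.variance(data))
```

The sample figure is always a little larger than the population figure for the same data, and the gap shrinks as n grows.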

Semivariance and Target Semivariance

Semivariance looks only at the downside risk. It is calculated like the variance, except that only the values below the mean are used. If the distribution of the data is symmetric with no skew, the semivariance (averaged over the below-mean values) and the variance will be about the same. However, if the skew is negative, the semivariance will be higher than the variance. This highlights the downside risk and ignores the positive deviations. Target semivariance works the same way but measures deviations below a chosen target value rather than below the mean.
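A rough sketch of semivariance and target semivariance. It assumes the squared shortfalls are averaged over the number of below-threshold observations, which is the convention under which a symmetric data set gives the same figure as the variance; other texts divide by n or n – 1 instead. The function name and data are made up.

```python
def semivariance(values, target=None):
    """Average squared shortfall below the mean (or below `target` if given).

    Assumption: divides by the count of observations below the threshold.
    """
    mean = sum(values) / len(values)
    threshold = mean if target is None else target
    shortfalls = [(x - threshold) ** 2 for x in values if x < threshold]
    if not shortfalls:
        return 0.0                      # nothing fell below the threshold
    return sum(shortfalls) / len(shortfalls)

def population_variance(values):
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

symmetric = [-3, -1, 1, 3]              # hypothetical returns, no skew
left_tail = [-10, 1, 2, 3, 4]           # hypothetical returns, long left tail

print(semivariance(symmetric), population_variance(symmetric))
print(semivariance(left_tail), population_variance(left_tail))
print(semivariance(symmetric, target=2))
```

For the symmetric list both measures agree, while for the list with one large loss the semivariance is far higher than the variance.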

Chebyshev’s Inequality

This is a great little calculation for seeing the minimum percentage of observations that lie within a given number of standard deviations (k) of the mean, whatever the shape of the distribution.

The equation is

I = 1 – 1 / k², for all k > 1


# of Standard Deviations from Mean (k) | Chebyshev’s Inequality | % of Observations
2 | 1 – 1/(2)², or 1 – 1/4, or 3/4 | 75 (.75)
3 | 1 – 1/(3)², or 1 – 1/9, or 8/9 | 89 (.8889)
4 | 1 – 1/(4)², or 1 – 1/16, or 15/16 | 94 (.9375)

Source: Investopedia
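The table values can be reproduced with a few lines. The function name is an assumption for illustration.

```python
def chebyshev_bound(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is only informative for k > 1")
    return 1.0 - 1.0 / k ** 2

for k in (2, 3, 4):
    print(k, round(chebyshev_bound(k), 4))
```

Note that these are lower bounds that hold for any distribution; a particular distribution can have a much higher share of observations within k standard deviations.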

Coefficient of Variation

The standard deviation is just a number relative only to the specific data set. In comparing deviations in different data sets or to gauge a high or low dispersion relative to some other frame of reference, the standard deviation is not very useful.

The coefficient of variation (CV) helps because the CV measures the amount of risk (standard deviation) per unit of mean return.

The CV has no unit, and can be used to compare data in different units.

CV = s / X where s is the standard deviation and X is the sample mean.

Sharpe Ratio (inverse of CV with the addition of the risk free rate)

The CV is used as a measure of relative dispersion. Its inverse, X / s, can be used as a measure of return per unit of risk, i.e. the risk-reward ratio.

Adding in the concept of a risk free percentage return, the Sharpe Ratio is given as:

Sh = (Rp – Rf) / sp

Where Rp is the mean return of the portfolio, Rf is the risk free return (for example, a treasury bond), and sp is the standard deviation of return.

The greater a portfolio’s Sharpe ratio, the better its risk-adjusted performance has been. A negative Sharpe ratio indicates that a risk-less asset would perform better than the security being analysed.
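A small sketch tying the CV, its inverse and the Sharpe ratio together. The returns and the 2% risk free rate are invented figures, and the sample standard deviation from Python's statistics module stands in for s and sp.

```python
import statistics

def coefficient_of_variation(returns):
    """CV = s / X: risk (sample standard deviation) per unit of mean return."""
    return statistics.stdev(returns) / statistics.mean(returns)

def sharpe_ratio(returns, risk_free):
    """Sh = (Rp - Rf) / sp: excess return per unit of risk."""
    return (statistics.mean(returns) - risk_free) / statistics.stdev(returns)

portfolio = [0.04, 0.08, 0.12]          # hypothetical annual returns
risk_free = 0.02                        # hypothetical risk free rate

cv = coefficient_of_variation(portfolio)
sharpe = sharpe_ratio(portfolio, risk_free)

print(f"CV: {cv:.2f}, inverse: {1 / cv:.2f}, Sharpe: {sharpe:.2f}")
```

With a risk free rate of zero the Sharpe ratio collapses to the inverse of the CV, which is the relationship the heading describes.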

Symmetry, Skew and Kurtosis

Since I have been learning this stuff, I have seen Skew and Kurtosis crop up everywhere. Excel even has functions for this in the standard drop downs. I even have a button for it on my calculator.

In a nutshell, Skew and Kurtosis are ways of mathematically describing uneven risk about the mean values. They are the last two of the four mathematical moments.

Skewness

Skew details whether the values are unevenly distributed to the negative or positive side of the mean. If the skew of a set of data is less than zero, it is negatively skewed (long left tail), which indicates a larger chance of extremely negative values alongside frequent small gains. If the skew is greater than zero, it is positively skewed (long right tail) and there is a greater chance of extremely positive values alongside frequent small losses.

Skew is mathematically defined as the average cubed deviation from the mean divided by the standard deviation cubed.

A good exam tip from Investopedia:

Positive: Mean > Median > Mode
Negative: Mean < Median < Mode

Notice that by alphabetical listing, it’s mean, median, mode. For positive skew, they are separated with a greater than sign; for negative skew, a less than sign.

Kurtosis

Kurtosis details how closely the values are clustered around the mean (peaked) compared with how spread out they are. If the values are clustered closely around the mean, the graph will look more peaked, and this is called leptokurtic. If they are more spread out, it is called platykurtic.

For a normal distribution, the kurtosis value is equal to 3. So excess kurtosis is used, which is kurtosis – 3.

A leptokurtic distribution is more peaked with fatter tails and has a positive excess kurtosis (a kurtosis value greater than 3), which means there are more frequent occurrences of extreme values.

For large data sets, its mathematical definition is the average deviation from the mean to the fourth power divided by the standard deviation to the fourth power.
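A minimal sketch of the large-data-set definitions of skew and kurtosis given above, using plain averages (dividing by n). The function names and data are made up, and spreadsheet functions typically apply small-sample adjustments, so their results will differ somewhat on short lists like these.

```python
import statistics

def central_moment(values, power):
    """Average of (x - mean) raised to `power`."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** power for x in values) / len(values)

def skewness(values):
    """Average cubed deviation divided by the standard deviation cubed."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Average fourth-power deviation divided by the standard deviation to the fourth."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

right_tail = [1, 2, 3, 4, 10]           # hypothetical data with one large value
symmetric = [1, 2, 3, 4, 5]             # hypothetical evenly spread data

skew = skewness(right_tail)
kurt = kurtosis(right_tail)
excess = kurt - 3.0                     # excess kurtosis relative to a normal

print(f"skew: {skew:.4f}, kurtosis: {kurt:.4f}, excess: {excess:.4f}")
print(statistics.mean(right_tail), statistics.median(right_tail))
```

The list with one large value has positive skew and its mean sits above its median, in line with the exam tip for positive skew.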

