Standard Deviation Calculation for C# (and Other Languages) in Computer Programming

How to Find the Standard Deviation and Other Complex Math Functions

RH
Standard deviation calculations come up quite often in my programming. The calculation is fairly straightforward on paper, but it takes a little work to get a computer to perform it in raw code. There is no square root symbol on a basic keyboard, nor most of the other scientific calculation symbols, so you will need to break the formula down into simpler steps. The Math class has a number of functions, but not nearly as many as you may need. This means that you will have to get your hands a little dirty and reduce most of the work to simple addition, subtraction, multiplication, and division. Here is a look at what you will need to do, using the standard deviation formula as the example.

If you are anything like me, it was a miracle to pass Algebra. This makes it a little difficult to analyze equations. Since you are dealing exclusively with variables when writing a program, this can get even more complex. Before you can get started, you will have to familiarize yourself with what you need to do. By knowing how to break an algebraic or calculus function down to the basics of math, you can find the variables that you will need to program and see how to get to the end result. In the end, most equations break down to the four basic math functions. Getting to those four is the hard part. For standard deviation, you take the difference between the average and each occurrence, square it, total those squares, and then divide the total by the number of occurrences. To finish it off, you take the square root of that value. (Dividing by the full count like this gives the population standard deviation.) This can be broken down easily with the help of the absolute value and square root functions of the Math class. Here is the basic code that I used for a standard deviation equation, followed by an explanation.

//Requires "using System;" for Math and Convert
//decValues stands in for wherever you have the data stored (an array or list of decimal)

//Declare Variables
decimal decDeviation = 0;
decimal decAverage = 0; //set this to the final average before running the loop
int intDeviationCount = 0;
decimal decTotalDeviation = 0;
double dblAverageDeviation = 0;
double dblStandardDeviation = 0;

//Calculate deviation for each occurrence
foreach (decimal decCurrentValue in decValues)
{
    decDeviation = Math.Abs(decAverage - decCurrentValue);
    intDeviationCount = intDeviationCount + 1; //this can also be intDeviationCount++
    decTotalDeviation = (decDeviation * decDeviation) + decTotalDeviation;
}

//Calculate Standard Deviation
if (intDeviationCount > 0)
{
    dblAverageDeviation = Convert.ToDouble(decTotalDeviation / intDeviationCount);
    dblStandardDeviation = Math.Sqrt(dblAverageDeviation);
}

This may sound and look a little intense, but it is rather simple. As you can see, I am a huge proponent of making code "Mickey Mouse" simple. Not only does this help the developers who follow behind you to make better programs, but it also helps to keep you from getting confused. Naming variables in an understandable scheme and using regular comments to separate your code can save you hours of pain in the debug and beta phases. It especially helps with those all-night coding sessions that we all love. Simple code is less error prone whether you are fresh off a nap or starting that caffeine IV for the third time.

We will need to look at the original equation before breaking it down for an IDE.

SD = Square Root of [ (Total of all ((Absolute Value of (average - current)) squared)) / occurrences ]
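
To make the formula concrete, here is a small worked example. The eight values (2, 4, 4, 4, 5, 5, 7, 9) are made up purely for illustration:

```latex
\begin{align*}
\text{average} &= \frac{2+4+4+4+5+5+7+9}{8} = \frac{40}{8} = 5 \\
\text{total of squared deviations} &= 9+1+1+1+0+0+4+16 = 32 \\
\text{SD} &= \sqrt{\frac{32}{8}} = \sqrt{4} = 2
\end{align*}
```

Each squared deviation is the distance of one value from the average of 5, multiplied by itself; for example, the value 9 contributes (9 - 5) squared, which is 16.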

Of course you cannot write it like this, and there are no special keys that can reduce your basic math to this simple equation (I may say simple, but I still don't understand it totally). So you will need to break it into multiplication and division. The first step is to get your deviations. You will need to square each one and total them together to figure your average squared deviation. This is the reason for the For Each statement. Once you have calculated your average, you can go back through the data and get each deviation. Since the average changes as each occurrence gets totaled into it, you cannot calculate the deviations in the same pass that builds the average. The result will not be accurate. You have to run this For Each after you have your total and final average value.
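
As a minimal sketch of that two-pass idea, the example below first walks the data to get the final average and only then walks it again to total the squared deviations. The class name TwoPassDemo, the method names, and the sample numbers are all hypothetical and chosen just for illustration:

```csharp
using System;

public static class TwoPassDemo
{
    // First pass: total every value and divide by the count to get the final average.
    // The caller is assumed to pass at least one value.
    public static decimal Average(decimal[] values)
    {
        decimal total = 0;
        foreach (decimal value in values)
        {
            total = total + value;
        }
        return total / values.Length;
    }

    // Second pass: total the squared deviations from the already-final average.
    public static decimal TotalSquaredDeviation(decimal[] values, decimal average)
    {
        decimal total = 0;
        foreach (decimal value in values)
        {
            decimal deviation = Math.Abs(average - value);
            total = total + (deviation * deviation);
        }
        return total;
    }

    public static void Main()
    {
        decimal[] sample = { 2, 4, 4, 4, 5, 5, 7, 9 }; // made-up data
        decimal average = Average(sample);
        Console.WriteLine(average);
        Console.WriteLine(TotalSquaredDeviation(sample, average));
    }
}
```

For this sample the average works out to 5 and the total of the squared deviations to 32.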

The first thing that you will do is call the Absolute Value function of the Math class. This is rather simple. It gives you the absolute value of a number, which is its distance from zero with no negative sign. Applied to (average - current), that is the distance of the occurrence from the average. This is the Math.Abs() statement. You can use it to find the absolute value of any number in your calculations. You can pass in a variable, a literal integer or decimal, or a simple math expression, and it will return the absolute value in the same type that you passed in (a decimal in this example). Strictly speaking, squaring already removes any negative sign, so the absolute value step is optional here, but it keeps each deviation readable as a distance. Now you can take that value and add it to your deviation total. You will need to square it before adding it to the total though. There is no dedicated squaring function in the Math class; Math.Pow() will do it, but it works in doubles, so it is really simpler to just multiply the number by itself. You also need to advance the occurrence counter. You can use the += 1 or ++ operators, but I have always found it simpler to write it out. This allows me to see the code more clearly, with the full equation right there in front of me. You can use whichever method you are more comfortable with; just don't forget to add this line. This one line can really throw off your calculations if it is done wrong or forgotten.
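
The short sketch below illustrates those choices side by side. It relies on Math.Pow, the general power function in the Math class, which takes and returns doubles. The class and method names are hypothetical, and the numbers are made up:

```csharp
using System;

public static class AbsAndSquareDemo
{
    // Squares a deviation by multiplying it by itself; everything stays a decimal.
    public static decimal SquareByMultiplying(decimal average, decimal current)
    {
        decimal deviation = Math.Abs(average - current); // decimal in, decimal out
        return deviation * deviation;
    }

    // Squares the same deviation with Math.Pow, which works in double,
    // so the decimal has to be cast explicitly first.
    public static double SquareWithPow(decimal average, decimal current)
    {
        double deviation = (double)Math.Abs(average - current);
        return Math.Pow(deviation, 2);
    }

    public static void Main()
    {
        Console.WriteLine(SquareByMultiplying(5m, 9m));
        Console.WriteLine(SquareWithPow(5m, 9m));

        // Three equivalent ways to advance a counter by one.
        int count = 0;
        count = count + 1; // written out
        count += 1;        // compound assignment
        count++;           // increment operator
        Console.WriteLine(count);
    }
}
```

Both squaring methods give 16 for an average of 5 and a value of 9, and the counter ends at 3 after the three increments.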

Now that you have the things that you need to calculate the SD, you can run the actual equation. I always put this after the For Each statement, simply because it is pointless to make your machine run the equation for each occurrence. It saves processing time to run it once at the end. So after you close off the For Each, you can run the actual equation. The first thing you need to do is set up an If statement. This will remove the "division by zero" error from the picture. It is always possible for the count to be zero (an empty data set, for example), so you need to plan accordingly. Now you can find the average deviation. You will need to convert this to a double, since the Square Root function takes a double and returns a double. You could either convert both the Total Deviation and Deviation Count variables to doubles or do it in one step as I have done in this example. Once you have the average deviation, you can run the Square Root function to determine the standard deviation. Math.Sqrt() will take an expression, a variable, or a literal number as its argument, as long as the result is a double. An integer is converted to a double automatically, but a decimal has to be converted explicitly. It will then find the square root and return that value.
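
Here is a minimal sketch of that final step on its own, assuming the total of the squared deviations and the occurrence count have already been worked out. The class name FinishDemo, the method name Finish, and the sample inputs are hypothetical:

```csharp
using System;

public static class FinishDemo
{
    // Returns the standard deviation, or 0 when there were no occurrences.
    public static double Finish(decimal totalSquaredDeviation, int count)
    {
        double standardDeviation = 0;

        // Guard against dividing by zero when the data set is empty.
        if (count > 0)
        {
            // Divide as decimal, then convert the result to double in one step,
            // because Math.Sqrt only accepts a double.
            double averageDeviation = Convert.ToDouble(totalSquaredDeviation / count);
            standardDeviation = Math.Sqrt(averageDeviation);
        }
        return standardDeviation;
    }

    public static void Main()
    {
        Console.WriteLine(Finish(32m, 8)); // made-up inputs
        Console.WriteLine(Finish(0m, 0));  // empty data set, the guard is used
    }
}
```

With a total of 32 over 8 occurrences the result is 2. With a count of zero the guard skips the division and the result stays at 0.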

That is all there is to it. While the Math class is limited in its functions, it does offer some rather helpful tools for data analysis or even game design that relies on variations and deviations. No matter what your equations are, a little thought and planning with some help from the Math class can show you the answer. Just break each step down to the basic four math functions. Using the Math class can save you from having to find your own way to calculate the square root. If you would rather write out the absolute value function yourself, you can run an if statement to see if the value is negative. If so, you can multiply it by -1 to get a positive value. There are ways to do just about everything if you know how to think simply.
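
A hand-written absolute value along those lines might look like the sketch below. The class name ManualAbsDemo and the method name ManualAbs are hypothetical, and the sample values are made up:

```csharp
using System;

public static class ManualAbsDemo
{
    // Hand-written absolute value: flip the sign only when the value is negative.
    public static decimal ManualAbs(decimal value)
    {
        if (value < 0)
        {
            value = value * -1;
        }
        return value;
    }

    public static void Main()
    {
        Console.WriteLine(ManualAbs(-3.5m));
        Console.WriteLine(ManualAbs(2m));
    }
}
```

For these inputs it returns the same results as Math.Abs: 3.5 for -3.5, and 2 for 2.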

Published by RH
