Technology and data go hand in hand: both rest on deep science and both deliver tremendous benefits. In technology, you cannot move forward without considering data. It might sound a bit odd, but data and technology complement each other. Large databases serve users data, and the outcomes derived from it, whenever it is demanded. Isn't it fascinating?
Take the example of a social media application with millions of registered users. When someone enters their login details, the app quickly recognizes who that person is and logs them in. Behind the scenes, this is an exchange between servers that store and look up data.
Based on our browsing data, these social media applications show us advertisements that might interest us and suggest that we connect with people who share our skills or interests.
This is a prime example of data being used and converted into a meaningful outcome. By studying data, anyone can extract a great deal of information from it, which shows again and again how important data and the science behind its handling are.
Data Science & Technology
In the world of technology, data is combined with statistics, programs (algorithms), code, and machines, and is then put through various operations. There is always a science at work behind this.
This science, which teaches us to operate on data, study it, and draw meaningful information from it, is called Data Science. Data science is used almost everywhere nowadays, with all kinds of technologies and devices, and many apps and features rely on it.
For instance, suppose the finance industry or the share market is going through a decline. As this happens, news about it pops up in newspapers and on TV, and different people predict different things.
All of us have seen predictions about when the share market will rise and when it will crash, which stock to invest in, and which share to sell. All of these predictions come from the study of data.
People who study the daily data of the share market and of a particular stock try to predict the market's next move. This is the process of converting raw data, through various operations, into meaningful information, and it is exactly what data science does.
Within data science, there is one particular term that we need to know about: statistics.
What is Statistics?
Statistics, as we know, is a branch of mathematics that deals with numeric data. It is the study of collected numeric data in order to draw meaningful information from it. It collects, analyses, interprets, and presents data in the way best suited to extracting that information.
Any kind of numerical data is studied in this branch of mathematics. From the sex ratio of a state to the literacy rate and population growth of a country, all of it comes under statistics.
As statistics is a branch of mathematics, it includes many mathematical processes. The main ones are the mean, the median, and the mode. Data is used to calculate these values and is thereby converted into information that can be read and talked about.
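As a quick illustration of these three measures, here is a minimal Python sketch. The list of marks is a hypothetical data set chosen only for illustration, and the functions come from Python's standard statistics module.

```python
import statistics

# Hypothetical marks, used only to illustrate the three measures.
marks = [30, 25, 27, 25]

mean_mark = statistics.mean(marks)      # sum of the values divided by their count
median_mark = statistics.median(marks)  # middle value of the sorted data
mode_mark = statistics.mode(marks)      # value that occurs most often

print("mean:", mean_mark)
print("median:", median_mark)
print("mode:", mode_mark)
```

With an even number of values, the median is the average of the two middle values of the sorted list.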
Apart from these three statistical processes, there is one more that is talked about less but is just as important: standard deviation.
First of all, we need to know what standard deviation is.
What is Standard Deviation?
In statistics, the phrase "standard deviation" refers to the amount of variability or dispersion around an average. It is a technical term for a measure of inconsistency. The difference between an actual value and the average value is called a deviation. The variance is the average of the squared deviations, and the standard deviation is the square root of the variance. The standard deviation therefore increases as the dispersion, and with it the variance, increases.
For example, suppose a share is advertised as giving a 25 percent return, and you gladly invest in it. In the end, you get just 20 percent, and when you inquire about it you learn that the actual data had been hidden.
The actual returns range from -10 percent to 60 percent. But this is what they won't tell you.
So, here 25 percent is the average return, 20 percent is the actual return, and 5 percent is the deviation between them. The actual return differs from the promised, or average, one. Standard deviation summarizes how large such deviations typically are across all the returns.
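To make this concrete, here is a small Python sketch. Both lists of yearly returns are hypothetical, chosen only so that each averages 25 percent while one of them spans the -10 to 60 percent range mentioned above. It uses statistics.pstdev from Python's standard library, which computes the population standard deviation.

```python
import statistics

# Hypothetical yearly returns in percent; both lists average 25.
steady_returns = [24, 25, 26, 25]
volatile_returns = [-10, 60, 20, 30]

steady_mean = statistics.mean(steady_returns)
volatile_mean = statistics.mean(volatile_returns)

# Population standard deviation: the typical distance of a return from the mean.
steady_sd = statistics.pstdev(steady_returns)
volatile_sd = statistics.pstdev(volatile_returns)

print("steady:   mean", steady_mean, "sd", round(steady_sd, 2))
print("volatile: mean", volatile_mean, "sd", round(volatile_sd, 2))
```

The averages are identical, but the second share has a far larger standard deviation, which is exactly the information a quoted average on its own hides.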
Example of Standard Deviation
Here is one more example to illustrate standard deviation.
Suppose a student has scored the following marks in four subjects.
Maths: 30, English: 25, Science: 27, History: 25
Now, we calculate the student's average score by adding up the marks scored in all the subjects and dividing the sum by the total number of subjects.
The average score is (30 + 25 + 27 + 25)/4 = 107/4 = 26.75
Now we subtract the average score from the score in each subject-
Maths- 30-26.75= 3.25
English- 25-26.75= -1.75
Science- 27-26.75= 0.25
History- 25-26.75= -1.75
Next, we square each of these differences and calculate the average of the squares: (10.5625 + 3.0625 + 0.0625 + 3.0625)/4 = 16.75/4 = 4.1875
This average is called the VARIANCE, a measure of dispersion.
The square root of the variance is the standard deviation. Here it is √4.1875 ≈ 2.05.
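The same steps can be followed in a few lines of Python. This is a minimal sketch using the marks from the subtraction step above (Maths 30, English 25, Science 27, History 25); the variable names are illustrative only.

```python
import math

# Marks taken from the worked example.
marks = {"Maths": 30, "English": 25, "Science": 27, "History": 25}

# Step 1: average score.
mean = sum(marks.values()) / len(marks)

# Step 2: difference between each score and the average.
deviations = {subject: score - mean for subject, score in marks.items()}

# Step 3: square the differences and average them to get the variance.
squared = [d ** 2 for d in deviations.values()]
variance = sum(squared) / len(squared)

# Step 4: the standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print("mean:", mean)
print("deviations:", deviations)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
```

Dividing by the number of subjects in step 3 treats the four marks as the whole population rather than as a sample from a larger one.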
Standard Deviation Formula
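The formula of standard deviation is-
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2}$$
Here σ is the standard deviation (σ² is the variance), μ is the average value and Xi is each raw value.
N here represents the number of terms. This is called the population standard deviation.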
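The formula changes a bit when we take a sample from a bigger population. For example, if a sample of 100 students is taken from all the students in the world, we use the sample standard deviation.
In that case, the formula will be-
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2}$$
Here s is the sample standard deviation, X̄ is the average of the sample and n is the number of values in the sample. The sum of squared differences is divided by n - 1 instead of n.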
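The difference between the two versions can be seen with Python's standard statistics module, where pstdev divides by the number of values and stdev divides by one less than that. The sketch below reuses the four marks from the earlier example and treats them as a sample purely for illustration.

```python
import statistics

# The four marks from the earlier example.
marks = [30, 25, 27, 25]

# Population standard deviation: divides the squared differences by N.
population_sd = statistics.pstdev(marks)

# Sample standard deviation: divides the squared differences by n - 1.
sample_sd = statistics.stdev(marks)

print("population standard deviation:", round(population_sd, 4))
print("sample standard deviation:", round(sample_sd, 4))
```

The sample value is always the larger of the two for the same data, because the same sum of squared differences is divided by a smaller number.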
Advantages & Disadvantages of Standard Deviation
Standard deviation helps in the study of data and makes things easier. Let us look at some of its advantages-
It shows how closely the data is clustered around the mean value.
It provides a more precise picture of how the data is distributed.
Extreme values have less of an impact on it than on the range.
Like everything else, standard deviation has its cons as well as its pros. Some of its disadvantages are:
It does not tell you the full range of the data.
It is only used with data where an independent variable is plotted against the frequency of that variable.
Its interpretation usually assumes that the data follows a normal distribution pattern.
Standard deviation is a very popular method, used mostly in the finance sector to calculate risk. It takes data, with all its uncertainty and inconsistency, and measures how far the values deviate from the mean.
In the end, as we all know, everything contains a bit of uncertainty, and so does data. Standard deviation lets us measure that uncertainty, which is what makes it possible to account for it and arrive at better predictions.