Linear correlation is widely used in statistics to measure the strength of the relationship between two variables. It is quantified by a correlation coefficient, a number that expresses how strongly the two variables are associated. The linear association between two variables is measured with the Pearson product-moment correlation coefficient, and throughout this paper "correlation coefficient" means the Pearson product-moment correlation coefficient. A correlation coefficient calculated from a sample of data is denoted S (r in most textbooks), and the correlation coefficient calculated for a population is denoted ε or s (ρ in most textbooks).
To interpret a linear correlation coefficient, both its sign and its absolute value are used: the sign shows the direction of the relationship and the absolute value shows its magnitude. Here are some important points about the linear correlation coefficient:
1. The value of the correlation coefficient always lies between -1 and 1, inclusive.
2. The greater the absolute value of the coefficient, the stronger the linear correlation.
3. The strongest correlation, a perfect linear relationship, is shown by a coefficient of 1 or -1.
4. The weakest correlation is shown by a coefficient of 0, which means there is no linear relationship (a non-linear relationship may still exist).
5. Positive correlation means that as the value of one variable increases, the value of the other tends to increase as well.
6. Negative correlation means that as the value of one variable increases, the value of the other tends to decrease.
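The points above can be checked numerically. The following is a minimal sketch in plain Python; the helper name pearson and the three small data sets are illustrative assumptions, not part of the original text.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations from the mean of x
    dy = [y - mean_y for y in ys]          # deviations from the mean of y
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

x = [1, 2, 3, 4, 5]
rising = [2 * v + 1 for v in x]        # exact line with positive slope
falling = [10 - 3 * v for v in x]      # exact line with negative slope
u_shaped = [(v - 3) ** 2 for v in x]   # clear pattern, but not a linear one

r_rising = pearson(x, rising)      # perfect positive correlation
r_falling = pearson(x, falling)    # perfect negative correlation
r_u_shaped = pearson(x, u_shaped)  # zero linear correlation

print(r_rising, r_falling, r_u_shaped)
```

The U-shaped data set illustrates why a coefficient of 0 should be read as "no linear relationship" rather than "no relationship at all".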
Scatter plots are used to show linear correlation and the different patterns that correspond to different degrees of correlation between two variables. When the line fitted through the points has a negative slope the correlation is negative, and when the slope is positive the correlation is positive. When all the data points lie exactly on a straight line, the correlation is at its strongest. The linear correlation becomes weaker as the data points scatter further from a straight line. If the data points are scattered randomly, without any pattern, there is no correlation between the variables. Outliers are data points that lie far from the pattern followed by the rest of the data. Outliers affect the correlation: an outlier that falls away from the line reduces the strength of the correlation, while an outlier that happens to extend a trend can exaggerate it.
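The effect of a single outlier can be illustrated with a small sketch. The helper pearson and all data values below are made up for illustration; they are assumptions, not data from the original text.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Case 1: points on an exact line, then the last point moved far off the line.
x = [1, 2, 3, 4, 5, 6]
on_line = [2, 4, 6, 8, 10, 12]
with_outlier = [2, 4, 6, 8, 10, 0]
r_on_line = pearson(x, on_line)            # strongest possible correlation
r_with_outlier = pearson(x, with_outlier)  # much weaker because of one point

# Case 2: four points with no linear trend, then one extreme point added.
square_x, square_y = [1, 1, 2, 2], [1, 2, 1, 2]
r_square = pearson(square_x, square_y)               # no linear correlation
r_inflated = pearson(square_x + [10], square_y + [10])  # one point creates a strong trend

print(r_on_line, r_with_outlier, r_square, r_inflated)
```

In the first case one outlier weakens an otherwise perfect correlation; in the second, one extreme point makes unrelated data look strongly correlated. Inspecting the scatter plot before trusting the coefficient guards against both.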
Formula to calculate the linear correlation coefficient of a sample:
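$$S = \frac{\sum_{i} (p_i - \bar{p})(q_i - \bar{q})}{\sqrt{\sum_{i} (p_i - \bar{p})^2 \, \sum_{i} (q_i - \bar{q})^2}}$$
Here $S$ is the sample correlation coefficient, $\sum$ denotes summation over all observations in the sample, $p_i$ and $q_i$ are the values of the i-th observation, and $\bar{p}$ and $\bar{q}$ are the mean values of all the $p$ and $q$ data points in the sample.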
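As a worked illustration of the sample formula, take the made-up observations $p = (1, 2, 3, 4)$ and $q = (2, 3, 5, 6)$ (assumed values, chosen only to keep the arithmetic short), writing $\bar{p}$ and $\bar{q}$ for the sample means:

$$
\begin{aligned}
\bar{p} &= 2.5, \qquad \bar{q} = 4 \\
\sum_i (p_i - \bar{p})(q_i - \bar{q}) &= (-1.5)(-2) + (-0.5)(-1) + (0.5)(1) + (1.5)(2) = 7 \\
\sum_i (p_i - \bar{p})^2 &= 2.25 + 0.25 + 0.25 + 2.25 = 5 \\
\sum_i (q_i - \bar{q})^2 &= 4 + 1 + 1 + 4 = 10 \\
S &= \frac{7}{\sqrt{5 \cdot 10}} = \frac{7}{\sqrt{50}} \approx 0.990
\end{aligned}
$$

The value is close to 1, indicating a strong positive linear correlation for these four points.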
Formula to calculate the linear correlation coefficient of a population:
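$$\varepsilon = \frac{1}{N} \sum_{i=1}^{N} \left(\frac{p_i - \mu_p}{\sigma_p}\right) \left(\frac{q_i - \mu_q}{\sigma_q}\right)$$
Here $\varepsilon$ (also written $s$) is the population correlation coefficient, $N$ is the total number of observations, and $\sigma_p$ and $\sigma_q$ are the population standard deviations of $p$ and $q$ respectively. $\sum$ denotes summation, $p_i$ and $q_i$ are the values of the i-th observation, and $\mu_p$ and $\mu_q$ are the population means.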
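The population formula can be sketched in Python using the standard library's statistics module. The function name population_correlation and the four data points are illustrative assumptions; the data are treated as if they were the entire population.

```python
from statistics import mean, pstdev

def population_correlation(ps, qs):
    """Average product of standardized values (z-scores) over a whole population."""
    n = len(ps)
    mu_p, mu_q = mean(ps), mean(qs)            # population means
    sigma_p, sigma_q = pstdev(ps), pstdev(qs)  # population standard deviations (divide by N)
    total = sum(((p - mu_p) / sigma_p) * ((q - mu_q) / sigma_q)
                for p, q in zip(ps, qs))
    return total / n

p = [1, 2, 3, 4]
q = [2, 3, 5, 6]
coefficient = population_correlation(p, q)
print(round(coefficient, 3))  # 0.99
```

Each factor in the sum is a z-score, so the coefficient is the mean product of the standardized values of the two variables. Note that pstdev divides by N; using the sample standard deviation (which divides by N - 1) in this formula would give a different result.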