Monday, 3 September 2018

Statistical Functions - Central Tendency and Variation in R language

Descriptive statistics :-

First-hand tools that give first-hand information about the data.
  • Central tendency of data (Mean, median, mode, geometric mean, harmonic mean etc.)
  • Variation in data (variance, standard deviation, standard error, mean deviation etc.)
Central tendency of the data

Gives an idea about the central value of the data.
Around what value is the data clustered?

Data:  x1, x2, ......, xn
x : Data vector

Arithmetic mean :-

     (x1 + x2 + ...... + xn) / n
mean (x)

Geometric mean :-

     (x1 * x2 * ...... * xn) ^ (1/n)
prod (x) ^ (1/length (x) )
(length (x)  is equal to the number of elements in x)
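
As a minimal sketch of the two means, the snippet below uses a made-up vector x (the values are an assumption chosen so the results are easy to check by hand). The log-based form of the geometric mean is an extra shown only as an alternative for positive data; it is not part of the formulas above.

```r
# Hypothetical data vector, chosen only for illustration
x <- c(2, 4, 8)

# Arithmetic mean: (2 + 4 + 8) / 3
arithmetic_mean <- mean(x)

# Geometric mean: (2 * 4 * 8)^(1/3), i.e. 4 up to floating-point rounding
geometric_mean <- prod(x)^(1 / length(x))

# Equivalent form for strictly positive data; avoids a very large product
geometric_mean_log <- exp(mean(log(x)))

print(arithmetic_mean)
print(geometric_mean)
print(geometric_mean_log)
```

For long vectors prod (x) can overflow, which is the usual reason to prefer the log-based form.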


Median :-

     Value such that the number of observations above it is equal to the number of observations below it.
median (x)

Example :-
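
A minimal sketch of median (x), using two made-up vectors (the values are assumptions for illustration). With an odd number of observations the median is the middle value after sorting; with an even number, R averages the two middle values.

```r
# Hypothetical data vectors, chosen only for illustration
x_odd  <- c(7, 1, 5)      # sorted: 1, 5, 7
x_even <- c(7, 1, 5, 3)   # sorted: 1, 3, 5, 7

median_odd  <- median(x_odd)    # middle value: 5
median_even <- median(x_even)   # average of 3 and 5: 4

print(median_odd)
print(median_even)
```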



Variability

Spread or scatter of the data around some point, preferably the mean value.

Data set 1:  360, 370, 380
    mean = (360 + 370 + 380) /3  = 370
Data set 2:  10, 100, 1000
    mean = (10 + 100 + 1000) /3  = 370

Both data sets have the same mean, so how do we differentiate between them? By measuring how far the values spread around that mean.

  x : data vector
Variance :-
      var (x)
Standard deviation :- positive square root of the variance
        sqrt (var (x) )
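
A minimal sketch applying var (x) and sqrt (var (x)) to the two data sets listed earlier. The variable names set1 and set2 are assumptions introduced here. The built-in sd (x) is included only to show that it agrees with sqrt (var (x)).

```r
# The two data sets from the text; the names set1 and set2 are illustrative
set1 <- c(360, 370, 380)
set2 <- c(10, 100, 1000)

# Both means are 370, so the mean alone cannot tell the sets apart
print(mean(set1))
print(mean(set2))

# Variance (R uses the divisor n - 1)
var1 <- var(set1)   # (100 + 0 + 100) / 2 = 100
var2 <- var(set2)   # (129600 + 72900 + 396900) / 2 = 299700

# Standard deviation: positive square root of the variance
sd1 <- sqrt(var1)   # 10
sd2 <- sqrt(var2)   # about 547

print(c(var1, var2))
print(c(sd1, sd2))
print(sd(set1))     # same as sqrt(var(set1))
```

The second data set has a far larger variance and standard deviation, which is what separates it from the first despite the identical mean.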

Variance, another variant :-

var (x) uses the divisor n - 1. If we want the divisor to be n, then use
   ( (n-1) /n) * var (x)
where  n = length (x)
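
A minimal sketch of the divisor-n variance, checked against a direct computation of the mean squared deviation. The vector x and the names var_n and var_direct are assumptions for illustration.

```r
# Hypothetical data vector, chosen only for illustration
x <- c(360, 370, 380)
n <- length(x)

# Rescale R's variance (divisor n - 1) to use the divisor n
var_n <- ((n - 1) / n) * var(x)

# Direct computation: average squared deviation from the mean
var_direct <- mean((x - mean(x))^2)

print(var_n)
print(var_direct)
```

Both values should agree up to floating-point rounding; here they equal 200/3.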

Range:
    maximum(x1, x2, ....., xn) - minimum(x1, x2, ...., xn)
      max (x)  -  min (x)
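
A minimal sketch of the range for the two data sets used earlier (the names set1 and set2 are assumptions). Note that R's own range (x) function returns the pair (minimum, maximum) rather than their difference, so diff (range (x)) is shown as an equivalent way to get the single number.

```r
# The two data sets from the text; the names are illustrative
set1 <- c(360, 370, 380)
set2 <- c(10, 100, 1000)

range1 <- max(set1) - min(set1)   # 380 - 360 = 20
range2 <- max(set2) - min(set2)   # 1000 - 10 = 990

# range() returns c(min, max); diff() turns that into max - min
range2_alt <- diff(range(set2))

print(range1)
print(range2)
print(range2_alt)
```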

Interquartile range:
  Third quartile (x1, x2, ..., xn) - First quartile (x1, x2, ...., xn)
     IQR (x)

Quartile deviation:
  [Third quartile (x1, x2, ..., xn) - First quartile (x1, x2, ..., xn)]/2
   =  Interquartile range/2
    IQR (x) /2
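
A minimal sketch of the interquartile range and quartile deviation on a made-up vector (the values 1 to 9 are an assumption for illustration). The quartiles come from R's default quantile method, so other software may report slightly different quartiles for the same data.

```r
# Hypothetical data vector, chosen only for illustration
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# First and third quartiles with R's default quantile method
q1 <- unname(quantile(x, 0.25))   # 3
q3 <- unname(quantile(x, 0.75))   # 7

# Interquartile range and quartile deviation
iqr_x <- IQR(x)                   # q3 - q1 = 4
quartile_deviation <- IQR(x) / 2  # 2

print(c(q1, q3))
print(iqr_x)
print(quartile_deviation)
```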


Example :-


