When I call describe() on a pandas DataFrame, I get basic statistics that include min, max, 25%, 50%, etc.
For instance:
import pandas as pd
data_1 = pd.DataFrame({'One': [4, 6, 8, 10]}, columns=['One'])
data_1.describe()
Output:
             One
count   4.000000
mean    7.000000
std     2.581989
min     4.000000
25%     5.500000
50%     7.000000
75%     8.500000
max    10.000000
My question is: what is the mathematical formula used to calculate the 25% value?
1) Based on what I know, the formula is:
position = percentile * n (n is the number of values)
In this case:
25/100 * 4 = 1
So the value at the first position is 4, but according to describe it is 5.5.
2) Another example says: if the position is an integer, take the average of the value at that position and the next one, here 4 and 6, which gives 5. This still does not match the 5.5 given by describe.
3) Another textbook says: take the difference between the two numbers, multiply it by 25%, and add the result to the lower number:
25/100 * (6-4) = 1/4*2 = 0.5
4 + 0.5 = 4.5
This does not match 5.5 either.
So which formula does describe use?
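
The three hand calculations above can be reproduced side by side with the pandas result. The sketch below also tries a fourth rule that is not among the three attempts: linear interpolation at the zero-based position (n - 1) * q, which is how I understand the default interpolation='linear' option of Series.quantile to work. All variable names are illustrative only.

```python
import math
import pandas as pd

values = [4, 6, 8, 10]   # already sorted
q = 0.25
n = len(values)

# Attempt 1: position = q * n, take the value at that 1-based position.
pos = q * n
attempt_1 = values[int(pos) - 1]

# Attempt 2: q * n is an integer, so average that value with the next one.
attempt_2 = (values[int(pos) - 1] + values[int(pos)]) / 2

# Attempt 3: lower value plus q times the gap to the next value.
attempt_3 = values[0] + q * (values[1] - values[0])

# Assumed rule: linear interpolation at the zero-based position (n - 1) * q.
h = (n - 1) * q                  # fractional index into the sorted values
lower = math.floor(h)            # index of the value just below
upper = min(lower + 1, n - 1)    # index of the value just above
linear = values[lower] + (h - lower) * (values[upper] - values[lower])

# Value pandas itself returns for the same quantile.
pandas_q1 = pd.Series(values).quantile(q)

print(attempt_1, attempt_2, attempt_3, linear, pandas_q1)
```

On this data only the (n - 1) * q rule agrees with describe: the position is 3 * 0.25 = 0.75, so the result is 4 + 0.75 * (6 - 4) = 5.5.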