Lorenz curve graph

I need to plot a Lorenz curve: the cumulative share of a variable against the cumulative share of observations. I want both axes displayed as percentages. For example, the observations are customers and the variable y is the amount each one bought; the customers are already sorted in descending order, and I want a plot that says: "The top 10% of buyers purchased 90% of the total purchase amount." My dataset has several million observations.

What is the best way to do this? Sub-questions:

If I need to add two variables, the cumulative observation quantile and the cumulative total $ bought (to build the plot from them), what is the function that returns the row number? I tried:

user_quantile <- row(df)/nrow(df)

but I get a matrix of identical columns (user_quantile.1, user_quantile.2), of which I need only one.
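
For the row-number question, a minimal sketch: `row()` returns a matrix with one column per column of `df`, which is why the duplicated columns appear, whereas `seq_len(nrow(df))` gives a plain vector of row numbers. The data frame `df` and its column `amount` below are made-up stand-ins, assumed to be sorted in descending order already.

```r
# Hypothetical example data: purchase amounts, already sorted descending
df <- data.frame(amount = c(500, 300, 100, 60, 40))

# Row number as a plain vector, divided by the row count:
# cumulative share of customers
df$user_quantile <- seq_len(nrow(df)) / nrow(df)

# Cumulative share of the total amount bought
df$amount_share <- cumsum(df$amount) / sum(df$amount)

print(df)
```

Read row by row, each line of the result says "the top `user_quantile` of customers bought `amount_share` of the total".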

Is there any way to skip adding percentages as variables and only have them for axis values?

The plot has many more points than I need to draw the line. What is the best approach to minimize computational effort and still get a good plot?
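
One way to keep the plot cheap, sketched under assumptions: compute the cumulative shares once over all rows, then keep only a few hundred evenly spaced points for drawing. The percentages appear only as axis labels, so no extra columns are needed. The vector `amount` is simulated and `n_points` is an arbitrary choice.

```r
set.seed(1)
# Simulated stand-in for the real purchase amounts (assumption),
# sorted so the biggest buyers come first
amount <- sort(rlnorm(1e5, meanlog = 3, sdlog = 1.5), decreasing = TRUE)

n <- length(amount)
n_points <- 500  # arbitrary number of points to draw

# Evenly spaced row indices, from the first row to the last
idx <- unique(round(seq(1, n, length.out = n_points)))

x <- idx / n                            # cumulative share of customers
y <- cumsum(amount)[idx] / sum(amount)  # cumulative share of total amount

# Draw the curve with percentage labels on both axes
plot(c(0, x), c(0, y), type = "l", axes = FALSE,
     xlab = "Share of customers (top first)",
     ylab = "Share of total amount")
ticks <- seq(0, 1, by = 0.1)
axis(1, at = ticks, labels = paste0(ticks * 100, "%"))
axis(2, at = ticks, labels = paste0(ticks * 100, "%"))
abline(0, 1, lty = 2)  # line of perfect equality
box()
```

The cumulative sum is still computed over every row, so the thinning affects only how many points are drawn, not the values on the curve.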

Thanks.

+3
2 answers

You can check out the excellent RSeek search engine for R content. A quick query for "Lorenz curve" (and "Lorentz curve") turns up the following packages:

, , .

+8


1) cut2() Hmisc, . , . cut() .

2) cut2() . table(). .

3) : Decile, % . 45- . % .

finaltable$cumulative_equality_line = seq(0.1, 1, by = 0.1)

4) ggplot2 . , 3 , , .
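
A minimal base-R sketch of a decile table with an equality line, in the spirit of the steps above. It uses `cut()` rather than `cut2()` to avoid the Hmisc dependency, and base `plot()` rather than ggplot2. The data and every name except `finaltable$cumulative_equality_line` are invented for illustration.

```r
set.seed(42)
# Invented purchase amounts, sorted so the biggest buyers come first
amount <- sort(rexp(1000, rate = 1 / 50), decreasing = TRUE)
n <- length(amount)

# Assign each customer to a decile by rank (1 = top 10% of buyers)
decile <- cut(seq_len(n) / n, breaks = seq(0, 1, by = 0.1), labels = 1:10)

# Total amount per decile, then the cumulative share of the grand total
decile_total <- tapply(amount, decile, sum)
finaltable <- data.frame(
  decile = 1:10,
  cumulative_share = as.numeric(cumsum(decile_total) / sum(amount))
)

# 45-degree reference line: each decile holds 10% of the customers
finaltable$cumulative_equality_line <- seq(0.1, 1, by = 0.1)

# Ten points are enough to draw the curve, however large the data set
plot(finaltable$cumulative_equality_line, finaltable$cumulative_share,
     type = "b", xlab = "Cumulative share of customers",
     ylab = "Cumulative share of amount")
lines(finaltable$cumulative_equality_line,
      finaltable$cumulative_equality_line, lty = 2)
```

Because the table has only ten rows, plotting cost no longer depends on the number of observations; the price is a coarser curve.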


0

Source: https://habr.com/ru/post/1748874/

