dplyr / ddply transformation (percentage change) in R


I have a data.frame that looks like this:

    brand       year       eur
    brand1      2015       10
    brand1      2016       20
    brand2      2015       100
    brand2      2016       500
    brand3      2015       25
    brand4      2015       455
    ...

I also attach the code that generates the data below:

    library(plyr)
    library(dplyr)
    library(scales)

    set.seed(1992)
    n = 68

    year <- sample(c("2015", "2016"), n, replace = TRUE, prob = NULL)
    brand <- sample("brand", n, replace = TRUE, prob = NULL)
    brand <- paste0(brand, sample(1:5, n, replace = TRUE, prob = NULL))
    eur <- abs(rnorm(n)) * 100000

    df <- data.frame(year, brand, eur)

I need some additional data transformations (adding more columns) for future research.

First, I calculate the label positions (for a future chart) and call the column pos:

    df.summary = df %>% group_by(brand, year) %>%
      summarise(eur = sum(eur)) %>%   #
      mutate(pos = cumsum(eur) - 0.5 * eur)

What I want is to calculate the percentage growth of each brand from one year to the next. I added this line:

    df.summary = ddply(df.summary, .(brand), transform,
                       pchange = (sum(df.summary[df.summary$year == "2016",]$eur) /
                                  sum(df.summary[df.summary$year == "2015",]$eur)) - 1)

However, I get the same constant value in every row: the growth of the whole data frame.

Could you please help me calculate the percentage change for each brand?

Thanks!

It is also easier if we use lag:

    df.summary %>% group_by(brand) %>%
      mutate(pchange = (eur - lag(eur)) / lag(eur) * 100)

    # Source: local data frame [10 x 5]
    # Groups: brand [5]
    #
    #     brand   year      eur      pos   pchange
    #    <fctr> <fctr>    <dbl>    <dbl>     <dbl>
    # 1  brand1   2015 637896.7 318948.3        NA
    # 2  brand1   2016 721944.2 998868.8  13.17573
    # 3  brand2   2015 708697.6 354348.8        NA
    # 4  brand2   2016 300541.1 858968.2 -57.59248
    # 5  brand3   2015 454890.1 227445.1        NA
    # 6  brand3   2016 576095.6 742937.9  26.64500
    # 7  brand4   2015 305712.0 152856.0        NA
    # 8  brand4   2016 174073.3 392748.6 -43.05970
    # 9  brand5   2015 589970.7 294985.3        NA
    # 10 brand5   2016 518510.2 849225.8 -12.11254

As suggested by @r2evans, if year is not arranged beforehand, sort by it first:

    df.summary %>% group_by(brand) %>% arrange(year) %>%
      mutate(pchange = (eur - lag(eur)) / lag(eur) * 100)
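To check the lag approach by hand, here is a minimal sketch on the first four rows of the table from the question. The names df_small and result are made up for this illustration, and the rows are deliberately put out of order to show why the sort matters. It assumes only dplyr is attached.

```r
library(dplyr)

# Toy data: the first four rows of the table in the question.
# df_small is a hypothetical name used only for this sketch.
df_small <- data.frame(
  brand = c("brand1", "brand1", "brand2", "brand2"),
  year  = c("2016", "2015", "2015", "2016"),  # deliberately unsorted
  eur   = c(20, 10, 100, 500)
)

result <- df_small %>%
  arrange(brand, year) %>%   # make sure 2015 comes before 2016 within each brand
  group_by(brand) %>%        # lag() then looks back only within the same brand
  mutate(pchange = (eur - lag(eur)) / lag(eur) * 100) %>%
  ungroup()

print(result)
```

With these numbers, brand1 goes from 10 to 20, so its 2016 row gets pchange = 100, and brand2 goes from 100 to 500, so its 2016 row gets 400. The 2015 rows are NA because there is no earlier year to compare against.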
