Hello!
I am trying to calculate chess-style ratings for several players in 6 different skills (C1, C2, ... C6). I have a large data frame (data) of games that looks like this (head(data)). In each game, one person (user) chooses which of two other people (p1 / p2) wins.
    row.names  user  p1  p2  skill  win  looser  time
    ---------------------------------------------------------
    2          KE    CL  HK  C1     CL   HK      433508371
    25         KE    HK  JT  c1     HK   JT      433508401
    35         KE    AB  JT  C1     AB   JT      433508444
    110        NF    IP  HE  C1     HE   IP      433508837
    78         NF    IP  AS  C1     AS   IP      433508848
    82         NF    IT  CV  C1     CV   IT      433508860
In another table (old_users), I keep track of each player's current rating in the 6 skills (head(old_users)):
       user  C1    C2    C3    C4    C5    C6
    1  BD    1200  1200  1200  1200  1200  1200
    2  NF    1200  1200  1200  1200  1200  1200
    3  CH    1200  1200  1200  1200  1200  1200
    4  AR    1200  1200  1200  1200  1200  1200
    5  AS    1200  1200  1200  1200  1200  1200
    6  MS    1200  1200  1200  1200  1200  1200
Algorithm
The loop goes through data one row at a time. For the i-th row, it looks up the current ratings of p1 and p2 in old_users for that row's skill, calculates their new ratings based on who won and who lost, and then updates the corresponding cells of old_users with the new ratings.
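To make the update step concrete, here is a minimal sketch of one pass over a few games. It is not my actual loop: it assumes a standard Elo update with K = 32, the helper name elo_update is made up for illustration, and it uses a small numeric matrix (ratings) as a stand-in for old_users, with three games modelled on the rows shown above.

```r
# Hypothetical helper: standard Elo update for one game.
# K = 32 and the 400-point scale are assumptions, not taken from my real code.
elo_update <- function(r_win, r_lose, k = 32) {
  # Expected score of the eventual winner before the game
  expected_win <- 1 / (1 + 10^((r_lose - r_win) / 400))
  delta <- k * (1 - expected_win)
  c(winner = r_win + delta, loser = r_lose - delta)
}

# Small stand-in for `data` (only the columns the update needs)
games <- data.frame(
  skill  = c("C1", "C1", "C1"),
  win    = c("CL", "HK", "AB"),
  looser = c("HK", "JT", "JT"),
  stringsAsFactors = FALSE
)

# Stand-in for `old_users`: rows = players, columns = skills, all start at 1200
players <- c("CL", "HK", "JT", "AB")
ratings <- matrix(1200, nrow = length(players), ncol = 6,
                  dimnames = list(players, paste0("C", 1:6)))

# One row at a time: look up both ratings, update, write back
for (i in seq_len(nrow(games))) {
  w <- games$win[i]
  l <- games$looser[i]
  s <- games$skill[i]
  new <- elo_update(ratings[w, s], ratings[l, s])
  ratings[w, s] <- new[["winner"]]
  ratings[l, s] <- new[["loser"]]
}

print(round(ratings[, "C1"], 2))
```

Each game has to see the ratings left behind by the previous one, which is why the rows are processed in order.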
What I need to do
I need this to run as quickly as possible. With the data frame data, which now has 6000+ rows for a total of 24 players, it takes quite some time.
I timed my current for loop and got the following, which is far too slow:
    user    system  elapsed
    104.72  0.28    118.02
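For reference, timings in this format come from system.time(). A minimal sketch of the measurement, where run_loop is a hypothetical placeholder standing in for the real loop:

```r
# Hypothetical placeholder for the real rating loop
run_loop <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# system.time() evaluates the expression and reports user, system and elapsed seconds
timing <- system.time(result <- run_loop(1e5))
print(timing)
```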
Questions
- Why is this algorithm taking so long? Are there any commands that are not well suited to loops, etc.?
- How can I achieve what I want faster?
Current for loop
    for (i in 1:dim(data)[1]) {
      tmp_data <- data[i, ]
Data to play with
https://drive.google.com/file/d/0BxE_CHLUGoS0WlczUkxLM3VtVjQ/edit?usp=sharing
Any help is much appreciated
Thanks!
// HK