Use Case:
I consume a REST API that returns the results of battles in a video game. It is a team-versus-team online game, and each team consists of 3 players who can each choose from 100 different characters. I want to count the number of wins, losses, and draws for each combination of teams. I receive about 1000 battle results per second. I concatenate the character identifiers of each team (in ascending order), and then save the wins, losses, and draws for each combination.
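For context, building that combination key amounts to sorting each team's character IDs ascending and joining them. A minimal sketch (the function names are my own, not from my actual implementation):

```typescript
// Canonical key for one team: sort character IDs ascending, then join.
function teamKey(characterIds: number[]): string {
  return [...characterIds].sort((a, b) => a - b).join("-");
}

// Key for a full matchup: order the two team keys so that
// "A vs B" and "B vs A" map to the same combination.
function matchupKey(teamA: number[], teamB: number[]): string {
  return [teamKey(teamA), teamKey(teamB)].sort().join("_vs_");
}
```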
My current implementation:
```typescript
const combinationStatsSchema: Schema = new Schema({
  combination: { type: String, required: true, index: true },
  gameType: { type: String, required: true, index: true },
  wins: { type: Number, default: 0 },
  draws: { type: Number, default: 0 },
  losses: { type: Number, default: 0 },
  totalGames: { type: Number, default: 0, index: true },
  battleDate: { type: Date, index: true, required: true }
});
```
For each log returned, I execute an upsert and send these requests in bulk (5-30 operations per batch) to MongoDB:
```typescript
const filter: any = { combination: log.teamDeck, gameType, battleDate };
if (battleType === BattleType.PvP) {
  filter.arenaId = log.arena.id;
}
const update: {} = { $inc: { draws, losses, wins, totalGames: 1 } };
combiStatsBulk.find(filter).upsert().updateOne(update);
```
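For reference, the same batch can also be expressed as a `bulkWrite` operations array; building that array is pure data transformation and can be sketched as follows (the `BattleLog` shape is an assumption on my part):

```typescript
// Assumed shape of one decoded battle log.
interface BattleLog {
  teamDeck: string;
  wins: number;
  draws: number;
  losses: number;
}

// Build the ops array for collection.bulkWrite(ops).
// Each op is an upsert that increments the counters for one combination.
function buildBulkOps(logs: BattleLog[], gameType: string, battleDate: Date) {
  return logs.map((log) => ({
    updateOne: {
      filter: { combination: log.teamDeck, gameType, battleDate },
      update: {
        $inc: {
          wins: log.wins,
          draws: log.draws,
          losses: log.losses,
          totalGames: 1,
        },
      },
      upsert: true,
    },
  }));
}
```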
My problem:
So far I have only a few thousand entries in my combinationStats collection, and MongoDB uses only 0-2% CPU. But once the collection holds several million documents (which happens quite quickly given the number of possible combinations), MongoDB constantly sits at 50-100% CPU. Obviously, my approach does not scale at all.
My question is:
Either of the following options could be the solution to my problem:
- Can I optimize my MongoDB solution described above so that it does not consume so much CPU? (I have already indexed the fields I filter on, and I do the upserts in bulk.) Would it help to create a hash over all the filter fields and filter on that single field instead, to improve performance?
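To illustrate the hash idea: all filter fields could be collapsed into one deterministic key, so the upsert filter matches on a single indexed field rather than several. A sketch, assuming Node's built-in crypto module (the field names are illustrative):

```typescript
import { createHash } from "crypto";

// Collapse arbitrary filter fields into one deterministic hash key.
// Keys are sorted first so that field order does not change the result.
function filterHash(fields: Record<string, string>): string {
  const canonical = Object.keys(fields)
    .sort()
    .map((key) => `${key}=${fields[key]}`)
    .join("|");
  return createHash("sha1").update(canonical).digest("hex");
}
```

The document would then store this hash in a single indexed field and the upsert filter would be just `{ filterHash: ... }`.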
- Is there a database/technology better suited to aggregating data like this? I could name a couple more use cases where I need to increment a counter for a given identifier.
Edit: After khang commented that the upserts might be the performance problem, I replaced $inc with $set, and the performance was indeed equally "poor". So I also tried the suggested find() followed by a manual update() approach, but the results did not improve.