I've noticed that my program has a serious memory leak: memory consumption keeps spiraling upward. I had to parallelize an NLP task (using the StanfordNLP EnglishPCFG Parser and Tregex Matcher), so I built an actor pipeline (six actors for each stage):
val listOfTregexActors = (0 to 5).map(m => system.actorOf(Props(new TregexActor(timer, filePrinter)), "TregexActor" + m))
val listOfParsers = (0 to 5).map(n => system.actorOf(Props(new ParserActor(timer, listOfTregexActors(n), lp)), "ParserActor" + n))
val listOfSentenceSplitters = (0 to 5).map(j => system.actorOf(Props(new SentenceSplitterActor(listOfParsers(j), timer)), "SplitActor" + j))
My actors are pretty standard. They need to stay alive to process all the incoming work, so I can't send them a PoisonPill. Memory consumption keeps growing, and I have no idea why. If I run the same workload on a single thread, memory consumption is fine. I read somewhere that as long as actors don't die, nothing inside them gets released. Should I release things manually?
There are two heavy actors:
https://github.com/windweller/parallelAkka/blob/master/src/main/scala/blogParallel/ParserActor.scala
https://github.com/windweller/parallelAkka/blob/master/src/main/scala/blogParallel/TregexActor.scala
I wonder if a Scala closure (or some other mechanism) is holding on to too much data, so that the GC can somehow never collect it.
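For reference, Scala closures do hold strong references to everything they capture. A minimal illustration in plain Scala (no Akka; the names `summarizer` and `ClosureDemo` are made up for this example):

```scala
object ClosureDemo {
  // The returned function captures `data`, so the GC cannot reclaim the
  // array for as long as the function itself is referenced anywhere.
  def summarizer(data: Array[Int]): () => Int =
    () => data.sum

  def main(args: Array[String]): Unit = {
    val f = summarizer(Array(1, 2, 3))
    // Even with no other reference to the array, `f` keeps it reachable.
    println(f()) // prints 6
  }
}
```

So if an actor (or anything reachable from one) captures a large structure in a closure, that structure lives as long as the closure does.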
Here is the TregexActor part:
def receive = {
  case Match(rows, sen) =>
    println("Entering Pattern matching: " + rows(0))
    val result = patternSearching(sen)
    filePrinter ! Print(rows :+ sen.toString, result)
}

def patternSearching(tree: Tree): List[Array[Int]] = {
  val statsFuture = search(patternFuture, tree)
  val statsPast = search(patternsPast, tree)
  List(statsFuture, statsPast)
}

def search(patterns: List[String], tree: Tree): Array[Int] = {
  val stats = Array.fill[Int](patterns.size)(0)
  for (i <- patterns.indices) {
    val searchPattern = TregexPattern.compile(patterns(i))
    val matcher = searchPattern.matcher(tree)
    if (matcher.find()) {
      stats(i) = stats(i) + 1
    }
    timer ! PatternAddOne
  }
  stats
}
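One thing I could try: `search` recompiles every Tregex pattern on every message, so a low-risk change would be compiling each pattern once and reusing it. Below is a minimal, generic sketch of such a compile-once cache (plain Scala, no Akka or Stanford dependency; in the real code `P` would be `TregexPattern` and `compile` would be `TregexPattern.compile` — both are assumptions of this sketch):

```scala
import scala.collection.concurrent.TrieMap

// Compile-once, thread-safe cache keyed by the pattern string.
// `P` stands in for TregexPattern; `compile` stands in for
// TregexPattern.compile in this sketch.
final class CompiledPatternCache[P](compile: String => P) {
  private val cache = TrieMap.empty[String, P]

  // Returns the cached compiled pattern, compiling only on a cache miss.
  def get(pattern: String): P =
    cache.getOrElseUpdate(pattern, compile(pattern))
}

object CacheDemo {
  def main(args: Array[String]): Unit = {
    var compiles = 0
    // A stand-in "compiler" that just measures the string, so we can
    // count how many times compilation actually runs.
    val cache = new CompiledPatternCache[Int]({ s => compiles += 1; s.length })
    cache.get("NP < VP")
    cache.get("NP < VP") // served from the cache, no second compile
    println(compiles)    // prints 1
  }
}
```

Whether this fixes the leak or just the CPU cost, sharing one cache per actor at least bounds how many compiled patterns exist.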
Or, if my code checks out, could the leak be inside the StanfordNLP parser or the Tregex matcher itself? Is there a strategy for manually freeing memory, or do I need to kill these actors after a while and hand their mailbox tasks to fresh actors to release the memory? (If so, how?)
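On the last point, one pattern I've seen is a supervisor that recycles its worker after a fixed number of messages. A hedged sketch against the classic Akka actor API (it assumes the post's Akka setup; `workerProps` would be, e.g., `Props(new TregexActor(...))`, and the threshold logic is invented for the example, not a drop-in fix):

```scala
import akka.actor.{Actor, ActorRef, PoisonPill, Props, Terminated}

// Sketch: forward messages to a worker, and replace the worker after
// `limit` messages so whatever it has accumulated becomes collectible.
class RecyclingSupervisor(workerProps: Props, limit: Int) extends Actor {
  private var worker: ActorRef = spawn()
  private var handled = 0

  private def spawn(): ActorRef = {
    val w = context.actorOf(workerProps)
    context.watch(w) // delivers Terminated once the worker fully stops
    w
  }

  def receive = {
    case Terminated(_) =>
      worker = spawn() // old worker is gone; start a fresh one
    case msg =>
      worker forward msg
      handled += 1
      if (handled >= limit) {
        // PoisonPill is enqueued behind messages already in the mailbox,
        // so in-flight work finishes before the worker stops. Caveat:
        // messages forwarded between PoisonPill and Terminated go to
        // dead letters; a production version would stash them.
        worker ! PoisonPill
        handled = 0
      }
  }
}
```

This only helps if the retained memory really lives in per-actor state; if a shared object (like the parser) is what grows, recycling actors won't release it.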

After some struggle with profiling tools, I finally got VisualVM working with IntelliJ. Here are the screenshots; GC never ran.

[VisualVM screenshot]

[VisualVM screenshot]

Pipeline: SentenceSplit (6) → Parser (6) → Tregex (6) → FilePrinter
Entry.scala: https://github.com/windweller/parallelAkka/blob/master/src/main/scala/blogParallel/Entry.scala