I have a script to scrape ~1000 web pages. I use Promise.all to fire them off in parallel, and it resolves when all pages are done:
```
Promise.all(urls.map(url => scrap(url)))
  .then(results => console.log('all done!', results));
```
This is nice and correct, with one exception: the machine runs out of memory because of the parallel requests. I use jsdom for scraping, and it quickly takes up several GB of memory, which is understandable given that it instantiates hundreds of `window` objects at once.
I have an idea to fix it, but I don't like it. That is, change the control flow so as not to use Promise.all, but chain my promises sequentially instead:
```
let results = {};
urls.reduce((prev, cur) =>
  prev
    .then(() => scrap(cur))
    .then(result => results[cur] = result), // ^ not so nice
  Promise.resolve())
  .then(() => console.log('all done!', results));
```
This is not as good as Promise.all: it's inefficient because the requests run strictly one after another, and the return values have to be stashed in an outer object for later processing instead of being collected by the promise chain itself.
Any suggestions? Should I improve the control flow, reduce the memory usage inside `scrap()`, or is there a way to make Node throttle the allocations?
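To make the control-flow option concrete: what I'm imagining is something like a bounded worker pool, where a fixed number of scrapes run concurrently and each worker pulls the next URL when it finishes. The `scrap` stub below is just a placeholder for the real jsdom-based function:

```javascript
// Placeholder standing in for the real jsdom-based scraper (assumption:
// the real scrap(url) returns a promise resolving to the page result).
async function scrap(url) {
  return `content of ${url}`;
}

// Run at most `limit` scrapes at a time. Each worker loops, grabbing the
// next unprocessed URL until the list is exhausted. Since JS is
// single-threaded, the `index++` read-and-increment is race-free.
async function scrapWithLimit(urls, limit) {
  const results = {};
  let index = 0;
  async function worker() {
    while (index < urls.length) {
      const url = urls[index++];
      results[url] = await scrap(url);
    }
  }
  // Start `limit` workers and wait until all of them drain the queue.
  await Promise.all(Array.from({ length: limit }, () => worker()));
  return results;
}

scrapWithLimit(['u1', 'u2', 'u3'], 2)
  .then(results => console.log('all done!', results));
```

With `limit` set to, say, 10, only ten jsdom windows would ever exist at once, which should keep memory bounded while still being much faster than a fully serial chain. Is this the right direction, or is there a more idiomatic way?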