I am working on a multi-threaded scraper for a website, and on another issue, I decided to use ThreadPool with QueueUserWorkItem ().
How can I constantly perform operations with parts without queuing in the queue? I need to queue> 300 thousand Elements (one for each user ID), and if I loop them in the queue, I will run out of memory.
So, I would like to:
// 1 = startUserID, 300000 = endUserID, 25 = MaxThreads
Scraper webScraper = new Scraper(1, 300000, 25);
webScraper.Start();
// return immediately while webScraper runs in the background
During this time, webScraper continuously adds all 300,000 work items as threads become available.
Here is what I still have:
public class Scraper
{
private int MaxUserID { get; set; }
private int MaxThreads { get; set; }
private static int CurrentUserID { get; set; }
private bool Running { get; set; }
private Parser StatsParser = new Parser();
public Scraper()
: this(0, Int32.MaxValue, 25)
{
}
public Scraper(int CurrentUserID, int MaxUserID, int MaxThreads)
{
this.CurrentUserID = CurrentUserID;
this.MaxUserID = MaxUserID;
this.MaxThreads = MaxThreads;
this.Running = false;
ThreadPool.SetMaxThreads(MaxThreads, MaxThreads);
}
public void Start()
{
int availableThreads;
while (Running)
{
}
}
public void Stop()
{
Running = false;
}
public static void process(object state)
{
var userID = Interlocked.Increment(ref CurrentUserID);
... Fetch Stats for userID
}
}
Is this the right approach?
- Start() ?