You can do this the same way that it is implemented at the kernel level. You need to have a linked list of "timers" where each timer has a timeout compared to the previous one. It would be something like: Timer1: 500 ms from t0, Timer2: 400 ms from t0, timer3 1000 ms from t0.
Then you will have a linked list in which each item has a timeout compared to the previous one, for example:
head-> Timer2 (400ms) → Timer1 (100ms) → Timer 3 (500ms)
Each element contains a minimum: timerID, relative timeout, absolute init time (timestamp from an era). You can add a callback pointer to each timer.
You use your only timer and set the timeout to the relative timeout of the first item in the list: 400 ms (Timer2)
After the timeout, you will remove the first element, perhaps make a callback associated with Timer2, ideally this callback is made with a different workflow. Then you set a new timeout with the relative timeout of the next item, Timer1: 100ms.
Now that you need to create a new timer, say, 3000 ms, after 300 ms from t0, you need to insert it into the correct position by moving the linked list of timers. The relative timeout in Timer4 is 2300. This is calculated using (Timer2.RelativeTimeout - (now Timer2.AbsoluteTimeout)) and looks through the linked list to find the corresponding position, adding the relative timeouts of each previous element. Your linked list will be as follows:
head-> Timer2 (400ms) → Timer1 (100ms) → Timer 3 (500ms) → Timer4 (2300)
This way you implement many logical timers with one physical timer. Your timer creates and finds the O (n) time, but you can add various improvements for insert performance. Most importantly, the timers and updates are O (1). Delete complexity will be O (n) for timer search and O (1) for delete.
You must take care of the possible race conditions between the thread controlling the timer and the thread inserting or deleting the timer. One way to implement this timer in user space is through variable conditions and timeouts.