Generate random dates in a range in numpy

How can I generate random dates in a bimonthly date range in numpy? One way I can imagine is to create two arrays of random integers:

    bimonthly1 = np.random.randint(1, 15, 12)
    bimonthly2 = np.random.randint(16, 30, 12)

Then I can generate dates with day values from the two arrays above for each month. However, this requires me to supply the month and year explicitly. One solution would be to generate the desired date_range first and substitute the "days" in the range with the array values above. But for a large array this may not be the best solution, since it requires an operation on each element of the range.

I would appreciate any guidance on how to do this in numpy more efficiently.

4 answers

There is a much simpler way to achieve this, without having to explicitly access libraries outside of numpy.

Numpy's datetime64 data type is powerful enough for this: you can add and subtract integers, and each integer is interpreted in the value's own time unit, that is, the finest unit present when the value was created. For example, with day resolution (%Y-%m-%d):

    exampledatetime1 = np.datetime64('2017-01-01')
    exampledatetime1 + 1
    >> 2017-01-02

However, with second resolution (%Y-%m-%d %H:%M:%S):

    exampledatetime2 = np.datetime64('2017-01-01 00:00:00')
    exampledatetime2 + 1
    >> 2017-01-01T00:00:01

In this case, since you only need day resolution, you can simply do the following:

    import numpy as np

    bimonthly_days = np.arange(0, 60)
    base_date = np.datetime64('2017-01-01')
    random_date = base_date + np.random.choice(bimonthly_days)

Or, if you want something cleaner:

    import numpy as np

    def random_date_generator(start_date, range_in_days):
        # Candidate offsets: 0 .. range_in_days - 1 days after start_date
        days_to_add = np.arange(0, range_in_days)
        random_date = np.datetime64(start_date) + np.random.choice(days_to_add)
        return random_date

And then just use:

 yourdate = random_date_generator('2012-01-15', 60) 
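
The same integer-offset idea extends to many dates at once, which avoids a Python-level loop for large arrays. The sketch below is an illustration rather than part of the original answer: the function name `random_dates` and its `size` and `seed` parameters are hypothetical, and it uses numpy's `default_rng` generator instead of `np.random.choice`.

```python
import numpy as np

def random_dates(start_date, range_in_days, size, seed=None):
    """Return `size` random dates in [start_date, start_date + range_in_days)."""
    rng = np.random.default_rng(seed)
    # Integer day offsets drawn in one vectorized call (upper bound is exclusive).
    offsets = rng.integers(0, range_in_days, size=size)
    # Adding an integer array to a day-resolution datetime64 gives a datetime64[D] array.
    return np.datetime64(start_date, "D") + offsets

dates = random_dates("2012-01-15", 60, size=5, seed=0)
```

Every element of `dates` falls on or after the start date and strictly before the start date plus 60 days.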

You can create the date range up front, for example with pandas date_range, and convert it to a numpy array. Then make a random selection from that array of dates with numpy.random.choice.
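
A minimal sketch of this precompute-then-sample approach follows. To keep it dependent on numpy only, it builds the range with `np.arange` over `datetime64` values in place of pandas `date_range`; the two-month window and the sample size of 12 are assumptions chosen for illustration.

```python
import numpy as np

# Every calendar day in the first two months of 2017 (the end date is exclusive).
# np.arange over datetime64 stands in for pandas.date_range here.
all_days = np.arange(np.datetime64("2017-01-01"), np.datetime64("2017-03-01"))

rng = np.random.default_rng(0)
# Draw 12 dates, with replacement, from the precomputed range.
sampled = rng.choice(all_days, size=12)
```

This trades memory for simplicity: the whole range is materialized once, so it suits ranges of days or months better than very long or very fine-grained ranges.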


What if you define the start date as the first day of the month and then add a random timedelta?

E.g.:

    import datetime
    from calendar import monthrange

    import numpy as np

    d0 = datetime.datetime.strptime('01/01/2016', '%d/%m/%Y')
    max_day = monthrange(d0.year, d0.month)[1]  # number of days in the month

    random_dates_1 = []  # dates from the first half of the month
    random_dates_2 = []  # dates from the second half of the month
    for i in range(10):
        random_dates_1.append(
            d0 + datetime.timedelta(days=np.random.randint(0, max_day // 2)))
        # Upper bound is exclusive, so max_day keeps the offset inside the month
        random_dates_2.append(
            d0 + datetime.timedelta(days=np.random.randint(max_day // 2, max_day)))

Here is a clean numpy implementation that creates two arrays, each with one date per month of the year. The first array has a random date from the first half of each month, and the second a random date from the second half of each month.

    import datetime
    from calendar import monthrange

    import numpy as np

    arr_first = np.array([])
    arr_second = np.array([])
    for i in range(1, 13):
        base = datetime.datetime(2016, i, 1)
        max_days = monthrange(2016, i)[1]  # number of days in month i
        # randint excludes the upper bound, so offsets stay inside the month
        first = np.random.randint(0, max_days // 2)
        second = np.random.randint(max_days // 2, max_days)
        arr_first = np.append(arr_first, base + datetime.timedelta(days=first))
        arr_second = np.append(arr_second, base + datetime.timedelta(days=second))
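
The per-month loop can also be written without Python-level iteration by working with `datetime64` month values directly. This is a sketch of an alternative, not the answer's own code: it assumes the same year (2016) and the same half-month split, and it relies on `Generator.integers` accepting array bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

# The twelve months of 2016, and the first day of each as day-resolution values.
months = np.arange("2016-01", "2017-01", dtype="datetime64[M]")
month_starts = months.astype("datetime64[D]")
# Days in each month: distance from one month start to the next.
days_in_month = ((months + 1).astype("datetime64[D]") - month_starts).astype(int)

half = days_in_month // 2
# Zero-based day offsets; array bounds are broadcast and `high` is exclusive.
first_half = month_starts + rng.integers(0, half)
second_half = month_starts + rng.integers(half, days_in_month)
```

Both results are `datetime64[D]` arrays of length 12, so they can be compared, sorted, or passed to pandas without converting from Python `datetime` objects.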

Source: https://habr.com/ru/post/1260905/

