I would like to ask for your help with reducing data in arrays in Python. I am new to Python, but I have some programming experience.
The problem is this: I have an array S of n elements that comes from a sensor's measurements and is accompanied by four other arrays giving the year, month, day and hour of each measurement (y_lna, m_lna, d_lna and h_lna). I also have another array T of m elements, accompanied by its own four arrays (y, m, d, h). I want to create a vector of the same size as S that holds the values from T whose hour, day, month and year match those of the S values.
The data are ordered chronologically, from the first year to the last, as follows:
Data h d m y
d1 00 1 1 2003
d2 03 1 1 2003
...
dn 10 5 8 2009
I wrote a function that does this, but I'm not sure it is correct. It also takes a long time because of the number of iterations it performs. Is there a more efficient way to do this? I also don't know how to deal with NaN values.
import numpy as np

def reduce_data(h, d, m, y, h_lna, d_lna, m_lna, y_lna, data):
    year = np.linspace(2003, 2016, 14, True)
    month = np.linspace(1, 12, 12, True)
    new_data = []
    for a in year:
        # indices of each series that fall in year a
        ind1 = [i for i in range(len(y)) if y[i] == a]
        ind1_l = [i for i in range(len(y_lna)) if y_lna[i] == a]
        for b in range(len(month)):
            # narrow the indices down to month b + 1
            ind2 = [i for i in ind1 if m[i] == b + 1]
            ind2_l = [i for i in ind1_l if m_lna[i] == b + 1]
            for c in range(len(ind2)):  # days
                ind3 = [i for i in ind2 if d[i] == c]
                ind3_l = [i for i in ind2_l if d_lna[i] == c]
                for dd in range(len(ind3)):
                    for e in range(len(ind3_l)):
                        # keep the value when the hours also match
                        if h[ind3[dd]] == h_lna[ind3_l[e]]:
                            new_data.append(data[ind3[dd]])
    return new_data
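For comparison, here is a minimal sketch of one possible alternative to the nested loops above: build a dictionary keyed by (year, month, day, hour) once, then look up each reference timestamp in it. The name `reduce_data_lookup` is hypothetical, and the sketch assumes the time arrays themselves contain no NaN. Unlike the function above, it returns one entry per reference timestamp and fills in NaN where T has no matching measurement. NaN values already present in `data` are simply carried over.

```python
import math

def reduce_data_lookup(h, d, m, y, h_lna, d_lna, m_lna, y_lna, data):
    """Return one value from `data` per reference timestamp (NaN if none matches)."""
    # Map each (year, month, day, hour) of the T series to its value.
    # If a timestamp occurs more than once, the last value wins.
    lookup = {}
    for key, value in zip(zip(y, m, d, h), data):
        lookup[key] = value
    # One dictionary lookup per reference timestamp; missing ones become NaN.
    return [lookup.get(key, math.nan) for key in zip(y_lna, m_lna, d_lna, h_lna)]

# Tiny made-up example: T has hours 0, 1, 2; the reference asks for hours 1 and 5.
values = reduce_data_lookup(
    [0, 1, 2], [6, 6, 6], [2, 2, 2], [2003, 2003, 2003],   # h, d, m, y
    [1, 5], [6, 6], [2, 2], [2003, 2003],                  # h_lna, d_lna, m_lna, y_lna
    [864, 865, 866],                                       # data (T)
)
```

This visits each element of T and of the reference arrays once, so the cost grows with n + m instead of with the repeated index scans of the nested loops.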
I appreciate your cooperation
EDIT: I am adding the data that I am working with. The sensor values are not real (I replaced them with random data), but the time values are real (only for one year). DATA 1 contains the sensor data S, whose time variables are the reference used for the reduction; DATA 2 contains the sensor data T with its time variables; and RESULT shows the expected output.
DATA 1
S h_lna d_lna m_lna y_lna
0 0 8 6 2 2003
1 2 9 6 2 2003
2 4 10 6 2 2003
3 6 11 6 2 2003
4 8 12 6 2 2003
5 10 13 6 2 2003
6 12 14 6 2 2003
7 14 15 6 2 2003
8 16 16 6 2 2003
9 18 17 6 2 2003
10 20 18 6 2 2003
DATA 2
T h d m y
0 864 0 6 2 2003
1 865 1 6 2 2003
2 866 2 6 2 2003
3 867 3 6 2 2003
4 868 4 6 2 2003
5 869 5 6 2 2003
6 870 6 6 2 2003
7 871 7 6 2 2003
8 872 8 6 2 2003
9 873 9 6 2 2003
10 874 10 6 2 2003
11 875 11 6 2 2003
12 876 12 6 2 2003
13 877 13 6 2 2003
14 878 14 6 2 2003
15 879 15 6 2 2003
16 880 16 6 2 2003
17 881 17 6 2 2003
18 882 18 6 2 2003
19 883 19 6 2 2003
20 884 20 6 2 2003
21 885 21 6 2 2003
22 886 22 6 2 2003
23 887 23 6 2 2003
24 888 0 7 2 2003
25 889 1 7 2 2003
26 890 2 7 2 2003
27 891 3 7 2 2003
28 892 4 7 2 2003
29 893 5 7 2 2003
30 894 6 7 2 2003
31 895 7 7 2 2003
32 896 8 7 2 2003
33 897 9 7 2 2003
34 898 10 7 2 2003
RESULT
result h_lna d_lna m_lna y_lna
0 872 8 6 2 2003
1 873 9 6 2 2003
2 874 10 6 2 2003
3 875 11 6 2 2003
4 876 12 6 2 2003
5 877 13 6 2 2003
6 878 14 6 2 2003
7 879 15 6 2 2003
8 880 16 6 2 2003
9 881 17 6 2 2003
10 882 18 6 2 2003
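To make the expected result concrete, here is a hedged NumPy sketch that rebuilds the sample above and matches the two series through a single integer key per timestamp. The helper `time_key` and the key layout (year, month, day, hour packed as decimal digits) are assumptions made for illustration, and the sketch assumes the time columns contain no NaN. Reference timestamps with no match in T are left as NaN.

```python
import numpy as np

def time_key(y, m, d, h):
    """Pack year, month, day and hour into one sortable integer per sample."""
    y, m, d, h = (np.asarray(a, dtype=np.int64) for a in (y, m, d, h))
    return ((y * 100 + m) * 100 + d) * 100 + h

# DATA 2: T = 864..898, hourly, starting at hour 0 of day 6, month 2, 2003
T = np.arange(864, 899, dtype=float)
h = np.arange(35) % 24
d = 6 + np.arange(35) // 24
m = np.full(35, 2)
y = np.full(35, 2003)

# DATA 1 time columns: hours 8..18 of day 6, month 2, 2003
h_lna = np.arange(8, 19)
d_lna = np.full(11, 6)
m_lna = np.full(11, 2)
y_lna = np.full(11, 2003)

key_t = time_key(y, m, d, h)
key_s = time_key(y_lna, m_lna, d_lna, h_lna)

order = np.argsort(key_t)                    # sort T keys for binary search
pos = np.searchsorted(key_t[order], key_s)   # candidate position of each S key
pos = np.clip(pos, 0, len(key_t) - 1)        # keep indices in range
idx = order[pos]                             # back to positions in T
found = key_t[idx] == key_s                  # True where the timestamp exists in T

result = np.full(len(key_s), np.nan)         # NaN where T has no matching time
result[found] = T[idx[found]]
print(result)
```

With the sample data every reference timestamp is present in T, so `result` holds the values 872 to 882 listed in RESULT.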