Java: the sum of two or more time series

I have several time series:

x
| date | value |
| 2017-01-01 | 1 |
| 2017-01-05 | 4 |
| ... | ... |

y
| date | value |
| 2017-01-03 | 3 |
| 2017-01-04 | 2 |
| ... | ... |

The catch in my dataset is that the dates do not always match across the series. When a series has no entry for a given date, I want to use its value from the last available earlier date (or 0 if there is none). For example, for 2017-01-03 I would use y = 3 and x = 1 (from the preceding date) to get output = 3 + 1 = 4.

I have each time series in the form:

    class Timeseries {
        List<Event> x = ...;
    }

    class Event {
        LocalDate date;
        Double value;
    }

and read them into a List<Timeseries> allSeries.

I thought I could sum them using streams:

    List<Timeseries> allSeries = ...
    Map<LocalDate, Double> byDate = allSeries.stream()
        .flatMap(s -> s.getEvents().stream())
        .collect(Collectors.groupingBy(Event::getDate,
                 Collectors.summingDouble(Event::getValue)));

But this does not include the missing-date logic I described above.

How else can I achieve this? (It doesn't have to use streams.)

3 answers

I would say you need to extend the Timeseries class with a suitable query function.

    class Timeseries {
        // Values indexed by date, kept sorted for "latest value before" lookups
        private final SortedMap<LocalDate, Double> eventValues = new TreeMap<>();
        private final List<Event> eventList;

        public Timeseries(List<Event> events) {
            events.forEach(e -> eventValues.put(e.getDate(), e.getValue()));
            eventList = new ArrayList<>(events);
        }

        public List<Event> getEvents() {
            return Collections.unmodifiableList(eventList);
        }

        public Double getValueByDate(LocalDate date) {
            Double value = eventValues.get(date);
            if (value == null) {
                // get values before the requested date
                SortedMap<LocalDate, Double> head = eventValues.headMap(date);
                value = head.isEmpty()
                    ? 0.0                        // none before
                    : head.get(head.lastKey());  // closest before
            }
            return value;
        }
    }

Then, to merge:

    Map<LocalDate, Double> values = new TreeMap<>();
    List<LocalDate> allDates = allSeries.stream()
        .flatMap(s -> s.getEvents().stream())
        .map(Event::getDate)
        .distinct()
        .collect(Collectors.toList());

    for (LocalDate date : allDates) {
        for (Timeseries series : allSeries) {
            // add this series' value (or its latest earlier value) to the date's total
            values.merge(date, series.getValueByDate(date), Double::sum);
        }
    }

Edit: in fact, the NavigableMap interface is even more useful here; it makes the missing-data case simpler:

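    // assumes eventValues is declared as NavigableMap<LocalDate, Double>
    Double value = eventValues.get(date);
    if (value == null) {
        // floorEntry returns the entry with the greatest key <= date,
        // i.e. the last known value (ceilingKey would look forward instead)
        Map.Entry<LocalDate, Double> floor = eventValues.floorEntry(date);
        value = floor != null ? floor.getValue() : 0.0;
    }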
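For reference, here is a minimal self-contained sketch of the lookup-and-merge idea from this answer. It assumes each series is held directly as a NavigableMap<LocalDate, Double> and that "last available value" means the entry on or before the requested date, which is what NavigableMap.floorEntry returns. The class name SeriesSum and its two methods are hypothetical and used only for illustration.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of the original answer.
class SeriesSum {

    // Value of one series at `date`: the entry on or before it, or 0.0 if there is none.
    static double valueAt(NavigableMap<LocalDate, Double> series, LocalDate date) {
        Map.Entry<LocalDate, Double> floor = series.floorEntry(date);
        return floor != null ? floor.getValue() : 0.0;
    }

    // Sum all series on every date that occurs in at least one of them.
    static NavigableMap<LocalDate, Double> sum(List<NavigableMap<LocalDate, Double>> allSeries) {
        // Union of all dates, sorted
        TreeSet<LocalDate> allDates = new TreeSet<>();
        for (NavigableMap<LocalDate, Double> series : allSeries) {
            allDates.addAll(series.keySet());
        }

        NavigableMap<LocalDate, Double> totals = new TreeMap<>();
        for (LocalDate date : allDates) {
            double total = 0.0;
            for (NavigableMap<LocalDate, Double> series : allSeries) {
                total += valueAt(series, date);
            }
            totals.put(date, total);
        }
        return totals;
    }
}
```

With the sample data from the question (x: 2017-01-01 = 1, 2017-01-05 = 4; y: 2017-01-03 = 3, 2017-01-04 = 2), SeriesSum.sum yields 1, 4, 3 and 6 for 2017-01-01, 2017-01-03, 2017-01-04 and 2017-01-05 respectively.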

One way is to make Event comparable by date and use TreeSet's floor method:

    class Event implements Comparable<Event> {
        // ...
        @Override
        public int compareTo(Event o) {
            return date.compareTo(o.date);
        }
    }

Then, in the Timeseries class, use a TreeSet<Event> x instead of a list, and seed it with a zero entry so that floor returns that entry when there is no earlier value:

    class Timeseries {
        public static final Event ZERO = new Event(LocalDate.of(1, 1, 1), 0d);
        TreeSet<Event> x = new TreeSet<>(Arrays.asList(ZERO));
        // ...
    }

Now collect all known events and calculate the sums:

    TreeSet<Event> events = allSeries.stream()
        .flatMap(s -> s.getEvents().stream())
        .collect(Collectors.toCollection(TreeSet::new));

    Map<LocalDate, Double> sumsByDate = events.stream()
        .map(event -> new AbstractMap.SimpleEntry<>(
            event.getDate(),
            allSeries.stream()
                .mapToDouble(a -> a.getEvents().floor(event).getValue())
                .sum()))
        .filter(p -> !p.getKey().equals(Timeseries.ZERO.getDate()))
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));

So I managed to do this partly with streams. It doesn't seem particularly efficient though, since the getRelevantValueFor method rescans a whole series for every date. I would prefer a more efficient solution.

    public Timeseries combine(List<Timeseries> allSeries) {
        // Get a unique set of all the dates across all time series
        Set<LocalDate> allDates = allSeries.stream()
            .flatMap(t -> t.getEvents().stream())
            .map(Event::getDate)
            .collect(Collectors.toSet());

        Timeseries output = new Timeseries();

        // For each date sum up the latest event in each timeseries
        allDates.forEach(date -> {
            double total = 0;
            for (Timeseries series : allSeries) {
                total += getRelevantValueFor(series, date).orElse(0.0);
            }
            output.add(new Event(date, total));
        });
        return output;
    }

    // Latest value on or before the given date, if any
    private Optional<Double> getRelevantValueFor(Timeseries series, LocalDate date) {
        return series.getEvents().stream()
            .filter(event -> !event.getDate().isAfter(date))
            .max(ascendingOrder())
            .map(Event::getValue);
    }

    private Comparator<Event> ascendingOrder() {
        // LocalDate has no toEpochMilli(); compare the dates directly
        return Comparator.comparing(Event::getDate);
    }
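One possible way to avoid the repeated scans is a single sweep over all events in date order, keeping the latest value seen per series and a running total. This is a different technique from the code above, sketched here under some assumptions: the nested Event class is a minimal stand-in for the one in the question, the class name SweepSum and the helper Tagged are hypothetical, and each series has at most one event per date.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a sweep over all events in date order.
class SweepSum {

    // Minimal stand-in for the Event class from the question.
    static final class Event {
        final LocalDate date;
        final double value;
        Event(LocalDate date, double value) { this.date = date; this.value = value; }
    }

    // An event tagged with the index of the series it came from.
    private static final class Tagged {
        final int series;
        final Event event;
        Tagged(int series, Event event) { this.series = series; this.event = event; }
    }

    static Map<LocalDate, Double> combine(List<List<Event>> allSeries) {
        // Flatten all events, remembering which series each belongs to
        List<Tagged> all = new ArrayList<>();
        for (int i = 0; i < allSeries.size(); i++) {
            for (Event e : allSeries.get(i)) {
                all.add(new Tagged(i, e));
            }
        }
        all.sort(Comparator.comparing((Tagged t) -> t.event.date));

        // latest[i] is the last value seen for series i (0 until its first event)
        double[] latest = new double[allSeries.size()];
        double total = 0.0;
        Map<LocalDate, Double> sums = new TreeMap<>();
        for (Tagged t : all) {
            // Replace the series' previous contribution with its new value
            total += t.event.value - latest[t.series];
            latest[t.series] = t.event.value;
            // Events sharing a date overwrite each other, so the final
            // entry for a date reflects all series updated on that date
            sums.put(t.event.date, total);
        }
        return sums;
    }
}
```

The cost is dominated by one sort of all events, instead of one scan of every series per date. Because the total is updated incrementally, floating-point rounding could accumulate over very long series; recomputing the total from latest at each date would avoid that at a higher cost.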

Source: https://habr.com/ru/post/1275737/

