Rub array sorting in batches

Sorry, if this was asked before, I'm not sure how to even look for it, and what I was looking for does not give any useful answer.

Here is my problem. I have a structure that basically controls the jobs that will be sent to the PBS cluster, and each job will need to be read from the input file. We are in the case when we have more than 5 thousand tasks that need to be launched, and there are parties, say ~ 30, which are read from different files, and the rest of them are read from a file that is read by another task.

This can be easily dealt with (although it is not the best decision to buy, maybe the fastest for the time that we have), having the ability to sort the list of tasks by identifier, which basically means which file it will read, i.e. I would like to sort an array like this

a = [1,1,1,2,2,2,3,3,3,4,4,4] 

in

 a = [1,2,3,4,1,2,3,4,1,2,3,4] 

Is there any way to achieve this ordering in a ruby? I might think about buying an algorithm, maybe it has already been done, and someone knows the answer.

Thanks!

+6
source share
3 answers

Decision

Thanks to @ sagarpandya82 for the original idea and @Cary Swoveland for finding bugs!

Or use 2 methods:

 def safe_transpose_and_flatten(array) l = array.map(&:length).max array.map{|e| e.values_at(0...l)}.transpose.flatten.compact end def sort_by_batches(array) safe_transpose_and_flatten(array.sort.group_by{|i| i}.values) end 

Or this one line (split into several lines for relative readability):

 def sort_by_batches(array) array.group_by{|i| i }.values # Chunks of equal values, .sort_by{|v| -v.size } # sorted by decreasing length, .reduce(&:zip) # transposed, .map{|r| r.flatten.compact.sort }.flatten # flattened and sorted end 

Example

 a = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4] sort_by_batches(a) # => [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4] a = [1, 1, 3, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 1, 1] sort_by_batches(a) # => [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 1, 3] a = [1,2,2,3,3,3] sort_by_batches(a) # => [1, 2, 3, 2, 3, 3] 

Actions

Below are the steps for the second array:

 [1, 1, 3, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 1, 1] # input {1=>[1, 1, 1, 1], 3=>[3, 3, 3, 3], 2=>[2, 2, 2], 4=>[4, 4, 4], 5=>[5]} # group_by [[1, 1, 1, 1], [3, 3, 3, 3], [2, 2, 2], [4, 4, 4], [5]] # values [[1, 1, 1, 1], [3, 3, 3, 3], [2, 2, 2], [4, 4, 4], [5]] # sort_by -length [[[[[1, 3], 2], 4], 5], [[[[1, 3], 2], 4], nil], [[[[1, 3], 2], 4], nil], [[[[1, 3], nil], nil], nil]] # zip [[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4], [1, 3]] # map(&:flatten) and compact [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 1, 3] # flatten 

.reduce(&:zip).map(&:flatten).compact initially used as a supposedly safe transposition, but it did not work when the first array was smaller than the rest.

The first method uses this answer for transposition, single-line sorts arrays, reducing the length before using zip .

Job Class Application

Here is an example of the base class Job:

 class Job attr_reader :id def initialize(id) @id = id end def self.sort_by_batches(jobs) safe_transpose_and_flatten(jobs.sort_by{|j| j.id}.group_by{|j| j.id}.values) end def to_s "Job %d" % id end end jobs = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4].map{|i| Job.new(i)} Job.sort_by_batches(jobs) 

It outputs:

 Job 1 Job 2 Job 3 Job 4 Job 1 Job 2 Job 3 Job 4 Job 1 Job 2 Job 3 Job 4 
+7
source

You can do this with the sort function:

 def collate(input) # Split the input array up into chunks of identical values # and sort the resulting groups. sets = input.group_by { |v| v }.values.sort_by(&:first) # Recombine these into a single output array by iterating over # each set and transposing values. Any nil values are scrubbed # with compact. (0...sets.map(&:length).max).flat_map do |i| sets.map do |s| s[i] end end.compact end 

You can see this work on some less simple data:

 input = [1,1,3,2,2,2,3,3,3,4,4,4,5,1,1] collate(input) # => [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4, 1, 3] 

Here 5 displayed only once.

+4
source

the code

 def doit(a) b = a.sort.slice_when { |x,y| x != y } b.max_by(&:size).size.times.flat_map { |i| b.each_with_object([]) { |c,arr| arr << c[i] unless c[i].nil? } } end 

Example

 doit [5, 1, 7, 2, 3, 3, 5, 2, 3, 1, 4] #=> [1, 2, 3, 4, 5, 7, 1, 2, 3, 5, 3] 

Explanation

In this example, the following steps.

 a = [5, 1, 7, 2, 3, 3, 5, 2, 3, 1, 4] c = a.sort #=> [1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 7] b = c.slice_when { |x,y| x != y } #=> #<Enumerator: #<Enumerator::Generator:0x007fb8a99d94c8>:each> 

We can see the elements that are generated by the counter b (and passed to the block) by converting it into an array:

 b.to_a #=> [[1, 1], [2, 2], [3, 3, 3], [4], [5, 5], [7]] 

Continuation

 c = b.max_by(&:size) #=> [3, 3, 3] d = c.size #=> 3 e = d.times #=> #<Enumerator: 3:times> e.to_a #=> [0, 1, 2] e.flat_map { |i| b.each_with_object([]) { |c,arr| arr << c[i] unless c[i].nil? } } #=> [1, 2, 3, 4, 5, 7, 1, 2, 3, 5, 3] 

Here is the last operation with some included puts statements.

 3.times.flat_map do |i| puts "i=#{i}" b.each_with_object([]) do |c,arr| puts " c=#{c}, c[#{i}]=#{c[i]}, arr=#{arr}" arr << c[i] unless c[i].nil? puts " arr after arr << c[#{i}]=#{arr}" unless c[i].nil? end end 

 # i=0 # c=[1, 1], c[0]=1, arr=[] # arr after arr << c[0]=[1] # c=[2, 2], c[0]=2, arr=[1] # arr after arr << c[0]=[1, 2] # c=[3, 3, 3], c[0]=3, arr=[1, 2] # arr after arr << c[0]=[1, 2, 3] # c=[4], c[0]=4, arr=[1, 2, 3] # arr after arr << c[0]=[1, 2, 3, 4] # c=[5, 5], c[0]=5, arr=[1, 2, 3, 4] # arr after arr << c[0]=[1, 2, 3, 4, 5] # c=[7], c[0]=7, arr=[1, 2, 3, 4, 5] # arr after arr << c[0]=[1, 2, 3, 4, 5, 7] # i=1 # c=[1, 1], c[1]=1, arr=[] # arr after arr << c[1]=[1] # c=[2, 2], c[1]=2, arr=[1] # arr after arr << c[1]=[1, 2] # c=[3, 3, 3], c[1]=3, arr=[1, 2] # arr after arr << c[1]=[1, 2, 3] # c=[4], c[1]=, arr=[1, 2, 3] # c=[5, 5], c[1]=5, arr=[1, 2, 3] # arr after arr << c[1]=[1, 2, 3, 5] # c=[7], c[1]=, arr=[1, 2, 3, 5] # i=2 # c=[1, 1], c[2]=, arr=[] # c=[2, 2], c[2]=, arr=[] # c=[3, 3, 3], c[2]=3, arr=[] # arr after arr << c[2]=[3] # c=[4], c[2]=, arr=[3] # c=[5, 5], c[2]=, arr=[3] # c=[7], c[2]=, arr=[3] #=> [1, 2, 3, 4, 5, 7, 1, 2, 3, 5, 3] 
+4
source

Source: https://habr.com/ru/post/1013310/


All Articles