How to find a sequence of elements from multiple arrays

I have a huge list array, each of which can have some repeating sequence, I need to find a sequence (more than one element repeating in different arrays)

In the example below, array 1 and array 2 have a sequence of elements 010,002,007 , as well as array 2 and array 3, having 011,345,547,800 as common.

array1: 090,010,002,007,310,104,048,610,720 array2: 456,010,002,007,087,011,345,547,800 array3: 004,089,870,011,345,547,800,001,002 

What is the best way to do this, I know that we can do this in simple C #, but it takes a lot of loops and programming. Is there any algorithm for identifying more than 1 elements sequentially encountered in any arrays>

+5
source share
3 answers

First of all, you should try to match your arrays as graphs. Define the following class

  private class Transition { public int ArrayNum { get; set; } public int? NextNum { get; set; } } 

Given our arrays

 var arrays = new[] { new[] {090, 010, 002, 007, 310, 104, 048, 610, 720}, new[] {456, 010, 002, 007, 087, 011, 345, 547, 800}, new[] {004, 089, 870, 011, 345, 547, 800, 001, 002} }; 

We will translate the transitions as follows:

 var transitions = new Dictionary<int, List<Transition>>(); for (var i = 0; i < arrays.Length; i++) { for (var j = 0; j < arrays[i].Length; j++) { var num = arrays[i][j]; var transition = new Transition { ArrayNum = i, NextNum = j < arrays[i].Length - 1 ? arrays[i][j + 1] : (int?)null }; if (!transitions.ContainsKey(num)) { transitions.Add(num, new List<Transition> {transition}); } else { transitions[num].Add(transition); } } } transitions = transitions.OrderBy(x => x.Key).ToDictionary(x => x.Key, x => x.Value); 

After matching, we will go through each array only once, find the corresponding sequences:

 private class Sequence { public int FirstArray { get; set; } public int SecondArray { get; set; } public IList<int> Nums { get; set; } } var sequences = new List<Sequence>(); for (var i = 0; i < arrays.Length-1; i++) { List<Sequence> relevantSequences = null; for (var j = 0; j < arrays[i].Length - 1; j++) { var num = arrays[i][j]; var nextNum = arrays[i][j+1]; var flows = transitions[num].Where(x => x.NextNum == nextNum && x.ArrayNum > i).ToList(); if (!flows.Any()) { if (relevantSequences != null) { sequences.AddRange(relevantSequences); relevantSequences = null; } } else { if (relevantSequences == null) { relevantSequences = flows.Select(x => new Sequence { FirstArray = i, SecondArray = x.ArrayNum, Nums = new List<int> {num, nextNum} }).ToList(); } else { foreach (var flow in flows) { var sequence = relevantSequences.SingleOrDefault(x => x.SecondArray == flow.ArrayNum); if (sequence != null) { sequence.Nums.Add(nextNum); } else { relevantSequences.Add(new Sequence { FirstArray = i, SecondArray = flow.ArrayNum, Nums = new List<int> { num, nextNum } }); } } } } } if (relevantSequences != null) sequences.AddRange(relevantSequences); } 

Hope you find this helpful.

0
source

If I understand the question correctly, you need to know if the 2+ sequence will be duplicated (maybe in a different order) in all arrays, for example:
array1: 000 , 110
array2: 111, 011 , 000
array3: 101,000
General sequence:
1) 000
2) 110 \ 011 \ 101

For a simple / short solution, I used an array of strings instead of int:

 static void Main(string[] args) { var arr1 = new[] { "000", "110" }; var arr2 = new[] { "111", "011", "000" }; var arr3 = new[] { "101", "000" }; var result = IsAllHaveMoreThen1SequnceDuplicate(new List<string[]> {arr1, arr2, arr3}); } private static bool IsAllHaveMoreThen1SequnceDuplicate(List<string[]> arrays) { if (arrays == null || arrays.Count == 0) { return false; } var firstArray = arrays[0]; for (var i = 1; i < arrays.Count; i++) { var isHaveMoreThen1SequnceDuplicate = IsHaveMoreThen1SequnceDuplicate(firstArray, arrays[i]); if (!isHaveMoreThen1SequnceDuplicate) { return false; } } return true; } private static bool IsHaveMoreThen1SequnceDuplicate(string[] arr1, string[] arr2) { var globalCounter = 0; foreach (var s1 in arr1) { foreach (var s2 in arr2) { var isSameSequence = IsSameSequence(s1, s2); if (isSameSequence) { globalCounter++; break; } } if (globalCounter == 2) { return true; } } return false; } private static bool IsSameSequence(string s1, string s2) { if (s1 == s2) { return true; } if (s1.Length != s2.Length) { return false; } var flags = new int[9]; for (int i = 0; i < s1.Length; i++) { var cInt = int.Parse(s1[i].ToString()); flags[cInt]++; } for (int i = 0; i < s2.Length; i++) { var cInt = int.Parse(s2[i].ToString()); flags[cInt]--; } return flags.All(f => f == 0); } 
0
source

Here is another solution. I made the assumption that the sequence does not have two numbers. The basic idea is that I first create all possible sequences and calculate their sums. Then I do a quick filtering based on the sums, and if they are the same, I do a more detailed check.

This code can be improved using more appropriate data structures. Some lists may be replaced by dictionaries, etc.

 static int[][] arrays = new[] { new[] {090, 010, 002, 007, 310, 104, 048, 610, 720}, new[] {456, 010, 002, 007, 087, 011, 345, 547, 800}, new[] {004, 089, 870, 011, 345, 547, 800, 001, 002} }; static void Main() { SequentedArray[] sequentedArrays = arrays.Select(array => new SequentedArray(array)).ToArray(); foreach (var baseSequentedArray in sequentedArrays) { foreach (var comparisonSequentedArray in sequentedArrays) { if(baseSequentedArray == comparisonSequentedArray) continue; var sameSequences = baseSequentedArray.FindSameSequences(comparisonSequentedArray); foreach (var sameSequence in sameSequences) { Console.WriteLine(String.Join(", ", sameSequence.GetValues(baseSequentedArray.Array))); } } } } class SequentedArray { public int[] Array { get; private set; } public List<Sequence> Sequences { get; private set; } public SequentedArray(int[] array) { Array = array; Sequences = new List<Sequence>(); for (int endIndex = 1; endIndex < array.Length; endIndex++) { int sum = array[endIndex]; for (int startIndex = endIndex - 1; startIndex >= 0; startIndex--) { sum += array[startIndex]; Sequences.Add(new Sequence(startIndex, endIndex, sum)); } } } public IEnumerable<Sequence> FindSameSequences(SequentedArray comparison) { var sameSequences = new List<Sequence>(); foreach (var sequence in Sequences) { foreach (var comparisonSequence in comparison.Sequences) { if (sequence.Sum == comparisonSequence.Sum && sequence.GetValues(Array).SequenceEqual(comparisonSequence.GetValues(comparison.Array))) { var smallerSequences = sameSequences.Where(existing => sequence.Covers(existing)).ToList(); foreach (var smallerSequence in smallerSequences) { sameSequences.Remove(smallerSequence); } sameSequences.Add(sequence); } } } return sameSequences; } } class Sequence { public int StartIndex { get; private set; } public int EndIndex { get; private set; } public int Sum { get; private set; } public Sequence(int startIndex, int endIndex, int sum) { StartIndex = startIndex; EndIndex = endIndex; Sum = sum; } public IEnumerable<int> GetValues(int[] array) { for (int i = StartIndex; i <= EndIndex; i++) { yield return array[i]; } } public bool Covers(Sequence comparison) { return StartIndex <= comparison.StartIndex && EndIndex >= comparison.EndIndex; } } 
0
source

Source: https://habr.com/ru/post/1276207/


All Articles