LENA partition sequence with LINQ

I have the following extension method to find an element inside a sequence, and then return two IEnumerable<T> s: one containing all the elements before this element, and one containing the element, and all that follows. I would prefer the method to be lazy, but I did not understand how to do this. Can anyone come up with a solution?

 public static PartitionTuple<T> Partition<T>(this IEnumerable<T> sequence, Func<T, bool> partition) { var a = sequence.ToArray(); return new PartitionTuple<T> { Before = a.TakeWhile(v => !partition(v)), After = a.SkipWhile(v => !partition(v)) }; } 

Executing sequence.ToArray() immediately defeats the laziness requirement. However, without this line, an expensive sequence iterator can be repeated twice. And, depending on what the calling code does, many more times.

+6
source share
4 answers

This is an interesting problem, and to understand it, you need to know what β€œlaw” is. For the semantics of the operation, I believe that this definition makes sense:

  • The source sequence is only listed once, even if the resulting sequences are listed multiple times.
  • The original sequence is not listed until one of the results is listed.
  • Each of the results should be available for listing independently.
  • If the original sequence changes, it is undefined what will happen.

I'm not quite sure that I processed the corresponding object correctly, but I hope that you understand this idea. I set aside a lot of work on the PartitionTuple<T> class to be lazy.

 public class PartitionTuple<T> { IEnumerable<T> source; IList<T> before, after; Func<T, bool> partition; public PartitionTuple(IEnumerable<T> source, Func<T, bool> partition) { this.source = source; this.partition = partition; } private void EnsureMaterialized() { if(before == null) { before = new List<T>(); after = new List<T>(); using(var enumerator = source.GetEnumerator()) { while(enumerator.MoveNext() && !partition(enumerator.Current)) { before.Add(enumerator.Current); } while(!partition(enumerator.Current) && enumerator.MoveNext()); while(enumerator.MoveNext()) { after.Add(enumerator.Current); } } } } public IEnumerable<T> Before { get { EnsureMaterialized(); return before; } } public IEnumerable<T> After { get { EnsureMaterialized(); return after; } } } public static class Extensions { public static PartitionTuple<T> Partition<T>(this IEnumerable<T> sequence, Func<T, bool> partition) { return new PartitionTuple<T>(sequence, partition); } } 
+1
source

You can use the Lazy object to ensure that the original sequence is not converted to an array until one of the two sections is repeated:

 public static PartitionTuple<T> Partition<T>( this IEnumerable<T> sequence, Func<T, bool> partition) { var lazy = new Lazy<IEnumerable<T>>(() => sequence.ToArray()); return new PartitionTuple<T> { Before = lazy.MapLazySequence(s => s.TakeWhile(v => !partition(v))), After = lazy.MapLazySequence(s => s.SkipWhile(v => !partition(v))) }; } 

We will use this method to delay laziness evaluation until the sequence itself repeats:

 public static IEnumerable<TResult> MapLazySequence<TSource, TResult>( this Lazy<IEnumerable<TSource>> lazy, Func<IEnumerable<TSource>, IEnumerable<TResult>> filter) { foreach (var item in filter(lazy.Value)) yield return item; } 
+4
source

Here is a general solution that will memoize any IEnumerable<T> to ensure that it will only be repeated once, without causing it all to repeat:

 public class MemoizedEnumerable<T> : IEnumerable<T>, IDisposable { private readonly IEnumerator<T> _childEnumerator; private readonly List<T> _itemCache = new List<T>(); public MemoizedEnumerable(IEnumerable<T> enumerableToMemoize) { _childEnumerator = enumerableToMemoize.GetEnumerator(); } public IEnumerator<T> GetEnumerator() { return _itemCache.Concat(EnumerateOnce()).GetEnumerator(); } public void Dispose() { _childEnumerator.Dispose(); } private IEnumerable<T> EnumerateOnce() { while (_childEnumerator.MoveNext()) { _itemCache.Add(_childEnumerator.Current); yield return _childEnumerator.Current; } } IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); } } public static class EnumerableExtensions { public static IEnumerable<T> Memoize<T>(this IEnumerable<T> enumerable) { return new MemoizedEnumerable<T>(enumerable); } } 

To use it for your split problem, do the following:

 var memoized = sequence.Memoize(); return new PartitionTuple<T> { Before = memoized.TakeWhile(v => !partition(v)), After = memoized.SkipWhile(v => !partition(v)) }; 

This will only repeat the sequence no more than once.

+1
source

Typically, you simply return some object of your custom class that implements IEnumerable<T> , but also provides results only upon an enumeration request.

You can also implement IQueryable<T> (inherits IEnumerable) instead of IEnumerable<T> , but it is rather necessary to build access functionality with queries, such as the one that linq for sql provides: the database query is executed only on the final enumeration request.

0
source

Source: https://habr.com/ru/post/958085/


All Articles