LENA partition sequence with LINQ

Question

LENA partition sequence with LINQ

I have the following extension method to find an element inside a sequence, and then return two IEnumerable<T> s: one containing all the elements before this element, and one containing the element, and all that follows. I would prefer the method to be lazy, but I did not understand how to do this. Can anyone come up with a solution?

 public static PartitionTuple<T> Partition<T>(this IEnumerable<T> sequence, Func<T, bool> partition) { var a = sequence.ToArray(); return new PartitionTuple<T> { Before = a.TakeWhile(v => !partition(v)), After = a.SkipWhile(v => !partition(v)) }; }

Executing sequence.ToArray() immediately defeats the laziness requirement. However, without this line, an expensive sequence iterator can be repeated twice. And, depending on what the calling code does, many more times.

+6

c # linq

moswald Nov 14 '13 at 15:48

source share

4 answers

You can use the Lazy object to ensure that the original sequence is not converted to an array until one of the two sections is repeated:

 public static PartitionTuple<T> Partition<T>( this IEnumerable<T> sequence, Func<T, bool> partition) { var lazy = new Lazy<IEnumerable<T>>(() => sequence.ToArray()); return new PartitionTuple<T> { Before = lazy.MapLazySequence(s => s.TakeWhile(v => !partition(v))), After = lazy.MapLazySequence(s => s.SkipWhile(v => !partition(v))) }; }

We will use this method to delay laziness evaluation until the sequence itself repeats:

 public static IEnumerable<TResult> MapLazySequence<TSource, TResult>( this Lazy<IEnumerable<TSource>> lazy, Func<IEnumerable<TSource>, IEnumerable<TResult>> filter) { foreach (var item in filter(lazy.Value)) yield return item; }

+4

Servy Nov 14 '13 at 16:30

source share

Here is a general solution that will memoize any IEnumerable<T> to ensure that it will only be repeated once, without causing it all to repeat:

 public class MemoizedEnumerable<T> : IEnumerable<T>, IDisposable { private readonly IEnumerator<T> _childEnumerator; private readonly List<T> _itemCache = new List<T>(); public MemoizedEnumerable(IEnumerable<T> enumerableToMemoize) { _childEnumerator = enumerableToMemoize.GetEnumerator(); } public IEnumerator<T> GetEnumerator() { return _itemCache.Concat(EnumerateOnce()).GetEnumerator(); } public void Dispose() { _childEnumerator.Dispose(); } private IEnumerable<T> EnumerateOnce() { while (_childEnumerator.MoveNext()) { _itemCache.Add(_childEnumerator.Current); yield return _childEnumerator.Current; } } IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); } } public static class EnumerableExtensions { public static IEnumerable<T> Memoize<T>(this IEnumerable<T> enumerable) { return new MemoizedEnumerable<T>(enumerable); } }

To use it for your split problem, do the following:

 var memoized = sequence.Memoize(); return new PartitionTuple<T> { Before = memoized.TakeWhile(v => !partition(v)), After = memoized.SkipWhile(v => !partition(v)) };

This will only repeat the sequence no more than once.

+1

Jacob Nov 14 '13 at 16:36

source share

Typically, you simply return some object of your custom class that implements IEnumerable<T> , but also provides results only upon an enumeration request.

You can also implement IQueryable<T> (inherits IEnumerable) instead of IEnumerable<T> , but it is rather necessary to build access functionality with queries, such as the one that linq for sql provides: the database query is executed only on the final enumeration request.

0

Agat Nov 14 '13 at 16:25

source share

Anders abel · Accepted Answer · 2013-11-14T16:30:36+0000

This is an interesting problem, and to understand it, you need to know what “law” is. For the semantics of the operation, I believe that this definition makes sense:

The source sequence is only listed once, even if the resulting sequences are listed multiple times.
The original sequence is not listed until one of the results is listed.
Each of the results should be available for listing independently.
If the original sequence changes, it is undefined what will happen.

I'm not quite sure that I processed the corresponding object correctly, but I hope that you understand this idea. I set aside a lot of work on the PartitionTuple<T> class to be lazy.

 public class PartitionTuple<T> { IEnumerable<T> source; IList<T> before, after; Func<T, bool> partition; public PartitionTuple(IEnumerable<T> source, Func<T, bool> partition) { this.source = source; this.partition = partition; } private void EnsureMaterialized() { if(before == null) { before = new List<T>(); after = new List<T>(); using(var enumerator = source.GetEnumerator()) { while(enumerator.MoveNext() && !partition(enumerator.Current)) { before.Add(enumerator.Current); } while(!partition(enumerator.Current) && enumerator.MoveNext()); while(enumerator.MoveNext()) { after.Add(enumerator.Current); } } } } public IEnumerable<T> Before { get { EnsureMaterialized(); return before; } } public IEnumerable<T> After { get { EnsureMaterialized(); return after; } } } public static class Extensions { public static PartitionTuple<T> Partition<T>(this IEnumerable<T> sequence, Func<T, bool> partition) { return new PartitionTuple<T>(sequence, partition); } }

LENA partition sequence with LINQ

More articles: