Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I've to do a generic method Classify<T> that, given a sequence of elements of type T and an arbitrary number of predicates, returns an array of n+1 lists (where n is the number of the predicates). More precisely, if the element satisfies the first predicate then it is added to the first list. If not, then if the element satisfies the second predicate it is added to the second list, etc. If none of the predicates can be applied to one element, then that element is added to the last list (the n+1 th).

I managed to do that with the following code. Is there any way to do it better, maybe with Linq?

public List<T>[] Classify<T>(IEnumerable<T> sequence, params Predicate<T>[] predicates)
{
    var result = new List<T>[predicates.Length+1];
    var counter = 0;    
    foreach (var elem in sequence)      
    {
        var valid = false;
        foreach (var predicate in predicates) 
        {
            if(result[counter]==null)
                result[counter] = new List<T>();
            if (predicate(elem) && valid==false)
            {
                var partition = result[counter];
                partition.Add(elem);
                valid = true;
            }
            counter++;
        }
        if (!valid)     
        {
            if (result[counter] == null)
                result[counter] = new List<T>();
            result[counter].Add(elem);
        }
        counter=0;
    }
    return result;
}
share|improve this question

3 Answers 3

up vote 4 down vote accepted
public List<T>[] Classify<T>(IEnumerable<T> sequence, params Predicate<T>[] predicates)
{
    var result = new List<T>[predicates.Length + 1];
    // It's clearer to create elements of the array explicitly
    for (int i = 0; i < result.Length; i++)
        result[i] = new List<T>();

    foreach (var elem in sequence)
    {
        var index = predicates.TakeWhile(p => !p(elem))
                              .Count();

        result[index].Add(elem);
    }
}

There is a few LINQ methods to get an index, f.e. Select((i, e) => ...). If the list of unsatisfied elements was the first (0th), I would use Select. But since it is the last, TakeWhile gives the shorter code.

share|improve this answer
    
Thanks. I have to practice a lot with linq, because it seems to simplify the code so much.. –  Sanci Jan 18 at 15:18
    
Also consider this for initializing the lists. var result = Enumerable.Range(0, predicates.Length + 1).Select(i => new List<T>()).ToArray(); –  Rotem Jan 18 at 15:25
    
@Rotem Thanks. I didn't know this method. –  Mark Shevchenko Jan 18 at 15:29

The idea is to group the elements in the sequence by the predicate they satisfy (line 10) and return an array (line 12) of lists containing the elements in each group (line 11).

This is one way to find the predicate index for the first predicate that is satisfied by an element (line 7): predicates.TakeWhile(p => !p(elem)).Count() ("stolen" from @Mark Shevchenko's answer). This has the advantage that it will return n+1 for elements which do not satisfy any predicate and the ordering is simpler.

After selecting the element and it's coresponding predicate index (lines 3-8), the predicates are ordered by the indexes (line 9).

[1]    public List<T>[] Classify<T>(IEnumerable<T> sequence, params Predicate<T>[] predicates)
[2]    {
[3]        return sequence
[4]            .Select(elem => new
[5]            {
[6]                Element = elem,
[7]                PredicateIndex = predicates.TakeWhile(p => !p(elem)).Count()
[8]            })
[9]            .OrderBy(x => x.PredicateIndex)
[10]           .GroupBy(x => x.PredicateIndex)
[11]           .Select(grp => grp.Select(x => x.Element).ToList())
[12]           .ToArray();
[13]   }
share|improve this answer
    
I think this won't work if there is no item that matches some predicate: there won't be a list for that predicate at all. –  svick Jan 20 at 14:36

There are many ways to strike the balance between LINQ and straight procedural code; the answer by Mark IMO strikes a very good balance between being succinct and easily understandable.

That said you can do this without any procedural code (just LINQ methods) and have the body of the classification method be a single expression. Extra care must be taken so that if there are bins which end up being empty of items the result still includes empty lists in the appropriate indexes (david's answer does not do this and as a result has a bug).

Here's a correct solution with pure LINQ, which I have to admit is more useful as an intellectual exercise rather than as a practical approach:

public List<T>[] Classify<T>(IEnumerable<T> sequence, params Predicate<T>[] predicates)
{
    return Enumerable
               .Range(0, predicates.Length + 1)
               .GroupJoin(
                   sequence.GroupBy(item => predicates.TakeWhile(p => !p(item)).Count()),
                   i => i,
                   g => g.Key,
                   (i, g) => g.SelectMany(g2 => g2).ToList()
               )
               .ToArray();
}
share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.