Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I wrote the following code:

var xmlArray = from m in (from row in data select row.Mitarbeiter).Distinct()
               select "<Value Type='Text'>" + m + "</Value>";
var xml = string.Join("",xmlArray);

Then I noticed there are two iteration (two from) and also the Distinct() and I rewrote it using a foreach:

var mitarbeiter = new List<string>();
var xmlArray = new List<string>();
foreach(var row in data)
{
    if (!mitarbeiter.Contains(row.Mitarbeiter)){
        mitarbeiter.Add(row.Mitarbeiter);
        xmlArray.Add("<Value Type='Text'>" + row.Mitarbeiter + "</Value>");
    }
}
var xml = string.Join("",xmlArray);

Ma opinion is that the first example is cleaner...but what about efficiency/performance?

share|improve this question
1  
You will never notice the performance difference of that 2 solutions. Therefore use the one that is more readable. – JanDotNet May 31 at 19:04
    
LINQ is more readable however falls behind foreach when talking about performance because it uses foreach behind the scenes anyway. If performance bothers you, you should stick with the second variant, else go with LINQ. But sometimes it can loss even in readability.. – denis May 31 at 20:13
    
@200_success: I don't think the edited title is appropriate to the discussion. The topic is not regarding the Xml elements, but how to iterate the collections. – Emaborsa Jun 1 at 7:05
    
Change it again if you'd like, but keep in mind that the original title isn't quite in line with the site guidelines — see How to Ask and the rationale behind the rule. – 200_success Jun 1 at 7:10
up vote 10 down vote accepted

The best way to discuss performance is by measuring it. I suggest you use BenchmarkDotNet and never roll your own solution.

Onto your question: it is safe to assume that LINQ will in most cases be slower than the iterative approach. Not only does it allocate more (for example through lambdas) but it's also more expensive in runtime because of interface dispatching and other aspects.

There is obviously a balance to be struck between readability and performance -- if you have no performance issues then LINQ might very well be the best option although I would argue that in this case the iterative approach is very readable as well (if not more).

Something you can consider: use a HashSet<T> instead of a List<T> if your usecase allows it. Performing Contains() is O(n) for a list but O(1) for a Set.

share|improve this answer
    
Thanks for your hints. What about using a Dictionary instead two lists? – Emaborsa Jun 1 at 6:13
    
Why store the xml version in the first place if it's a constant based on mitarbeiter? – Jeroen Vannevel Jun 1 at 6:26
    
Because in a second moment I need them for a query and I can do string.Join("", mitarbeiter.Values) – Emaborsa Jun 1 at 6:31
    
That might be less lines of code but you're also creating more string objects your way (and thus a higher memory impact). If you omit that, you can just combine them with a stringbuilder and avoid creating the intermediate concatenation. – Jeroen Vannevel Jun 1 at 18:21

My opinion is that the first example is cleaner

I agree, but I find it hard to read a mixture of fluent syntax and query comprehension syntax.

I would be inclined to write:

var xmlArray = data
  .Select(row => row.Mitarbeiter)
  .Distinct()
  .Select(m => "<Value Type='Text'>" + m + "</Value>");
var xml = string.Join("",xmlArray);

Fun bonus: this trick does not work in your case because of the "Distinct". But in cases where you do not have that, you can rewrite any query of the form

from x in (from y in ...y-query...) 
...x-query...

as

from y in 
...y-query... 
into x 
...x-query...

This can be easier to read.

share|improve this answer
    
Indeed, forgot to mention that. As it think. it's better to use fluent syntax, it's much more readable and explicit. Only case where I use LINQ syntax, is joins, otherwise Chained methods are better. Sometimes it could be simple lst.Where(x=>x>10) instead of creepy from x in lst where x>10 select x – Bogdan Mart May 31 at 21:38
2  
@BogdanMart: I personally prefer the comprehension syntax, but I dearly wish a mechanism had been invented to move fluent syntax into comprehension syntax. Imagine if we could have from row in data select row.Whatever with Distinct() into m select ... It would be so nice. I regret that it didn't make it into the first version of LINQ. – Eric Lippert May 31 at 21:48

Linq is always slower than loops, but it's less error prone to copy-paste linq code with slight modifications. And it's in most times easier to write LINQ code (if that's not complex iterational algorithm). You need to ballance development time and maintainability vs performance, depending on your needs.

Main pros of LINQ is ease of write, and ease to reuse such code blocks, without possibility to forget renaming variables, for example xmlArray is referenced two times in second code block, so if you copy-paste this code, you should change both, that can produce bugs, if variable scopes intersect.

I've noticed one problem in your code -- "<Value Type='Text'>" + m + "</Value>" you should xml-escape that string variable(m).

Resume If you already wrote iterative solution, keep it, as it will perform faster, and, perhaps use StringBuilder instead of List<string> xmlArray.

P.S. @jeroen-vannevel thank's for pointing to BenchmarkDotNet, I was using LambdaMicrobenchmarking for such goals, but BenchmarkDotNet seems more decent, and feature full. I was searching for something similar to JMH in .Net land, and you found one! Will use it in my upcoming benchmarks ;)

But perhaps, for such an easy scenario LambdaMicrobenchmarking will suit better, less boilerplate code is required.

share|improve this answer
    
What do you mean with xml-escape that string? – Emaborsa Jun 1 at 6:11
    
@Emaborsa strign can contain characters like > < and other invalid for XML so you can use something like this msdn.microsoft.com/en-us/library/… or equivalent – Bogdan Mart Jun 1 at 15:40
    
Alternatively, he could use System.Xml.Linq.XElement : new XElement("Value", new XAttribute("Type", "Text"), m);. This would auto-encode m. XElement overrides .ToString(), so the string.Join call wouldn't have to change. – Dan Lyons Jun 1 at 17:53
    
Indeed, but that will create performance overhead, i guess. – Bogdan Mart Jun 1 at 21:00

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.