Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I am using ExcelDataReader to import an Excel file to a dataset.

Example Excel table:

//ID     Name     Display Order    Active
//1      John          1             1

ID, DisplayOrder and Active columns are read as double, so I have to convert them to long, int and bool types respectively. I need to create a list of type Category from the DataTable of the DataSet.

Will this code perform well? Any suggestions for a faster conversion of DataTable to List of class?

var list = result.Tables["Categories"].AsEnumerable()
.Skip(1)
.Select(dr =>
        new Category
            {
                Id = Convert.ToInt64(dr.Field<double>("ID")),
                Name = dr.Field<string>("Name"),
                DisplayOrder = Convert.ToInt32(dr.Field<double>("Display Order")),
                IsActive= dr.Field<double>("Active") == 1 ? true : false
            }
        ).ToList();
share|improve this question
    
I'd probably rewrite that last line as IsActive= dr.Field<double>("Active") > 0 (your ternary is unnecessary as the comparison already evaluates to true or false) because of double rounding issues and equality comparisons. – Jesse C. Slicer Sep 3 '13 at 4:52
    
Thanks Jesse. Anything else to improve? – devspider Sep 3 '13 at 4:55
1  
I gotta say, this looks pretty tip-top and concise. – Jesse C. Slicer Sep 3 '13 at 5:05
    
You can access DataRow fields by name directly: dr["ID"], dr["Name"] etc. They're of type object, but the Convert.To____() functions handle that. – Bobson Sep 3 '13 at 18:38

I created an extension method for DataTable to convert them into a List<T>

public static class Helper
{
    /// <summary>
    /// Converts a DataTable to a list with generic objects
    /// </summary>
    /// <typeparam name="T">Generic object</typeparam>
    /// <param name="table">DataTable</param>
    /// <returns>List with generic objects</returns>
    public static List<T> DataTableToList<T>(this DataTable table) where T : class, new()
    {
        try
        {
            List<T> list = new List<T>();

            foreach (var row in table.AsEnumerable())
            {
                T obj = new T();

                foreach (var prop in obj.GetType().GetProperties())
                {
                    try
                    {
                        PropertyInfo propertyInfo = obj.GetType().GetProperty(prop.Name);
                        propertyInfo.SetValue(obj, Convert.ChangeType(row[prop.Name], propertyInfo.PropertyType), null);
                    }
                    catch
                    {
                        continue;
                    }
                }

                list.Add(obj);
            }

            return list;
        }
        catch
        {
            return null;
        }
    }
}

Example:

DataTable dtTable = GetEmployeeDataTable();
List<Employee> employeeList = dtTable.DataTableToList<Employee>();
share|improve this answer
2  
You are a lifesaver! I had no idea Convert.ChangeType even existed until today. This is much simpler than what I had been doing, trying to invoke TryParse on the destination type. And this also takes care of DBNull values in the DataTable. 10/10! – Krummelz Jan 5 '15 at 10:50
    
Interesting code, I am just wondering of the performance of this in contrast to doing non generic code? – Nap Jan 8 '15 at 7:55
2  
One update you may want to consider for your great answer is only update editable properties to reduce all the exceptions that get thrown. The following would update your code appropriately: foreach (var prop in obj.GetType().GetProperties()**.Where(p=>p.CanWrite)**) – user1189362 Apr 28 '15 at 21:54
2  
@Nap The performance of this code would be horrifically slow due to reflection. Elegant solution with terrible performance. – Zer0 May 9 '15 at 10:13
1  
Just a side note: I advise to rename the method to ToList instead of DataTableToList because it's already an extension method of DataTable. And it'll also match the standard convention of ToString, ToChar etc. – BornToCode Nov 26 '15 at 16:39

This "foreach" code performs faster:

private static void TestForEach(DataTable table)
{
    var categoryList = new List<Category>(table.Rows.Count);
    foreach (DataRow row in table.Rows)
    {
        var values = row.ItemArray;
        var category = new Category()
            {
                Id = values[0],
                Name = values[1],
                DisplayOrder = values[2],
                IsActive = (double)values[3] == 1 ? true : false
            };
        categoryList.Add(category);
    }
}

Than your initial code:

private static void TestMethodOriginal(DataTable table)
{
    var list = table.AsEnumerable()
        .Skip(1)
        .Select(dr =>
            new Category
                {
                    Id = Convert.ToInt64(dr.Field<double>("ID")),
                    Name = dr.Field<string>("Name"),
                    DisplayOrder =
                        Convert.ToInt32(dr.Field<double>("Display Order")),
                    IsActive =
                        dr.Field<double>("Active") == 1 ? true : false
                }).ToList();
}

Here is the test code:

public static void Main(string[] args)
{
    DataTable table = GetDataTable();
    var sw = new Stopwatch();

    sw.Start();
    TestMethodOriginal(table);
    sw.Stop();
    Console.WriteLine("Elapsed={0}", sw.ElapsedMilliseconds);

    sw.Reset();

    sw.Start();
    TestForEach(table);
    sw.Stop();
    Console.WriteLine("Elapsed={0}", sw.ElapsedMilliseconds);

    Console.ReadKey();
}

private static DataTable GetDataTable()
{
    var table = new DataTable();
    table.Columns.Add("ID", typeof(double));
    table.Columns.Add("Name", typeof(string));
    table.Columns.Add("Display Order", typeof(double));
    table.Columns.Add("Active", typeof(double));

    var rand = new Random();

    for (int i = 0; i < 100000; i++)
    {
        table.Rows.Add(i, "name" + i, rand.Next(0, i), rand.Next(0, 2));
    }
    return table;
}

Performanse time in milleseconds (LINQ vs ForEach loop):

  • 1000 iterations -> 6 ms vs 1 ms
  • 10000 -> 20 vs 5
  • 100000 -> 193 vs 58
  • 1000000 -> 1716 vs 1179
share|improve this answer
1  
Hi lharS, indeed your foreach is faster. Though I used the row.Field<T> to map each property. It's a little slower than using the itemArray but still much faster than the linq approach. – devspider Sep 3 '13 at 8:21
3  
Also note the code is not orthogonal - the foreach version is not skipping the header row as it should. – Jesse C. Slicer Sep 3 '13 at 12:26
    
@devspider, I shouldn't use term "linq" in my answer... i'l edit it. The performance time relates to your initial code what ever we call it... – IharS Sep 3 '13 at 13:20
    
I noticed the missing skip by @JesseC.Slicer too. I'll do some speed test of having condition to skip first row. – devspider Sep 4 '13 at 5:46
    
Additionally, the foreach variant is not performing type conversion for the ID, Name, or Display fields. Along those lines, I wonder if the original LINQ statement needs to be doing Field<double> inside Convert.ToX statements instead of Field<X>. – Dan Lyons Oct 3 '13 at 18:10

You can lose some of the reflection badness in Gaui's answer with a little bit of refactoring and a little bit of caching as such:

public static class Helper
{
    private static readonly IDictionary<Type, IEnumerable<PropertyInfo>> _Properties =
        new Dictionary<Type, IEnumerable<PropertyInfo>>();

    /// <summary>
    /// Converts a DataTable to a list with generic objects
    /// </summary>
    /// <typeparam name="T">Generic object</typeparam>
    /// <param name="table">DataTable</param>
    /// <returns>List with generic objects</returns>
    public static IEnumerable<T> DataTableToList<T>(this DataTable table) where T : class, new()
    {
        try
        {
            var objType = typeof(T);
            IEnumerable<PropertyInfo> properties;

            lock (_Properties)
            {
                if (!_Properties.TryGetValue(objType, out properties))
                {
                    properties = objType.GetProperties().Where(property => property.CanWrite);
                    _Properties.Add(objType, properties);
                }
            }

            var list = new List<T>(table.Rows.Count);

            foreach (var row in table.AsEnumerable().Skip(1))
            {
                var obj = new T();

                foreach (var prop in properties)
                {
                    try
                    {
                        prop.SetValue(obj, Convert.ChangeType(row[prop.Name], prop.PropertyType), null);
                    }
                    catch
                    {
                    }
                }

                list.Add(obj);
            }

            return list;
        }
        catch
        {
            return Enumerable.Empty<T>();
        }
    }
}
share|improve this answer
    
May want to check for a set method on the property or change to iterate over the columns in row.Table.Columns so that read-only properties are supported – moarboilerplate Aug 24 '15 at 20:45
    
@moarboilerplate easily added. – Jesse C. Slicer Aug 24 '15 at 20:57

protected by Jamal Apr 28 '15 at 22:46

Thank you for your interest in this question. Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).

Would you like to answer one of these unanswered questions instead?

Not the answer you're looking for? Browse other questions tagged or ask your own question.