Program slows down significantly based on input

Question

I've been writing a program to perform a kind of pattern-matching in XML and text files.

When my program reaches this section of the code, the CPU usage gets very high and the performance slows down to a point where the program seems to be frozen. It actually isn't, but depending on the input (number of text files and their content), it may take several hours to complete the task.

I'm looking for a more efficient way to rewrite this section of the code:

List<string> CandidatesRet = new List<string>();

for (int indexCi = 0; indexCi < Ci.Count - 1; indexCi++)
{
    // generate all sub itemset with length-1

    string[] allItems = Ci[indexCi].Split(new char[] { ' ' });
    for (int i = 0; i < allItems.Length; i++)
    {
        string tempStr = "";
        for (int j = 0; j < allItems.Length; j++)
            if (i != j)
                tempStr += allItems[j] + " ";
        tempStr = tempStr.Trim();
        subItemset.Add(tempStr);
    }


    // THE PROBLEM BEGINS HERE 

    foreach (string subitem in subItemset)
    {

        int iFirtS;
        for (int indexCommon = indexCi + 1; indexCommon < Ci.Count; indexCommon++)
            if ((iFirtS = Ci[indexCommon].IndexOf(subitem)) >= 0)
            {
                string[] listTempCi = Ci[indexCommon].Split(new char[] { ' ' });
                foreach (string itemCi in listTempCi)
                    if (!subitem.Contains(itemCi))
                        commonItem.Add(itemCi);
            }
        allCommonItems.Add(commonItem);
    }

    // generate condidate from common item
    foreach (string item in oldItemsetCi)
    {
        bool flagCi = true;
        foreach (List<string> listCommItem in allCommonItems)
            if (!listCommItem.Contains(item))
            {
                flagCi = false;
                break;
            }
        if (flagCi)
            CandidatesRet.Add((Ci[indexCi] + " " + item).Trim());
    }

There are many nested loops and I know this is the problem. What do you think should be improved?

Jesse C. Slicer · Accepted Answer · 2013-08-06 02:26:54Z

This should help a little bit:

        List<string> CandidatesRet = new List<string>();

        var splitChars = new[] { ' ' };

        for (var indexCi = 0; indexCi < Ci.Count - 1; indexCi++)
        {
            // generate all sub itemset with length-1
            var allItems = Ci[indexCi].Split(splitChars);

            subItemset
                .AddRange(allItems.Select((s, i) => allItems.Where((t, j) => i != j)
                .Aggregate(string.Empty, (current, t) => current + (t + " ")))
                .Select(tempStr => tempStr.Trim()));

            // THE PROBLEM BEGINS HERE 
            foreach (var subitem in subItemset)
            {
                for (var indexCommon = indexCi + 1; indexCommon < Ci.Count; indexCommon++)
                {
                    if (Ci[indexCommon].IndexOf(subitem, StringComparison.Ordinal) < 0)
                    {
                        continue;
                    }

                    var listTempCi = Ci[indexCommon].Split(splitChars);

                    commonItem.AddRange(listTempCi.Where(itemCi => !subitem.Contains(itemCi)));
                }

                allCommonItems.Add(commonItem);
            }

            // generate condidate from common item
            CandidatesRet.AddRange(oldItemsetCi
                .Select(item => new { item, flagCi = allCommonItems.All(listCommItem => listCommItem.Contains(item)) })
                .Where(t => t.flagCi)
                .Select(t => (Ci[indexCi] + " " + t.item).Trim()));
        }
    }

However, the big problem is the still the double loop, which will grow as a power-of-two rather than linearly with your data size.

Dan · Answer 2 · 2013-08-09 04:02:40Z

up vote 1 down vote

Use a StringBuilder instead of string concatenation. You will be surprised at how slow string concatenation is in bulk.

answered Aug 9 '13 at 4:02

Dan
111

add comment

asked	10 months ago
viewed	90 times
active	10 months ago

current community

your communities

more stack exchange communities

Program slows down significantly based on input

2 Answers

Your Answer

Not the answer you're looking for? Browse other questions tagged c# performance or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Program slows down significantly based on input

2 Answers

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged c# performance or ask your own question.

Related

Hot Network Questions