Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I need to know how many times a substring occurs in a given string. I figured that I might as well create an extension method:

public static int Occurences(this string str, string val)
{
    string copy = str;

    int instancesOf = 0;
    int indexOfVal;

    while ((indexOfVal = copy.IndexOf(val)) != -1)
    {
        copy = copy.Remove(indexOfVal, val.Count());
        instancesOf++;
    }

    return instancesOf;
}

I'm not sure that the assignment in the while loop is good practice. Should I change this to compute the value once outside the loop, and once in the loop, like this?

int indexOfVal = copy.IndexOf(val);

while (indexOfVal != -1)
{
    copy = copy.Remove(indexOfVal, val.Count());
    instancesOf++;
    indexOfVal = copy.IndexOf(val);
}

Any and all comments appreciated, the more the better.

share|improve this question
1  
Do you want to find overlapping instances? "bbb".Occurences("bb") returns 1. –  mjolka yesterday
    
@mjolka That might be nice, but is not necessary. –  Hosch250 yesterday
    
@Hosch250 You need to document expected behavior. Should "bbb".Occurrences("bb") return 1 or 2? –  nhgrif yesterday
    
Either will be fine, but I might as well have it return 2 for the sake of a thorough implementation. –  Hosch250 yesterday
    
If your val is long, and your str is pathological, the running time could be O(n^2). To avoid that, you could use a smarter string-searching algorithm, such as the Z Algorithm, which works in O(n) time. –  200_success yesterday

1 Answer 1

up vote 6 down vote accepted

Your current implementation could be more efficient. Copying the passed in string and chopping it up is probably pretty expensive (I'm not a master of .NET internals though).

val.Count()

Why not val.Length? Count() is very inefficient compared to Length.

copy = copy.Remove(indexOfVal, val.Count());

We don't need to remove part of the string, just search again starting at indexOfVal + val.Length.

Here is an example using an overload of String.IndexOf:

public static int Occurences(this string str, string val)
{  
    int occurrences = 0;
    int startingIndex = 0;

    while ((startingIndex = str.IndexOf(val, startingIndex)) >= 0) 
    {
        ++occurrences;
        ++startingIndex;
    }

    return occurrences;
}

This implementation will count overlapping occurrences (so "bbb".Occurences("bb") will return 2.

If you don't want to count overlapping occurences, you can replace ++startingIndex; with:

startingIndex += val.Length

On why an exception isn't thrown in the case of "foo".Occurrences("o"), from MSDN:

If startIndex equals the length of the string instance, the method returns -1.

share|improve this answer
    
Calling .IndexOf() with a startingIndex greater than the length of the string crashes, but it actually works in this statement because the assignment fails. –  Hosch250 yesterday
    
"probably pretty expensive" and "very inefficient" are not terms I would use to describe Count() and splitting strings. These are micro-optimizations and most of the time won't make a licking difference. Almost certainly, the implementation of Count() just returns Length for a string. –  craftworkgames 2 hours ago
    
@craftworkgames It's a community wiki answer. You're welcome to improve it as you see fit. –  nhgrif 2 hours ago
    
@craftworkgames Probably true, but using .Length is probably the more correct version, and my code needs to be fast as it will be running on mobile devices as well as desktops. –  Hosch250 2 hours ago

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.