Counting occurrences of substring in string

Question

I need to know how many times a substring occurs in a given string. I figured that I might as well create an extension method:

public static int Occurences(this string str, string val)
{
    string copy = str;

    int instancesOf = 0;
    int indexOfVal;

    while ((indexOfVal = copy.IndexOf(val)) != -1)
    {
        copy = copy.Remove(indexOfVal, val.Count());
        instancesOf++;
    }

    return instancesOf;
}

I'm not sure that the assignment in the while loop is good practice. Should I change this to compute the value once outside the loop, and once in the loop, like this?

int indexOfVal = copy.IndexOf(val);

while (indexOfVal != -1)
{
    copy = copy.Remove(indexOfVal, val.Count());
    instancesOf++;
    indexOfVal = copy.IndexOf(val);
}

Any and all comments appreciated, the more the better.

Do you want to find overlapping instances? "bbb".Occurences("bb") returns 1. — mjolka, yesterday
@Hosch250 You need to document expected behavior. Should "bbb".Occurrences("bb") return 1 or 2? — nhgrif, yesterday
Either will be fine, but I might as well have it return 2 for the sake of a thorough implementation. — Hosch250, yesterday
If your val is long, and your str is pathological, the running time could be O(n^2). To avoid that, you could use a smarter string-searching algorithm, such as the Z Algorithm, which works in O(n) time. — 200_success♦, yesterday

7 revs · Accepted Answer · 2015-03-30 01:21:49Z

up vote 6 down vote accepted

Your current implementation could be more efficient. Copying the passed in string and chopping it up is probably pretty expensive (I'm not a master of .NET internals though).

val.Count()

Why not val.Length? Count() is very inefficient compared to Length.

copy = copy.Remove(indexOfVal, val.Count());

We don't need to remove part of the string, just search again starting at indexOfVal + val.Length.

Here is an example using an overload of String.IndexOf:

public static int Occurences(this string str, string val)
{  
    int occurrences = 0;
    int startingIndex = 0;

    while ((startingIndex = str.IndexOf(val, startingIndex)) >= 0) 
    {
        ++occurrences;
        ++startingIndex;
    }

    return occurrences;
}

This implementation will count overlapping occurrences (so "bbb".Occurences("bb") will return 2.

If you don't want to count overlapping occurences, you can replace ++startingIndex; with:

startingIndex += val.Length

On why an exception isn't thrown in the case of "foo".Occurrences("o"), from MSDN:

If startIndex equals the length of the string instance, the method returns -1.

edited yesterday

community wiki

7 revs
nhgrif

Calling .IndexOf() with a startingIndex greater than the length of the string crashes, but it actually works in this statement because the assignment fails. – Hosch250 yesterday

"probably pretty expensive" and "very inefficient" are not terms I would use to describe Count() and splitting strings. These are micro-optimizations and most of the time won't make a licking difference. Almost certainly, the implementation of Count() just returns Length for a string. – craftworkgames 2 hours ago

@craftworkgames It's a community wiki answer. You're welcome to improve it as you see fit. – nhgrif 2 hours ago

@craftworkgames Probably true, but using .Length is probably the more correct version, and my code needs to be fast as it will be running on mobile devices as well as desktops. – Hosch250 2 hours ago

add a comment |

asked	yesterday
viewed	153 times
active	yesterday

current community

your communities

more stack exchange communities

Counting occurrences of substring in string

1 Answer 1

Your Answer

Not the answer you're looking for? Browse other questions tagged c# strings loop or ask your own question.

Linked

Hot Network Questions

current community

your communities

more stack exchange communities

Counting occurrences of substring in string

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged c# strings loop or ask your own question.

Linked

Related

Hot Network Questions