Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

Let's say I have a plain text string that I want to split into a string array, using an integer array containing the indexes of this string where I want it to be cut.

Example:

Input string: "001ABCD2T"

Indexes array: {3, 7, 8}

Expected resulting string array: {"001", "ABCD", "2", "T"}

Here's how I do it for now:

    string line = "001ABCD2T";
    int[] markers = new int[] {3, 7, 8};

    for (int i = -1; i < markers.Length; i++)
    {
        string value = String.Empty;
        if (i == -1)
            value = line.Substring(0, markers[0]);
        else if (i == markers.Length - 1)
            value = line.Substring(markers[i], line.Length - markers[i]);
        else
            value = line.Substring(markers[i], markers[i + 1] - markers[i]);

        Console.WriteLine(value);
    }

The result is as expected, yet it is pretty eye-soring. Is there a cleaner way to write it, or even better, a already implemented method I didn't find?

share|improve this question

5 Answers 5

Disclaimer: I'm not actually very familiar with C#. I'll also refrain from the style review and let someone more familiar with common practices take a crack.

You can prepend 0 and append line.Length to your array, simplifying the logic. There's probably a better way, but I got it working with a List:

List<int> list = new List<int>();
list.Add(0);
list.AddRange(markers);
list.Add(line.Length);

for (int i = 0; i < list.Count - 1; i++)
{
    string value = line.Substring(list[i], list[i + 1] - list[i]);
    Console.WriteLine(value);
}
share|improve this answer
    
Eh, that's an idea to start with, I like that. I'm waiting a bit for some other answers before accepting, but it already simplifies the code a lot. –  Kilazur 23 hours ago
2  
@Kilazur No pressure to accept. In fact, accepting now may even discourage some people from writing more reviews, so by all means, leave it open! –  Schism 22 hours ago
    
No worries about that :p –  Kilazur 19 hours ago

I would write an extension method:

public static class StringExtensions
{
    public static IEnumerable<string> SplitByIndex(this string @string, params int[] indexes)
    {
        var previousIndex = 0;
        foreach (var index in indexes.OrderBy(i => i))
        {
            if (index > @string.Length)
            {
                break;
            }
            yield return @string.Substring(previousIndex, index - previousIndex);
            previousIndex = index;
        }

        yield return @string.Substring(previousIndex);
    }
}

And use it like this:

foreach (var part in "001ABCD2T".SplitByIndex(3, 7, 8))
{
    Console.WriteLine(part);
}

--EDIT--

Explanation. It's better to encapsulate this splitting algorithm in a method so we can re-use the same code later from another place. I like to use extension methods for general-puspose methods like this. I usually implement such functionality in some common assembly in a separate namespace (for example, MyCoolProject.Extensions.String). This way anyone who needs these extension methods can simply use the corresponding namespace in his code, and others will not be troubled by unnecessary methods.

As for the algorithm itself, it does not change our index array and it yields only one substring at a time. Plus, it's a bit more readable in my opinion because we do not use indexes to access index array, we're just iterating through it.

share|improve this answer
1  
alright, why would you do that? –  Malachi 17 hours ago
    
Given the context provided there is no reason to use an extension method. –  Assorted Trailmix 17 hours ago
1  
Would this answer constitute a reasonable checkin log message? From @200_Success's answer –  Malachi 16 hours ago
    
@Malachi Well, if the OP's code is written at one place and needs to be moved so it can be reused, it could be an useful answer I think. But by itself, no I don't think it is a reasonable check in log message. –  TopinFrassi 14 hours ago
    
Why the indexes.OrderBy(i => i)? With that I'd expect the following to result in exactly the same: "001ABCD2T".SplitByIndex(3, 7, 8), "001ABCD2T".SplitByIndex(8, 3, 7), "001ABCD2T".SplitByIndex(7, 8, 3); which is probably not how it should work. –  Corak 7 hours ago

Few helping variables and conditional operator:

string line = "001ABCD2T";
int[] markers = new int[] {3, 7, 8};

for (int i = -1, m = markers.Legth-1; i < markers.Length; i++)
{
    int from = i < 0 ? 0 : markers[i];
    int to = i == m ? line.Length : markers[i + 1];
    string value = line.Substring(from, to-from);

    Console.WriteLine(value);
}
share|improve this answer
    
Okay, what are we doing here and why? –  Malachi 17 hours ago
    
The code should be clear enough for what, if you understand ternary operator. Why? It extractes the proper indexes (from and to) to use single call to Substring. –  firda 16 hours ago
1  
Would this answer constitute a reasonable checkin log message? From @200_Success's answer –  Malachi 16 hours ago
    
Given the simplicity of the question, I don't see how you can even think about that meta-post. I would never think about changing such code, because it works and is not slow. So, in short, I really don't see your point. –  firda 16 hours ago
    
The quality of answers on our site matter. further discussion should be had in The Second Monitor –  Malachi 16 hours ago

You could do this without rebuilding the index array or any conditionals. Simply handle the first substring and the last substring outside of the loop:

string line = "001ABCD2T";
int[] markers = new int[] { 3, 7, 8 };
List<string> output = new List<string>{line.Substring(0, markers[0])};;
int i = 0;
int limit = markers.Length - 1;
for (; i < limit; i++)
{
    output.Add(line.Substring(markers[i], markers[i + 1] - markers[i]));
}
output.Add(line.Substring(markers[i], line.Length - markers[i]));
share|improve this answer

Assuming a forbidden token (cannot exist in the input string) is alright, here's a goofy idea idea using String.Split().

string line = "001ABCD2T";
int[] markers = { 3, 7, 8 };

const string tokenToSplitBy = "~";
int insertionCount = 0;

foreach (int index in markers)
    line = line.Insert(index + insertionCount++, tokenToSplitBy);

string[] resultArray = line.Split(new[]{tokenToSplitBy}, StringSplitOptions.RemoveEmptyEntries);

foreach (string result in resultArray)
    Console.WriteLine(result);
share|improve this answer
    
Fancy and simple, dunno why I didn't think about that. Yet the performance will drop if I have to work with thousands of lines with great length :p But clever overall. –  Kilazur 5 hours ago
    
Agreed ... but the "something cleaner" challenge part of the question sort of de-prioritized the performance angle a bit. –  shivsky 3 hours ago
    
Absolutely right, hence the appreciation; yet I hope you understand I will prefer an answer that fix my initial issue AND is optimization-friendly. –  Kilazur 3 hours ago

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.