Tell me more ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I have a two dimensional string array look like this: enter image description here
The first column contains characters of many strings, other columns are extra data for character.
I want to search a string (maybe change to array character) in this array to get all match indexes (start - end). For example, when I search with key "next", the result should be [5 - 8], [13 - 16] (the highlight parts in image above).
Shortly, I need a method look like this:

  public static List<Interval> search(String searchText, String[][] data, int columnsCount, int rowCount){
      // Convert search text to String array
      String[] searchArr = getStringArray(searchText);
      // then search in data

  }

  // where Interval is:
  public class Interval{
       public int start;
       public int end;
  }   

Is there any fast way to search like this,cause my data is very large?
Thanks in advance!

share|improve this question
 
One of the most effective search is BinarySearch. Red Black Tree or AVL Tree can be implemented for more effective search. –  erencan 26 mins ago
1  
There are a whole bunch of String searching algorithms. –  Domi 25 mins ago
 
Also, note that if your array contains [letter1, data, data, data..., letter2, data...], you will get a bad cache hit rate, which is essential for performance when working with large data sets. Try to re-arrange your data as [letter1, letter2, ... letterN, data, data, ...]. Here is why. (don't mind that he is talking about C++, this applies to all languages). –  Domi 23 mins ago
add comment

3 Answers

I would recommend to adapt the String[][] to a CharSequence. Then you can use a Pattern.macther(CharSequence) and you will have the full power of regexp. That's the benefit of interfaces.

class Array2DColumnCharSequnce implements CharSequence {

    private int column;
    private String[][] array2d;
    private int endIndex;
    private int startIndex;

    public Array2DColumnCharSequnce(String[][] array2d, int column) {
        this(array2d, column, 0, array2d[column].length);
        this.array2d = array2d;
        this.column = column;
    }

    public Array2DColumnCharSequnce(String[][] array2d, int column,
            int startIndex, int endIndex) {
        this.array2d = array2d;
        this.column = column;
        this.startIndex = startIndex;
        this.endIndex = endIndex;
    }

    public int length() {
        return endIndex - startIndex;
    }

    public char charAt(int index) {
        String charString = array2d[column][startIndex + index];
        return charString.charAt(0);
    }

    public CharSequence subSequence(int start, int end) {
        Array2DColumnCharSequnce array2dColumnCharSequnce = new Array2DColumnCharSequnce(
                array2d, column, start, end);
        return array2dColumnCharSequnce;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(this);
        return sb.toString();
    }
}

Try this

public class Main {

    public static void main(String[] args) {
        String[][] array2d = new String[3][5];
        array2d[0][0] = "H";
        array2d[0][1] = "e";
        array2d[0][2] = "l";
        array2d[0][3] = "l";
        array2d[0][4] = "o";

        array2d[1][0] = "W";
        array2d[1][1] = "o";
        array2d[1][2] = "r";
        array2d[1][3] = "l";
        array2d[1][4] = "d";

        array2d[2][0] = "J";
        array2d[2][1] = "a";
        array2d[2][2] = "v";
        array2d[2][3] = "a";
        array2d[2][4] = " ";

        CharSequence columnCharSequnce = new Array2DColumnCharSequnce(array2d,
                0);
        System.out.println(columnCharSequnce.toString());

        Pattern patttern = Pattern.compile("ello");
        Matcher matcher = patttern.matcher(columnCharSequnce);

        System.out.println("ello found in column 0: " + matcher.find());
    }
}

Note: The Array2DColumnCharSequnce is just a quick implementation and it does not address exception handling yet nor it addresses what happens when there are more than one char in a string column.

share|improve this answer
add comment

It is similar to search a subString in a String.

e.g.

A B C D N E X T J H  J  N   E   N   E   X   T   O 

0 1 2 3 4 5 6 7 8 9 10  11  12  13  14  15  16  17

So the answer should be [4-7] and [13-16].

public static List<Integer> findIndexes(String source, String toFind){
    List<Integer> list = new LinkedList<Integer>();//it will return the starting indexes of the found substring, we can easily find the end e=index by adding the length of the other. 
    int start = 0;
    while(start < source.length()){
        if(source.charAt(start)==toFind.charAt(0)){//if the char is same then find whether the whole toFind string is present or not.
            if(isMatch(source, toFind, start)){//if it is found than increment the source pointer to the end after the toFind string
                list.add(start);
                start = start+toFind.length();
                continue;
            }
        }
        start++;
    }
    return list;
}
private static boolean isMatch(String s1, String s2, int srcIndex){
    int desIndex = 0;
    while(desIndex<s2.length() && s1.charAt(srcIndex)==s2.charAt(desIndex)){
        srcIndex++;
        desIndex++;
    }
    if(desIndex==s2.length()){
        return true;
    }
    return false;
}

And sample driver program:

public static void main(String[] args) {    
        String s1="abcdnextponexnextpour";
        String s2 = "next";
        List<Integer> list = findIndexes(s1, s2);
        for(int i : list){
            System.out.println(i);
        }
    }

It will output the indexes:

4
13

i.e. you can add the length of the toFind String to calculate the last index.

share|improve this answer
add comment

I would implement search as follows -

public static List<Interval> search(
    String searchText, String[][] data) {
  List<Interval> al = new ArrayList<>();
  if (searchText != null) {
    searchText = searchText.trim().toUpperCase();
    char[] toMatch = searchText.toCharArray();
    for (int i = 0; i < data.length; i++) {
      if (data[i] != null && data.length > i
          && data[i].length > 0
          && data[i][0].charAt(0) == toMatch[0]) {
        boolean matched = true;
        for (int t = 1; t < toMatch.length; t++) {
          if (i + t > data.length
              || data[i + t][0].charAt(0) != toMatch[t]) {
            i += (t - 1);
            matched = false;
            break;
          }
        }
        if (matched) {
          Interval interval = new Interval();
          interval.start = i - 1;
          interval.end = interval.start + (toMatch.length - 1);
          al.add(interval);
        }
      }
    }
  }
  return al;
}

And, I would modify Interval to add a toString() like this

public String toString() {
  return String.valueOf(start) + "-" + end;
}
share
add comment

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.