Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

Here is my code for removing duplicated values from an array. I think I tested it with the most possible cases. Any suggestions or bugs?

public static int[] removeDuplicates(int[] arr){ 

    int end = arr.length;

    for(int i = 0; i < end; i++){
        for(int j = i + 1; j < end; j++){
            if(arr[i] == arr[j]){                  
                int shiftLeft = j;
                for(int k = j+1; k < end; k++, shiftLeft++){
                    arr[shiftLeft] = arr[k];
                }
                end--;
                j--;
            }
        }
    }

    int[] whitelist = new int[end];
    for(int i = 0; i < end; i++){
        whitelist[i] = arr[i];
    }
    return whitelist;
}

After some tests, it appears really inefficient because an array with 1,000,000 elements takes a very long time to end. Is there any better way to implement this on arrays?

EDIT: It's now fixed and more efficient. Are there still any suggestions?

//sorted array of ints as input
public static int[] removeDuplicates(int[] arr){
    //dest array index
    int destination = 0;

    //source array index
    int source = 0;

    int currentValue = arr[0];

    int[] whitelist = new int[arr.length];
    whitelist[destination] = currentValue;
    while(source < arr.length){

        if(currentValue == arr[source]){
            source++;
        } else {
            currentValue = arr[source];
            destination++;
            source++;
            whitelist[destination] = currentValue;
        }
    }
    int[] returnList = new int[++destination];

    for(int i = 0; i < destination; i++){
        returnList[i] = whitelist[i];
    }

    return returnList;
}
share|improve this question
2  
Use Set(Eg HashSet). Duplicate values are not allowed in it. Just iterate over the array and add it to the Set.If you want result in an array copy the set to the target array. –  Aniket Thakur Jul 31 '13 at 9:59
add comment

4 Answers

up vote 5 down vote accepted

You're following the same philosophy as the bubble sort, which is very, very, very slow. Have you tried this?:

  • Sort your unordered array with quicksort. Quicksort is much faster than bubble sort (I know, you are not sorting, but the algorithm you follow is almost the same as bubble sort to traverse the array).
  • Then start removing duplicates (repeated values will be next to each other). In a for loop you could have two indices: source and destination. (On each loop you copy source to destination unless they are the same, and increment both by 1). Every time you find a duplicate you increment source (and don't perform the copy).

UPDATE

Your edited code doesn't work for the case arr = new int[] {1, 2, 2, 3} (the last value [3] is lost). Try moving this line: whitelist[destination] = currentValue; to the else clause and also copy it before your while loop. Like this:

whitelist[destination] = currentValue;
while(source < arr.length){
    if(currentValue == arr[source]){
        source++;
    } else {
        currentValue = arr[source];
        destination++;
        source++;
        whitelist[destination] = currentValue;
    }
}
share|improve this answer
    
Thanks a lot! I implemented this and it changed from almost 20 min to few miliseconds. I'll edit question to review. Thank you! –  ashur Jul 31 '13 at 11:26
    
Answer updated in response to your edit –  morgano Jul 31 '13 at 12:18
    
Fixed, thanks again. Ouch... hope that was last bug :P –  ashur Jul 31 '13 at 12:35
add comment

Suggestion:

public static Integer[] removeDuplicates(Integer[] arr) {
  return new HashSet<Integer>(Arrays.asList(arr)).toArray(new Integer[0]);
}

Another solution might be:

public static int[] removeDuplicates(int[] arr) {
  Set<Integer> alreadyPresent = new HashSet<Integer>();
  int[] whitelist = new int[0];

  for (int nextElem : arr) {
    if (!alreadyPresent.contains(nextElem)) {
      whitelist = Arrays.copyOf(whitelist, whitelist.length + 1);
      whitelist[whitelist.length - 1] = nextElem;
      alreadyPresent.add(nextElem);
    }
  }

  return whitelist;
}

Here you only iterate once via arr.

share|improve this answer
    
Thanks, but I was asked to implement it myself - not to use ready tool. Anyway after some tests it seems it's really inefficient. Array with 1 000 000 elements takes very long time. –  ashur Jul 31 '13 at 9:41
1  
Please see my edited answer. –  András Hummer Jul 31 '13 at 9:53
1  
You could avoid copying the whitelist for every unique element, and start it off by having the same size as the original, then only copy the subarray with unique lements just before returning. –  bowmore Jul 31 '13 at 11:26
1  
@bowmore, please write it as a solution in a separate answer, the credit is yours. –  András Hummer Jul 31 '13 at 13:52
    
Incrementally increasing whitelist by 1 creates a time complexity of O(N^2). Consider using ArrayList which expands dynamically, and returning arraylist.toArray(); That would run in O(N). –  awashburn Aug 10 '13 at 19:47
show 1 more comment

A slight improvement on Dr H's

public static int[] removeDuplicates(int[] arr) {
    Set<Integer> alreadyPresent = new HashSet<>();
    int[] whitelist = new int[arr.length];
    int i = 0;

    for (int element : arr) {
        if (alreadyPresent.add(element)) {
            whitelist[i++] = element;
        }
    }

    return Arrays.copyOf(whitelist, i);
}

This does only one array copy at the end. It also takes advantage of the fact that Set.add() returns a boolean that indicated if the Set changed, and so avoids an explicit contains() check.

share|improve this answer
    
this is brilliant. –  András Hummer Jul 31 '13 at 22:51
add comment

This is a simple way to delete elements in the array:

public class DublicatesRemove {
    public static void main(String args[]) throws Exception {

        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        System.out.println("enter size of the array");
        int l = Integer.parseInt(br.readLine());
        int[] a = new int[l];
        // insert elements in the array logic
        for (int i = 0; i < l; i++) 
        {
            System.out.println("enter a element");
            int el = Integer.parseInt(br.readLine());
            a[i] = el;
        }
        // sorting elements in the array logic
        for (int i = 0; i < l; i++) 
        {
            for (int j = 0; j < l - 1; j++) 
            {
                if (a[j] > a[j + 1])
                {
                    int temp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = temp;
                }
            }
        }
        // remove duplicate elements logic
        int b = 0;
        a[b] = a[0];
        for (int i = 1; i < l; i++)
        {
            if (a[b] != a[i])
            {
                b++;
                a[b]=a[i];
            }
        }
        for (int i = 0; i <= b; i++)
        {
            System.out.println(a[i]);
        }
    }
}
share|improve this answer
add comment

protected by Jamal Jan 22 at 0:26

Thank you for your interest in this question. Because it has attracted low-quality answers, posting an answer now requires 10 reputation on this site.

Would you like to answer one of these unanswered questions instead?

Not the answer you're looking for? Browse other questions tagged or ask your own question.