
I have an integer array of size N that contains duplicate values, and I don't know the range of the values. There are n/logn distinct values in this array; all the rest are duplicates.

Is there a way to sort it with O(n) time complexity and O(n/logn) memory complexity?

    
yes, I think you can sort that array in O(n) under those conditions. Is this homework? –  Andras Dec 22 '14 at 18:27
    
@Andras Yes. It's new for me and I'm a bit confused. –  user3885474 Dec 22 '14 at 18:30
let's look at it step by step. Sorting is an O(m log m) operation. You have m = n/logn items, and O(m * log m) = O(n/logn * log(n/logn)) <= O(n), so we can sort the different values in O(n) time as long as we can do a duplicate check in <= O(logn) time (because that too would be O(n/logn * logn), which doubles the runtime but O(2n) = O(n)). To detect duplicates in O(logn) you'd need some kind of balanced binary tree with O(logn) maintenance cost. Or there may be a clever way of using a sort algorithm that keeps a sorted array while it progresses, then you can check in the array so far –  Andras Dec 22 '14 at 21:25
Andras, this is not quite right. With a bbst, the duplicate check consumes O(n*log(n/log(n))) time. In my answer, I use a hash table to achieve expected O(n) time for the duplicate check. –  Kevin L. Stern Dec 23 '14 at 15:01

4 Answers

Maybe you can use this:

#include <iostream>
#include <algorithm>
#include <vector>
#include <map>
using namespace std;

int main()
{
    int n = 7;

    int v[] = {4, 1, 2, 3, 2, 4, 1}; // initial array

    map<int, int> mp;
    vector<int> uni;

    for (int i = 0; i < n; i++)
        if (++mp[v[i]] == 1)     // first occurrence of this value
            uni.push_back(v[i]); // unique value

    sort(uni.begin(), uni.end()); // sorting: O((n/logn) * logn) = O(n)

    int cur = 0;
    for (size_t i = 0; i < uni.size(); i++)
    {
        int cnt = mp[uni[i]];
        while (cnt) // adding the duplicate values back
        {
            v[cur++] = uni[i];
            cnt--;
        }
    }

    for (int i = 0; i < n; i++) // printing the final sorted array
        cout << v[i] << " ";
    cout << "\n";
    return 0;
}

Here the uni vector keeps the unique values, of which there are at most n/logn.
Then we used the STL sort() function, whose time complexity is O(m * logm) for m elements.
Since here m = n/logn, the complexity is O((n/logn) * logn) = O(n).

So we see that the above method works in O(n) time with O(n/logn) extra memory (for uni and the map).

Here map<> is used to count the number of times each distinct value appears.

    
updating mp[v[i]] could have an O(n^2) worst-case cost (if all hash keys collide). The hash algorithm and data structure would also have to be specified to guarantee O(logn) lookups and updates –  Andras Dec 22 '14 at 21:32
    
How can you say O(n^2) for the worst case? The complexity of a lookup in map<> (sorted) is O(logN), so the complexity will be at most O(n * logn) in the worst case. –  Ali Akber Dec 23 '14 at 4:14
    
This is not quite right. With a bbst, the duplicate check consumes O(n*log(n/log(n))) time. In my answer, I use a hash table to achieve expected O(n) time for the duplicate check. As Andras mentions above, the worst case lookup/insertion for a hash table is O(n); however, they are typically used as though they are amortized O(1) time (i.e. they are typically used as though the hashing function is perfect). –  Kevin L. Stern Dec 23 '14 at 15:03
    
@KevinL.Stern worst case lookup for map is O(logn). It uses binary search to search a value –  Ali Akber Dec 23 '14 at 15:22
    
Yes, but you are doing that lookup n times (for each element of the original array). Since mp will only grow to size n/log(n) (because it only contains non-duplicates), that gives O(n*log(n/log(n))) for the duplicate check. –  Kevin L. Stern Dec 23 '14 at 15:26

Simply keep key/value pairs:

    map<int, int> counts;
    for (int i = 0; i < n; i++)
    {
        // if a[i] is already in the map, increase its count;
        // otherwise it is inserted with count 1
        ++counts[a[i]];
    }

Now write this map's data to the output array using merge sort. Hope this will solve your problem.


Bubble sort takes O(n^2) time and needs virtually no extra memory to execute (other than what stores the original array) - see http://rosettacode.org/wiki/Sorting_algorithms/Bubble_sort . Quicksort is much faster in practice, though - see http://rosettacode.org/wiki/Quick_Sort

Duplicates don't affect either of these algorithms.

O(n^2) you mean –  Andras Dec 22 '14 at 18:26
    
@robin But bubble sort is O(n) only in the best case. I need an algorithm which is O(n) in the average case. –  user3885474 Dec 22 '14 at 18:29

Mergesort, e.g., is an O(n*lg(n)) time algorithm with O(n) space complexity.

You can use a hash map to extract the n/lg(n) unique items into their own array and note the number of times each item occurs; this is (expected) O(n) time and O(n/lg(n)) space. Now, you can run Mergesort on the new array, which is:

O(x*lg(x)) time with x=n/lg(n) ==
O(n/lg(n) * lg(n/lg(n))) ==
O(n/lg(n) * [lg(n) - lg(lg(n))]) ==
O(n - n*lg(lg(n))/lg(n))
<= O(n) time

and

O(x) space with x=n/lg(n) ==
O(n/lg(n)) space

Finally, expand the ordered array into the final result by duplicating elements based upon the duplication number noted in the hash map.

