Take the 2-minute tour ×
Programming Puzzles & Code Golf Stack Exchange is a question and answer site for programming puzzle enthusiasts and code golfers. It's 100% free, no registration required.

I recently saw this Javascript code on StackOverflow for merging two arrays, and removing duplicates:

Array.prototype.unique = function() {
    var a = this.concat();
    for(var i=0; i<a.length; ++i) {
        for(var j=i+1; j<a.length; ++j) {
            if(a[i] === a[j])
                a.splice(j--, 1);
        }
    }
    return a;
};

var array1 = ["Vijendra","Singh"];
var array2 = ["Singh", "Shakya"];
var array3 = array1.concat(array2).unique(); 

While this code works, it is horribly inefficient (O(n^2)). Your challenge is to make an algorithm with less complexity.

The winning criteria is the solution with the least complexity, but ties will be broken by shortest length in characters.

Requirements:

Package all your code together in a function that meets the following requirements for "correctness:"

  • Input: Two arrays
  • Output: One array
  • Merges elements of both arrays together- Any element in either input array must be in the outputted array.
  • The outputted array should have no duplicates.
  • Order doesn't matter (unlike the original)
  • Any language counts
  • Don't use the standard library's array functions for detecting uniqueness or merging sets/arrays (although other things from the standard library is okay). Let me make the distinction that array concatenation is fine, but functions that already do all of the above are not.
share|improve this question
    
How are we supposed to create or append to an array without using array functions? –  Emil Vikström Jan 2 at 1:11
    
@EmilVikström See my edit. I meant that you can't use array uniqueness functions. Sorry for being unclear. –  troglodite Jan 2 at 1:12
    
If one of the arrays has duplicates in it, do we remove them as well? For example, should merging [1, 2, 2, 3] and [2, 3, 4] return [1, 2, 2, 3, 4] or [1, 2, 3, 4]? –  O-I Jan 2 at 1:18
1  
@O-I Yes, that would make it too easy. –  troglodite Jan 2 at 1:24
1  
May I ask: Arrays of what? Can we assume simply integers or strings, or do we also have to allow more complex things like multilevel objects? –  jawns317 Jan 2 at 1:30
show 5 more comments

12 Answers

up vote 3 down vote accepted

Perl

27 Characters

Simple Perl Hack

my @vals = ();
push @vals, @arr1, @arr2;
my %out;
map { $out{$_}++ } @vals;
my @unique = keys %out;

I'm sure someone could one-liner this.. and thus (Thanks Dom Hastings)

sub x{$_{$_}++for@_;keys%_}
share|improve this answer
1  
"Don't use the standard library's array functions for detecting uniqueness (although other things form the standard library is okay)" –  Jan Dvorak Jan 2 at 7:17
    
How I am violating that rule? I'm not using unique functions? –  Zach Leighton Jan 2 at 11:51
    
How does it work, then? Sorry, I can't read perl. If it reads the keys of a hash map - does that count as OK with that rule? I won't vote until convinced that it is. –  Jan Dvorak Jan 2 at 11:54
    
It combines the arrays, loops over both and adds to a hash incrementing the value who's key is the current value in the array loop. Then it takes the keys of that hash, I have used this in some of my work.. So [1,1,2,3,4,4] becomes { 1 => 2, 2 => 1, 3 => 1, 4 => 2 } –  Zach Leighton Jan 2 at 11:59
    
@ZachLeighton you can shorten the code to 27 chars with sub x{$_{$_}++for@_;keys%_} (in case it comes down to a tie!) and use as: z((1,2,3,4),(2,3,4,5,6)) –  Dom Hastings Jan 2 at 12:13
show 3 more comments

JavaScript O(N) 131 124 116 92 (86?)

Golfed version:

function m(i,x){h={};n=[];for(a=2;a--;i=x)i.map(function(b){h[b]=h[b]||n.push(b)});return n}

Human readable golfed version:

function m(i,x) {
   h = {}
   n = []
   for (a = 2; a--; i=x)
      i.map(function(b){
        h[b] = h[b] || n.push(b)
      })
   return n
}

I could use concat like so and do it in 86 characters:

function m(i,x){h={};n=[];i.concat(x).map(function(b){h[b]=h[b]||n.push(b)});return n}

But I am not sure if it is still O(N) based on this JsPerf: http://jsperf.com/unique-array-merging-concat-vs-looping as the concat version is marginally faster with smaller arrays but slower with larger arrays (Chrome 31 OSX).

In practice do this (golf is full of bad practices):

function merge(a1, a2) {
   var hash = {};
   var arr = [];
   for (var i = 0; i < a1.length; i++) {
      if (hash[a1[i]] !== true) {
        hash[a1[i]] = true;
        arr[arr.length] = a1[i];
      }
   }
   for (var i = 0; i < a2.length; i++) {
      if (hash[a2[i]] !== true) {
        hash[a2[i]] = true;
        arr[arr.length] = a2[i];
      }
   }
   return arr;
}
console.log(merge([1,2,3,4,5],[1,2,3,4,5,6]));

I'm not great at computing complexity but I believe this is O(N). Would love if someone could clarify.

Edit: Here is a version that takes any number of arrays and merges them.

function merge() {
   var args = arguments;
   var hash = {};
   var arr = [];
   for (var i = 0; i < args.length; i++) {
      for (var j = 0; j < args[i].length; j++) {
        if (hash[args[i][j]] !== true) {
          arr[arr.length] = args[i][j];
          hash[args[i][j]] = true;
        }
      }
    }
   return arr;
}
console.log(merge([1,2,3,4,5],[1,2,3,4,5,6],[1,2,3,4,5,6,7],[1,2,3,4,5,6,7,8]));
share|improve this answer
    
This is almost exactly what I was gonna post in a couple of seconds :-( Yes, it is amortized linear time if hash tables are implemented with amortized constant time for insertion and searching (which is common in many languages, don't know specifically about JS). –  Emil Vikström Jan 2 at 1:14
    
@EmilVikström Thanks for that I believe JavaScript does but don't have proof on it. Apologies for having fast fingers, slowed yourself down with comments :P –  George Reith Jan 2 at 1:17
    
This is a great approach. However, could you also provide a "code-golf" style solution in addition to your nicely formatted version? Seeing that multiple people have thought of this as the right approach, there's probably going to be a tie at O(N). –  troglodite Jan 2 at 1:21
    
@cloudcoder2000 Ok, I wanted to print a full version as the code-golf version is likely to be less efficient in practice. –  George Reith Jan 2 at 1:24
1  
@cloudcoder2000 They are not fully independent so the worst case is not O(A*B) (Not using N because it's confusing). It would be that if every input array (every A) had the same amount of elements (B) as it is actually O(SUM(B) FOR ALL A), which can be rewritten as O(N) when defining N as the count of elements of all array inputs. –  meiamsome Jan 2 at 4:17
show 18 more comments

Python 2.7, 38 chars

F=lambda x,y:{c:1 for c in x+y}.keys()

Should be O(N) assuming a good hash function.

Wasi's 8 character set implementation is better, if you don't think it violates the rules.

share|improve this answer
    
Nice! Comprehensions in Python can be so elegant and powerful. –  O-I Jan 2 at 17:23
add comment
Array.prototype.unique = function()
{
  var o = {},i = this.length
  while(i--)o[this[i]]=true
  return Object.keys(o)
}

A function that would take n arrays could be the following:

function m()
{
  var o={},a=arguments,c=a.length,i;
  while(c--){i=a[c].length;while(i--)o[a[c][i]] = true} 
  return Object.keys(o);
}

Golfed, I think this should work ( 117 chars )

function m(){var o={},a=arguments,c=a.length,i;while(c--){i=a[c].length;while(i--)o[a[c][i]]=1}return Object.keys(o)}

Update If you want to keep the original type, you could

function m()
{
  var o={},a=arguments,c=a.length,f=[],g=[];
  while(c--)g.concat(a[c])
  c = g.length      
  while(c--){if(!o[g[c]]){o[g[c]]=1;f.push(g[c])}}
  return f
}

or golfed 149:

function m(){var o={},a=arguments,c=a.length,f=[],g=[];while(c--)g.concat(a[c]);c= g.length;while(c--){if(!o[g[c]]){o[g[c]]=1;f.push(g[c])}}return f}

This still can cast some doubts, if you want to distinguish 123 and '123', then this would not work..

share|improve this answer
    
Thanks for the answer. It is impressively short, however this only does half the problem. You also need to include in the solution the actual merging part (even if its the same as in the original example) and put it all together in one function. Also, could you provide "golfed" version in addition to this (as it is O(N))? –  troglodite Jan 2 at 2:16
    
This casts all members into strings. e.g. m([1,2,3,4,5],[2,3,4,5,6],[2,3,4,5,6,7]) becomes ["1", "2", "3", "4", "5", "6", "7"] –  George Reith Jan 2 at 2:22
add comment

python,46

def A(a,b):print[i for i in b if i not in a]+a

Or, using set operation simply

python, 8

set(a+b)
share|improve this answer
    
Sorry it wasn't clear, using set operations is also cheating. –  troglodite Jan 2 at 19:09
add comment

Perl

23 bytes, if we only count the code block inside subroutine. Could be 21, if overwriting global values is allowed (it would remove my from the code). It returns elements in random order, because order doesn't matter. As for complexity, on average it's O(N) (depends on number of hash collisions, but they are rather rare - in worst case it can be O(N2) (but this shouldn't happen, because Perl can detect pathological hashes, and changes the hash function seed when it detects such behavior)).

use 5.010;
sub unique{
    my%a=map{$_,1}@_;keys%a
}
my @a1 = (1, 2, 3, 4);
my @a2 = (3, 4, 5, 6);
say join " ", unique @a1, @a2;

Output (also showing randomness):

/tmp $ perl unique.pl 
2 3 4 6 1 5
/tmp $ perl unique.pl 
5 4 6 2 1 3
share|improve this answer
add comment

Fortran: 282 252 233 213

Golfed version:

function f(a,b,m,n) result(d);integer::m,n,a(m),b(n),c(m+n);integer,allocatable::d(:);j=m+1;c(1:m)=a(1:m);do i=1,n;if(.not.any(b(i)==c(1:m)))then;c(j)=b(i);j=j+1;endif;enddo;allocate(d(j-1));d=c(1:j-1);endfunction

Which not only looks infinitely better but will actually compile (too long a line in its golfed form) with the human-readable form:

function f(a,b,m,n) result(d)
  integer::m,n,a(m),b(n),c(m+n)
  integer,allocatable::d(:)
  j=m+1;c(1:m)=a(1:m)
  do i=1,n
     if(.not.any(b(i)==c(1:m)))then
        c(j)=b(i);j=j+1
     endif
  enddo
  allocate(d(j-1))
  d=c(1:j-1)
end function

This should be O(n) as I copy a into c and then check each b against all of c. The last step is to eliminate the garbage that c will contain since it is uninitialized.

share|improve this answer
add comment

One way in Ruby

To keep within the rules outlined above, I would use a similar strategy as the JavaScript solution and use a hash as an intermediary.

merged_arr = {}.tap { |hash| (arr1 + arr2).each { |el| hash[el] ||= el } }.keys

Essentially, these are the steps I'm going through in the line above.

  1. Define a variable merged_arr that will contain the result
  2. Initialize an empty, unnamed hash as an intermediary to put unique elements in
  3. Use Object#tap to populate the hash (referenced as hash in the tap block) and return it for subsequent method chaining
  4. Concatenate arr1 and arr2 into a single, unprocessed array
  5. For each element el in the concatenated array, put the value el in hash[el] if no value of hash[el] currently exists. The memoization here (hash[el] ||= el) is what ensures the uniqueness of elements.
  6. Fetch the keys (or values, as they are the same) for the now populated hash

This should run in O(n) time. Please let me know if I've made any inaccurate statements or if I can improve the above answer either for efficiency or readability.

Possible improvements

Using memoization is probably unnecessary given that the keys to the hash are going to be unique and the values are irrelevant, so this is sufficient:

merged_arr = {}.tap { |hash| (arr1 + arr2).each { |el| hash[el] = 1 } }.keys

I really love Object#tap, but we can accomplish the same result using Enumerable#reduce:

merged_arr = (arr1 + arr2).reduce({}) { |arr, val| arr[val] = 1; arr }.keys

You could even use Enumberable#map:

merged_arr = Hash[(arr1 + arr2).map { |val| [val, 1] }].keys

How I would do it in practice

Having said all that, if I were asked to merge two arrays arr1 and arr2 such that the result merged_arr has unique elements and could use any Ruby method at my disposal, I would simply use the set union operator which is intended for solving this exact problem:

merged_arr = arr1 | arr2

A quick peek at the source of Array#|, though, seems to confirm that using a hash as an intermediary seems to be the acceptable solution to performing a unique merge between 2 arrays.

share|improve this answer
    
"Don't use the standard library's array functions for detecting uniqueness (although other things form the standard library is okay)" –  Jan Dvorak Jan 2 at 11:55
    
How am I violating that rule in the second example? Memoization is being performed on a hash. Is that not allowed either? –  O-I Jan 2 at 13:54
add comment

Mathematica 10 Chars

Union[a,b]

Example:

a={1,2,3,4,5};
b={1,2,3,4,5,6};
Union[a,b]

{1, 2, 3, 4, 5, 6}

Mathematica2 43 Chars

Sort@Join[a, b] //. {a___, b_, b_, c___} :> {a, b, c}
share|improve this answer
4  
I think this would go in the category of using standard library array methods. –  troglodite Jan 2 at 4:02
    
Hi @cloudcoder2000. No need to call some specific library to use Union in Mathematica. –  Murta Jan 2 at 10:44
3  
In my opinion, using a builtin function to do exactly what the question is asking to do is cheating. –  xfix Jan 2 at 14:37
    
ok ok .. the second code do not use Union. –  Murta Jan 2 at 16:57
1  
I guess Tally[Join[a, b]][[;; , 1]] would also be cheating ;-) BTW you could save chars by using single-letter variables. –  Yves Klett Jan 2 at 19:22
show 1 more comment

PHP, 69/42 68/41 chars

Including the function declaration is 68 characters:

function m($a,$b){return array_keys(array_flip($a)+array_flip($b));}

Not including the function declaration is 41 characters:

array_keys(array_flip($a)+array_flip($b))
share|improve this answer
add comment

Javascript 86

Golfed version:

function m(a,b){var h={};return a.concat(b).filter(function(v){return h[v]?0:h[v]=1})}

Readable version:

function merge(a, b) {
  var hash = {};
  return a.concat(b).filter(function (val) {
    return hash[val] ? 0 : hash[val] = 1;
  });
}
share|improve this answer
1  
This ignores falsey values... m([1,0,0,0,0],[0,1,0]) returns [1]. –  George Reith Jan 3 at 12:30
1  
Change h[v]=v to h[v]=1. –  George Reith Jan 3 at 12:36
    
Well spotted @GeorgeReith! We went from 86 to 84 :) –  Bertrand Jan 3 at 12:42
    
It's still 86, I think you got confused because you removed 2 characters from the readable version not the golfed one. –  George Reith Jan 3 at 15:37
add comment

JavaScript 60

I'm using ES6 generator.
The following is testable using Google's Traceur REPL.

m=(i,j)=>{h={};return[for(x of i.concat(j))if(!h[x])h[x]=x]}
share|improve this answer
add comment

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.