
I'm outputting a list of purchases, and I want to automatically highlight the presence of duplicate orders.

Here's what the array looks like. The first two orders are duplicates placed by mistake. You'll notice that the orderid of each is different, while the email and userid are the same. So the duplicate check needs to match on email and/or userid, but not on orderid.

array
  0 =>
    array
      'orderid' => string '2009091008261662'
      'email' => string '[email protected]'
      'userid' => string '53'
  1 =>
    array
      'orderid' => string '2009091008261048'
      'email' => string '[email protected]'
      'userid' => string '53'
  2 =>
    array
      'orderid' => string '2009091008262025'
      'email' => string '[email protected]'
      'userid' => string '103'
  3 =>
    array
      'orderid' => string '2009091008272082'
      'email' => string '[email protected]'
      'userid' => string '392'

How can I search for duplicate orders from the same person in a given array, in PHP?

I would like to output the above like so:

(pretend it's in a table)

2009091008261662 - [email protected] - 53

2009091008261048 - [email protected] - 53

2009091008262025 - [email protected] - 103

2009091008272082 - [email protected] - 392

... so basically just highlight the two (or more) duplicates.


6 Answers

vote up 0 vote down

Your best bet would be to essentially "invert" the array into an associative one that maps values to keys of the original array:

$emails = array();
$userids = array();

foreach($inputarray as $key => $item) {
    if( isset($emails[$item['email']]) || isset($userids[$item['userid']]) ) {
        // This item has the same email or userid as an item already looked at.
        // $emails[$item['email']] or $userids[$item['userid']] holds the key of the original location where it was first seen.
        // $key holds the key of the duplicate we just found.
    } else {
        $emails[$item['email']] = $key;
        $userids[$item['userid']] = $key;
    }
}
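
A fuller sketch of the same idea, in case it helps. The function name `findDuplicateKeys` and the sample email addresses are made up for illustration; they are not from the question. Unlike the loop above, this version checks email and userid separately, so a value is still recorded when the row was already flagged through the other field.

```php
<?php
// Hypothetical sample data; the email addresses are placeholders.
$orders = array(
    array('orderid' => '2009091008261662', 'email' => 'alice@example.com', 'userid' => '53'),
    array('orderid' => '2009091008261048', 'email' => 'alice@example.com', 'userid' => '53'),
    array('orderid' => '2009091008262025', 'email' => 'bob@example.com',   'userid' => '103'),
    array('orderid' => '2009091008272082', 'email' => 'carol@example.com', 'userid' => '392'),
);

// Hypothetical helper: returns the keys of every order that shares an
// email or a userid with another order in the list.
function findDuplicateKeys(array $orders) {
    $emails = array();     // email  => key where it was first seen
    $userids = array();    // userid => key where it was first seen
    $duplicates = array(); // key    => true, for every order in a duplicate group

    foreach ($orders as $key => $item) {
        if (isset($emails[$item['email']])) {
            $duplicates[$emails[$item['email']]] = true; // the earlier order
            $duplicates[$key] = true;                    // the one just found
        } else {
            $emails[$item['email']] = $key;
        }

        if (isset($userids[$item['userid']])) {
            $duplicates[$userids[$item['userid']]] = true;
            $duplicates[$key] = true;
        } else {
            $userids[$item['userid']] = $key;
        }
    }

    $keys = array_keys($duplicates);
    sort($keys);
    return $keys;
}

echo implode(', ', findDuplicateKeys($orders)), "\n";
```

The returned keys can then be used to decide which rows to highlight when printing the table.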
vote up 0 vote down

Assumes uniqueness based on userid value

<?php

$orders = array(
  array(
    'orderid' => '2009091008261662',
    'email' => '[email protected]',
    'userid' => '53'
  ),
  array(
    'orderid' => '2009091008261048',
    'email' => '[email protected]',
    'userid' => '53'
  ),
  array(
    'orderid' => '2009091008262025',
    'email' => '[email protected]',
    'userid' => '103'
  ),
  array(
    'orderid' => '2009091008272082',
    'email' => '[email protected]',
    'userid' => '392'
  ),
  array(
    'orderid' => '2009091008265555',
    'email' => '[email protected]',
    'userid' => '53'
  )
);

$foundIds = array();
foreach ( $orders as $index => $order )
{
  if ( isset( $foundIds[$order['userid']] ) )
  {
    $orders[$index]['is_dupe'] = true;
    $orders[$foundIds[$order['userid']]]['is_dupe'] = true;
  } else {
    $orders[$index]['is_dupe'] = false;
  }
  $foundIds[$order['userid']] = $index;
}
?>

<style type="text/css">
tr.dupe td {
  font-weight: bold;
}
</style>

<table>
  <tr><th>orderid</th><th>email</th><th>userid</th></tr>
  <?php foreach ( $orders as $order ) { ?>
  <tr class="<?php echo $order['is_dupe'] ? 'dupe' : '' ?>">
    <td><?php echo $order['orderid']; ?></td>
    <td><?php echo $order['email']; ?></td>
    <td><?php echo $order['userid']; ?></td>
  </tr>
  <?php } ?>
</table>
haha funny, I came up with the same basic concept. – Dooltaz Sep 11 '09 at 23:39
vote up 0 vote down

You could add a hash to each inner array that represents the fields you want to compare (for example email and userid, but not orderid). Then just loop through and compare the hashes.
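
A minimal sketch of how that might look, assuming the hash covers only the fields that identify the buyer. The helper name `orderHash`, the choice of `md5()`, and the sample addresses are assumptions for the example. Note that this matches on email and userid together; matching on either one alone would need a separate lookup per field.

```php
<?php
// Hypothetical sample data; the email addresses are placeholders.
$orders = array(
    array('orderid' => '2009091008261662', 'email' => 'alice@example.com', 'userid' => '53'),
    array('orderid' => '2009091008261048', 'email' => 'alice@example.com', 'userid' => '53'),
    array('orderid' => '2009091008262025', 'email' => 'bob@example.com',   'userid' => '103'),
    array('orderid' => '2009091008272082', 'email' => 'carol@example.com', 'userid' => '392'),
);

// Hypothetical helper: fingerprint the buyer, deliberately leaving out orderid.
function orderHash(array $order) {
    return md5(strtolower($order['email']) . '|' . $order['userid']);
}

// Group the order indexes by hash.
$seen = array(); // hash => list of indexes that produced it
foreach ($orders as $index => $order) {
    $seen[orderHash($order)][] = $index;
}

// Any hash that occurs more than once marks a group of duplicates.
$duplicateIndexes = array();
foreach ($seen as $indexes) {
    if (count($indexes) > 1) {
        $duplicateIndexes = array_merge($duplicateIndexes, $indexes);
    }
}

echo implode(', ', $duplicateIndexes), "\n";
```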

vote up 0 vote down

This code works...

$array1[0]['orderid'] = '2009091008261662';
$array1[0]['email'] = '[email protected]';
$array1[0]['userid'] = '53';
$array1[1]['orderid'] = '2009091008261662';
$array1[1]['email'] = '[email protected]';
$array1[1]['userid'] = '53';
$array1[2]['orderid'] = '2009091008261662';
$array1[2]['email'] = '[email protected]';
$array1[2]['userid'] = '53';
$array1[3]['orderid'] = '209091008261662';
$array1[3]['email'] = '[email protected]';
$array1[3]['userid'] = '53';
$array1[4]['orderid'] = '2001008261662';
$array1[4]['email'] = '[email protected]';
$array1[4]['userid'] = '53';
$array1[5]['orderid'] = '20013344008261662';
$array1[5]['email'] = '[email protected]';
$array1[5]['userid'] = '53';
$array1[6]['orderid'] = '200133352008261662';
$array1[6]['email'] = '[email protected]';
$array1[6]['userid'] = '53';


$unique_array = array(); // Filtered array with no dupes
$email_array = array(); // Hash list
$order_array = array(); // Hash list
foreach($array1 as $i => $row) {

 if (array_key_exists($row['email'], $email_array)) {
  // This is a dupe based on email
  $array1[$i]['duplicate'] = 1;
  $array1[$email_array[$row['email']]]['duplicate'] = 1;
 }

 if (array_key_exists($row['orderid'], $order_array)) {
  // This is a dupe based on orderid
  $array1[$i]['duplicate'] = 1;
  $array1[$order_array[$row['orderid']]]['duplicate'] = 1;
 }
 $order_array[$row['orderid']] = $i;
 $email_array[$row['email']] = $i;
}
foreach($array1 as $i => $row) {
 if (!empty($row['duplicate'])) {
  echo "<b>" . $row['orderid'] . $row['email'] . "</b>\n";
  unset($array1[$i]['duplicate']); // reset the original array ($row is only a copy)
 } else {
  echo $row['orderid'] . $row['email'] . "\n";
 }
}
vote up 0 vote down

You'll need two passes over the orders array. But it's really simpler than some have made it out to be:

$duplicateUserId = array();

// Mark user IDs with more than one order
foreach ( $orders as $order ) {
    $duplicateUserId[$order['userid']] = isset($duplicateUserId[$order['userid']]);
}

// Output each order
foreach ( $orders as $order ) {
    echo formatOrder($order, $duplicateUserId[$order['userid']]);
}

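// Format the output of each order
function formatOrder($order, $isDuplicated) {
    // Illustrative body only; adapt the markup to your own table.
    $line = $order['orderid'] . ' - ' . $order['email'] . ' - ' . $order['userid'];
    return ($isDuplicated ? '<b>' . $line . '</b>' : $line) . "<br />\n";
}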

Assuming that $orders looks like

$orders = array(
  array(
    'orderid' => '2009091008261662',
    'email' => '[email protected]',
    'userid' => '53'
  ),
  array(
    'orderid' => '2009091008261048',
    'email' => '[email protected]',
    'userid' => '53'
  ),
  array(
    'orderid' => '2009091008262025',
    'email' => '[email protected]',
    'userid' => '103'
  ),
  array(
    'orderid' => '2009091008272082',
    'email' => '[email protected]',
    'userid' => '392'
  ),
  array(
    'orderid' => '2009091008265555',
    'email' => '[email protected]',
    'userid' => '53'
  )
);

Also, it might be best to match only on userid since, presumably, users can change their emails and emails are unique to a single user.
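
For what it's worth, the counting pass can also be written with built-in functions. This is only a sketch of an alternative, not the loop shown above: it assumes PHP 5.5 or later for `array_column()`, and `$counts` and `$flags` are names made up for the example, as are the email addresses.

```php
<?php
// Hypothetical sample data; the email addresses are placeholders.
$orders = array(
    array('orderid' => '2009091008261662', 'email' => 'alice@example.com', 'userid' => '53'),
    array('orderid' => '2009091008261048', 'email' => 'alice@example.com', 'userid' => '53'),
    array('orderid' => '2009091008262025', 'email' => 'bob@example.com',   'userid' => '103'),
    array('orderid' => '2009091008272082', 'email' => 'carol@example.com', 'userid' => '392'),
    array('orderid' => '2009091008265555', 'email' => 'alice@example.com', 'userid' => '53'),
);

// First pass: count how many orders each userid has.
$counts = array_count_values(array_column($orders, 'userid'));

// Second pass: an order is a duplicate when its userid occurs more than once.
$flags = array();
foreach ($orders as $order) {
    $flags[] = $counts[$order['userid']] > 1;
}

foreach ($orders as $i => $order) {
    echo $order['orderid'], $flags[$i] ? ' (duplicate)' : '', "\n";
}
```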

vote up -1 vote down

Simple answer:

function hasDuplicate($arr,$email) {
  $count = 0;
  foreach ($arr as $row) {
     if ($row['email'] == $email) {
       $count++;
     }
  }
  return ($count >1);
}
-1 for O(n^2) when it could just be O(2n) -> O(n). Also, you could just return true as soon as $count is not 0 instead of iterating over the whole array. – Justin Johnson Sep 11 '09 at 23:47
I said it was simple, not efficient. Beware of premature optimization. And yes, you could return true as soon as $count is > 1. But not 0, as there will always be 1 match. – Craig Sep 12 '09 at 2:40
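
The early exit discussed in these comments could look roughly like this. The name `hasDuplicateEarlyExit` and the sample rows are hypothetical. It stops at the second match because, as noted, the row being checked always matches itself once. Calling it once per row is still quadratic in the worst case; it just stops sooner when a duplicate exists.

```php
<?php
// Hypothetical sample rows; the email addresses are placeholders.
$rows = array(
    array('orderid' => '2009091008261662', 'email' => 'alice@example.com', 'userid' => '53'),
    array('orderid' => '2009091008261048', 'email' => 'alice@example.com', 'userid' => '53'),
    array('orderid' => '2009091008262025', 'email' => 'bob@example.com',   'userid' => '103'),
);

// Variation of hasDuplicate() that returns as soon as a second match is found.
function hasDuplicateEarlyExit(array $arr, $email) {
    $count = 0;
    foreach ($arr as $row) {
        // The first match is the order itself; the second one proves a duplicate.
        if ($row['email'] === $email && ++$count > 1) {
            return true;
        }
    }
    return false;
}

var_dump(hasDuplicateEarlyExit($rows, 'alice@example.com'));
var_dump(hasDuplicateEarlyExit($rows, 'bob@example.com'));
```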
