Delete duplicates in a linked List [closed]

Question

I am reading Cracking the Code Interview and I am getting a bit confused at the bit using 'previous'. In this problem, the method deletes duplicate nodes in a Linked List.

My assumption is that head is used to scan through the linked list while previous is used to update the linked list head because previous points to the same linked list. Not sure if my assumption is correct. Could someone verify this please?

Perhaps someone could also provide a method of increasing the efficiency or space of this program? I would love your input!

public static void deleteDups1(LinkedListNode head) {
    Hashtable<Integer, Boolean> table = new Hashtable<Integer, Boolean>();
    LinkedListNode previous = null;
    while(head != null) {
        if (table.containsKey(head.value)) previous.next = head.next;
        else {
            table.put(head.value, true);
            previous = head;
        }
        head = head.next;
    }
}

This question is off-topic because it contains code lifted from a book rather than code you wrote yourself. — 200_success♦, Jan 24 '14 at 9:29

rolfl · Accepted Answer · 2014-01-24 07:32:42Z

You are right about head, it is used to move through the list. previous is a temporary pointer that is needed to perform the delete. You need it because the delete process means taking the next value of the Node to the left, and making it 'skip' the deleted Node and point over to the right of the deleted Node.

The list is linked by each Node having a next pointer. It would look something like:

A -> B -> A -> C -> D ->

Here the first A points to B, which points to a different A, which points to C, and so on.

You would call your method with the pointer to the first A as the argument head. There is no previous ....

So your system starts off looking like:

     A -> B -> A -> C -> D ->
^    ^
P    H

Where H is the Head pointer, and P is the previous.

We add the value A to the HashMap, and advance our pointers:

    A -> B -> A -> C -> D ->
    ^    ^
    P    H

B is not in the HashMap, so move on....

    A -> B -> A -> C -> D ->
         ^    ^
         P    H

A is in the HashMap, so we have a duplicate. This is where P comes in to play....

We make the B (pointed to by previous) point to C....

    A -> B  --->  C -> D ->
         ^      /
         |    A
         |    ^
         P    H

We do that by making the previous's next point to the head's next. We now have a kind of 'fork' in the list, with the second A dangling part way through....

We have now removed the A from the main link chain, and we need to advance the head pointer to C.... Sunce Nothing is pointing to the second A now, it can be garbage collected, etc... and the final result looks like:

    A -> B  --->  C -> D ->
         ^        ^
         P        H

Now, about your other questions.....

This operation is about as fast as it can get. Not much you can do to improve performance.

Additionally, the operation essentially uses no stoage (there is some stack amount, but this is peanuts in the scheme of things....

If this list is able to cover your use cases, then I would say "Great, use it".

The algorithm uses a Hastable to store all keys. So I'd say it's O(n) which is not exactly nothing (depends on the size of the list) — ChrisWue, Jan 24 '14 at 8:56
thank you for this thorough explanation! you reassured me and I also learn a lot! — Liondancer, Jan 24 '14 at 15:56

ChrisWue · Answer 2 · 2014-01-24 08:55:36Z

@rolfl already explained the algorithm very well, just a remark to the space consumption:

It uses a Hashtable which maps keys to values. However it actually doesn't use the values - it is only interested in the keys. So it should be changed to use a HashSet instead. This can be used to keep a unique collection of entities which is precisely what you want. Three advantages:

The usage of the structure is closer to a HashSet than a HasTable so the better fitting data structure should be used. It makes the code a clearer to read.
In theory it could save some memory as it does not need to store values (which are not needed anyway) associated with the key.

The code can be slightly optimized (only one key lookup instead of two):

public static void deleteDups1(LinkedListNode head) {
    HashSet<Integer> table = new HashSet<Integer>();
    LinkedListNode previous = null;
    while(head != null) {
        if (!table.add(head.value)) {
            // fail to add means it is already present
            previous.next = head.next;
        }
        else {
            previous = head;
        }
        head = head.next;
    }
}

As a side note (and because we are on CR): deleteDups1 is not a really good name. I know that sometimes numbers are used to indicate newer versions of the same method but I'd suggest to rename it deleteDuplicates if you can.

Thank you for these suggestions! i thought about using HashSet would be more appropriate as well but I was just following the book! I appreciate your input thank you! — Liondancer, Jan 24 '14 at 15:58

asked	1 year ago
viewed	140 times
active	1 year ago

current community

your communities

more stack exchange communities

Delete duplicates in a linked List [closed]

closed as off-topic by 200_success♦, palacsint, Lyle's Mug, konijn, Simon André Forsberg Jan 24 '14 at 17:27

2 Answers 2

Not the answer you're looking for? Browse other questions tagged java linked-list or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Delete duplicates in a linked List [closed]

closed as off-topic by 200_success♦, palacsint, Lyle's Mug, konijn, Simon André Forsberg Jan 24 '14 at 17:27

2 Answers 2

Not the answer you're looking for? Browse other questions tagged java linked-list or ask your own question.

Related

Hot Network Questions