I solved this problem from LeetCode:

Given a pattern and a string str, find if str follows the same pattern.

Here follow means a full match, such that there is a bijection between a letter in pattern and a non-empty word in str.

Examples:

pattern = "abba", str = "dog cat cat dog" should return true.

pattern = "abba", str = "dog cat cat fish" should return false.

pattern = "aaaa", str = "dog cat cat dog" should return false.

pattern = "abba", str = "dog dog dog dog" should return false.

Notes:

You may assume pattern contains only lowercase letters, and str contains lowercase letters separated by a single space.

I was able to solve this problem by using a Dictionary to keep track of the relation between the pattern and the str:

using System.Collections.Generic;
using System.Linq;

public class Solution
{
    public bool WordPattern(string pattern, string str)
    {
        var result = str.Split(' ').ToList();
        var mapPattern = new Dictionary<char, string>();
        if (result.Count != pattern.Count())
        {
            return false;
        }
        string matchstr;
        int index = 0;
        foreach (var c in pattern)
        {
            if (mapPattern.TryGetValue(pattern[index], out matchstr))
            {
                if (matchstr != result[index])
                {
                    return false;
                }
            }
            else 
            {
                if(mapPattern.ContainsValue(result[index]))
                {
                    return false;
                }
                mapPattern.Add(c, result[index]);
            }
            ++index;
        }
        return true;
    }
}

My current approach queries the Dictionary twice per character, once to see whether the letter already exists as a key and once to see whether the word already exists as a value, which I believe slows it down significantly.

Can you determine if I can do away with looking up twice? This solution only runs better than around 36% of the other submissions for this current problem. So, I believe there must be a better and faster way.


Overall, this is a nice, clean solution, congratulations!

Performance

Can you suggest whether I can do away with looking up twice? Currently this solution only runs better than around 36% of the other submissions for this current problem. So, I believe there must be a better and faster way.

"Looking up twice" is not the appropriate term. You do one lookup by key, and another lookup by value. These are very different operations, and lumping them together as "two lookups" hides important details. Imagine a program that does one lookup in an array and one lookup with a Google search on the internet, and whose author wonders whether the slowness is caused by doing "two lookups". That framing hides the crucial detail that one of the "lookups" is clearly much slower than the other.

The lookup in the dictionary by key is very fast, an O(1) operation on average.

The lookup by value in a typical dictionary (or hash map) implementation is much slower, O(n), because dictionaries are indexed by key, not by value: ContainsValue has to scan the entries one by one.

To make the lookup by value faster, you can add a set for values.
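
A minimal sketch of that idea, keeping the names from the question and assuming a second collection called usedWords (a made-up name, not part of the original code) that mirrors the dictionary's values. HashSet<T>.Add returns false when the item is already present, so it checks and records the word in one step:

```csharp
using System.Collections.Generic;
using System.Linq;

public class Solution
{
    public bool WordPattern(string pattern, string str)
    {
        var result = str.Split(' ').ToList();
        if (result.Count != pattern.Length)
        {
            return false;
        }

        var mapPattern = new Dictionary<char, string>();
        // usedWords is an illustrative name: it holds every word that is
        // already a value in mapPattern, so membership tests are hashed.
        var usedWords = new HashSet<string>();

        for (int index = 0; index < pattern.Length; index++)
        {
            string matchstr;
            if (mapPattern.TryGetValue(pattern[index], out matchstr))
            {
                if (matchstr != result[index])
                {
                    return false;
                }
            }
            else
            {
                // Add returns false if the word already belongs to another letter.
                if (!usedWords.Add(result[index]))
                {
                    return false;
                }
                mapPattern.Add(pattern[index], result[index]);
            }
        }
        return true;
    }
}
```

Whether this is measurably faster on LeetCode's inputs is not guaranteed; with only 26 possible keys the linear scan is short to begin with.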

Scope

It's good to limit the scope of variables to the minimum needed.

  • matchstr should be declared inside the loop, as it is not needed outside
  • mapPattern should be initialized after the early return, as it might not be needed
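
Both points can be sketched together. This is an illustrative rearrangement of the question's method, not the asker's code: the dictionary is created only after the length check, and matchstr lives inside the loop body. The class name ScopedSolution is invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public class ScopedSolution
{
    public bool WordPattern(string pattern, string str)
    {
        var result = str.Split(' ').ToList();
        if (result.Count != pattern.Length)
        {
            return false; // the mapping has not been allocated yet
        }

        // Created only once we know the loop will actually run.
        var mapPattern = new Dictionary<char, string>();
        for (int index = 0; index < pattern.Length; index++)
        {
            string matchstr; // scoped to a single iteration
            if (mapPattern.TryGetValue(pattern[index], out matchstr))
            {
                if (matchstr != result[index])
                {
                    return false;
                }
            }
            else
            {
                if (mapPattern.ContainsValue(result[index]))
                {
                    return false;
                }
                mapPattern.Add(pattern[index], result[index]);
            }
        }
        return true;
    }
}
```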
    
Thanks for the tip. My initial solution had two Dictionaries, like you suggested, but it executed a bit slower than this one. I'll try that one again as well. Also, LeetCode run times can be unpredictable at times. – thebenman 6 hours ago

Just a few small issues, the main thing (using a HashSet to keep track of the values) has already been covered.


var result = str.Split(' ').ToList();

You don't need this to be a List - it's fine as string[].

result is a bad name. May I suggest words or tokens?


var mapPattern = new Dictionary<char, string>();
if (result.Count != pattern.Count())

You should check for the early return before creating your dictionary. With result renamed to words and as an array, the if would be better as:

if (words.Length != pattern.Length)

string matchstr;

word or token would be a better name, depending on what you call the result of string.Split().


if (mapPattern.TryGetValue(pattern[index], out matchstr))
{
    if (matchstr != result[index])
    {

You can combine these:

if (mapPattern.TryGetValue(pattern[index], out word) 
    && word != words[index])
{
    return false;
}

which saves you some indentation.
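
One caveat when applying this: once the two conditions are merged, an else attached to the combined if would also run when the letter is already mapped to the matching word, and the original else branch would then return false because ContainsValue finds that very word. A hedged sketch of one way to keep the flatter shape is to test for the missing key first (the class name FlatSolution is invented for illustration):

```csharp
using System.Collections.Generic;

public class FlatSolution
{
    public bool WordPattern(string pattern, string str)
    {
        string[] words = str.Split(' ');
        if (words.Length != pattern.Length)
        {
            return false;
        }

        var mapPattern = new Dictionary<char, string>();
        for (int index = 0; index < pattern.Length; index++)
        {
            string word;
            if (!mapPattern.TryGetValue(pattern[index], out word))
            {
                // New letter: the word must not already belong to another letter.
                if (mapPattern.ContainsValue(words[index]))
                {
                    return false;
                }
                mapPattern.Add(pattern[index], words[index]);
            }
            else if (word != words[index])
            {
                // Known letter, but it was mapped to a different word.
                return false;
            }
        }
        return true;
    }
}
```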


Overall, it's a good approach IMO. I submitted a similar approach with an extra HashSet and it first ranked faster than only ~15% of submissions; I then submitted the same code again and it ranked faster than ~85%, so I think it's just pot luck.


Not much, but you could replace the foreach and the separate counter with an indexed for loop:

for (int index = 0; index < pattern.Length; index++)
{

Use a HashSet to store the values already in use.

A related approach:

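// Requires: using System.Collections.Generic; using System.Linq; (for SequenceEqual)
// Encodes each input as the sequence of first-occurrence indices of its items
// ("abba" and "dog cat cat dog" both become 0 1 1 0) and compares the sequences.
public static bool PatternMatch(string pattern, string match)
{
    Dictionary<char, int> dlPattern = new Dictionary<char, int>();
    Dictionary<string, int> dlMatch = new Dictionary<string, int>();
    List<int> lPattern = new List<int>();
    List<int> lMatch = new List<int>();
    int index = 0;
    int indexOut;
    foreach (char p in pattern)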
    {
        if (dlPattern.TryGetValue(p, out indexOut))
        {
            lPattern.Add(indexOut);
        }
        else
        {
            dlPattern.Add(p, index);
            lPattern.Add(index);
            index++;
        }
    }
    index = 0;
    foreach (string s in match.Split(' '))
    {
        if (dlMatch.TryGetValue(s, out indexOut))
        {
            lMatch.Add(indexOut);
        }
        else
        {
            dlMatch.Add(s, index);
            lMatch.Add(index);
            index++;
        }
    }
    return lPattern.SequenceEqual(lMatch);
}
    
-1 for the Hungarian notation and lack of reasoning about how that approach is different and why it might be preferred. – RobH 2 hours ago
    
@RobH Sorry the difference and why it might be preferred is not apparent to you. – Paparazzi 2 mins ago
