Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

In an effort to understand how compilers work, I wrote a simple expression calculator in C#. A CalculatorExpression takes an infix string, converts the infix string to postfix, and finally takes the postfix to an internal BinaryExpression tree representation.

To use, simply create an expression, and then evaluate:

var exp = new CalculatorExpression("1 + (1 + 2*3)"); 
double val = exp.Value;

I'm interested in having my use of access modifiers and general code structure, as well as readability (do I have too many comments/documentation?), reviewed.

First, here is the core code for the tree representation:

/// <summary>
/// Represents a binary operation between two doubles 
/// </summary>
internal delegate double Operator(double x, double y);

/// <summary>
/// Represents a node in BinaryExpression tree
/// </summary>
internal class BinaryExpression
{
    protected BinaryExpression()
    {
    }

    /// <summary>
    /// Constructs a BinaryExpression tree from left and right subtrees
    /// to be combined with an operator
    /// </summary>
    public BinaryExpression(BinaryExpression left, BinaryExpression right,
        Operator op)
    {
        Left = left; Right = right; Operator = op;
    }

    /// <summary>
    /// Returns the value of the tree 
    /// </summary>
    public virtual double Value
    {
        get
        {
            return Operator(Left.Value, Right.Value);
        }
        protected set { } // only child classes (i.e. BinaryAtomic) should be able to set Value
    }

    public BinaryExpression Left;
    public BinaryExpression Right;
    public Operator Operator;
}

/// <summary>
/// Represents a leaf in BinaryExpression tree
/// </summary>
internal class BinaryAtomic : BinaryExpression
{
    protected BinaryAtomic()
    {
    }

    /// <summary>
    /// Constructs a leaf 
    /// </summary>
    public BinaryAtomic(double value)
    {
        Value = value;
    }

    /// <summary>
    /// Returns the leaf value
    /// </summary>
    public override double Value
    {
        get;
        protected set;
    }

    public override string ToString()
    {
        return Value.ToString();
    }
}

And here is the CalculatorExpression class:

/// <summary>
/// Represents an infix mathematical expression involving +, -, *, /, ()
/// </summary>
public class CalculatorExpression
{
    /// <summary>
    /// Constructs an expression tree for infix input
    /// </summary>
    public CalculatorExpression(string input) 
    {
        var nodes = new Stack<BinaryExpression>();
        foreach (var c in ToPostFix(input))
        {
            var s = c.ToString();
            double n;
            if (double.TryParse(s, out n))
            {
                nodes.Push(new BinaryAtomic(n));
            }
            else if (IsOperator(s))
            {
                var r = nodes.Pop();
                var l = nodes.Pop();
                nodes.Push(new BinaryExpression(l, r, Operators[s]));
            }
        }
        System.Diagnostics.Debug.Assert(nodes.Count == 1);
        tree = nodes.Pop();
    }

    /// <summary>
    /// Returns the value of the expression
    /// </summary>
    public double Evaluate
    {
        get { return tree.Value; }
    }

    private BinaryExpression tree;

    /// <summary>
    /// Helper to generate postfix notation. In the constructor, 
    /// the input is first converted to postfix; the postfix is then
    /// used to create a BinaryExpression tree
    /// </summary>
    private static string ToPostFix(string input)
    {
        var postfix = new StringBuilder();

        // A stack is used to hold the operators 
        // because we don't know when we've reached the 
        // end (right operand) of an expression 
        var ops = new Stack<string>();
        foreach (var c in input)
        {
            // When we see an operator, we can pop anything
            // with higher precedence onto the infix. 
            // We do the operations with higher priority first
            var s = c.ToString();
            if (IsOperator(s))
            {
                while (ops.Count > 0 && Precedence(ops.Peek()) >= Precedence(s))
                    postfix.Append(ops.Pop());
                ops.Push(s);
            }
            else
            {
                // When we see an open parenthesis, 
                // we push the paren onto the stack and wait until we
                // see a closing parenthesis. Then we know
                // that the parenthesized expression is complete.
                // While we haven't seen that first opening paren, everything on 
                // the operator stack is popped (part of the inner expression);
                // precedence will be taken care of for us by virtue of the above if-statement
                switch (s)
                {
                    case "(":
                        ops.Push(s);
                        break;

                    case ")":
                        var top = ops.Pop();
                        while (top != "(")
                        {
                            postfix.Append(top);
                            top = ops.Pop();
                        }
                        break;
                    default:
                        postfix.Append(s); // operands always go onto the infix 
                        break;
                }
            }
        }

        System.Diagnostics.Debug.Assert(!ops.Any(x => x == "("));
        while (ops.Count > 0)
            postfix.Append(ops.Pop());
        return postfix.ToString();
    }

    /// <summary>
    /// The only operations currently supported are 
    /// -, +, *, /
    /// </summary>
    private static bool IsOperator(string op)
    {
        return op == "-" || op == "+" ||
               op == "*" || op == "/";
    }

    /// <summary>
    /// Multiplication and division have 
    /// higher precedence than +/-
    /// </summary>
    private static int Precedence(string op)
    {
        if (op == "-" || op == "+")
            return 1;
        if (op == "*" || op == "/")
            return 2;
        return 0;
    }

    private static Dictionary<string, Operator> Operators = new Dictionary<string, Operator>()
    {
        {"+", (x, y) => x + y}, 
        {"-", (x, y) => x - y}, 
        {"*", (x, y) => x * y}, 
        {"/", (x, y) => x / y}, 
    };
}

Correcting for more than single character numbers as input

Thanks to RobH for identifying the issue. Rather than having postfix be of type StringBuilder, we should change to List<string> and maintain a reference to the last character examined (I defined char? last). Inside the switch, we can then change the default as follows:

default:
    if (last.HasValue && char.IsNumber(c) && char.IsNumber(last.Value))
        postfix[postfix.Count - 1] += s;
    else
        postfix.Add(s);
    break;
share|improve this question
2  
If you're interested in understanding compilers, you have to see this! – Mat's Mug Jun 18 '15 at 19:50
    
You haven't named it - but that looks a lot like the shunting yard algorithm for going infix to postfix. Is that what you used intentionally or did you come up with it yourself? – RobH Jun 19 '15 at 7:25
    
@RobH: I used this! – rookie Jun 19 '15 at 11:19
    
@RobH: One thing I just noticed -- the algorithm assumes that parenthesized expressions contain only an operator and an operand (we discard the notion of operator precedence and just pop everything until we see an opening brace). Is this okay? – rookie Jun 19 '15 at 11:31
1  
After the opening brace you continue to push to the stack in the right order so when you pop them back off, the correct precedence is maintained. – RobH Jun 19 '15 at 12:22
up vote 7 down vote accepted

Your code has a pretty massive flaw - it only accepts single digit numbers.

Trying to use new CalculatorExpression("10+1") falls over (specifically your assertion fails). A calculator that can only handle single digit numbers isn't really that useful.

I'd recommend you changing the signature of your ToPostFix method to return an IEnumerable<string> where each string is a token. You could actually create a token class if you want.

So you might have

ToPostFix("10+1");
// ["10", "1", "+"]

Then your parsing code can consume each logical bit at a time.

As I mentioned in my comment, I'm fairly confident this is an implementation of the shunting yard algorith which you may find interesting to read up about.

I never seem to have enough time, there's a lot I'd like to say but I'll limit myself in the interest of at least saying something:

private static bool IsOperator(string op)
{
    return op == "-" || op == "+" ||
           op == "*" || op == "/";
}

Could be rewritten as:

private static bool IsOperator(string op)
{
    return Operators.ContainsKey(op);
}

I don't like the name BinaryAtomic for two reasons:

  1. It's not actaully a binary expression
  2. This doesn't relate to atomicity

A better heirarchy to use would be:

abstract class Expression
{
    // Some factory methods?

    abstract double GetValue();
}

class BinaryExpression : Expression
{
    Operator Operator { get; set; }
    Expression Left { get; set; }
    Expression Right { get; set; }

    override double GetValue() 
    {
        return Operator(Left.GetValue(), Right.GetValue());
    }
}

class NumberExpression : Expression 
{
    double Value { get; set; }

    override double GetValue() 
    {
        return Value;
    }
}

... Time's up, hopefully I'll get a chance to write some more later.

I should point out that your probably don't want to use Expression as the name as it will walk all over the System.Linq.Expressions namespace. I couldn't think of anything better quickly.

share|improve this answer
    
Single digits! I knew the algorithm would have trouble with multiple character operators, but I obviously missed this. – rookie Jun 19 '15 at 13:55
    
When you get time, can you comment on why you like abstract class here more than an interface? – rookie Jun 19 '15 at 14:08
    
@rookie - at this point either is fine. I imagined that later on there might be some things that you could provide an implementation on in the base class but as the code stands, an interface would be equally good. That said, there might be some factory methods like Add, Subtract etc. which could happily live on Expression. – RobH Jun 19 '15 at 14:20
    
I think I fixed the issue you identified. Please see my edits at the end of the post. – rookie Jun 19 '15 at 15:01

Naming:

private BinaryExpression tree;
private readonly BinaryExpression _tree;

Linq:

!ops.Any(x => x == "(")
ops.All(x => x != "(")

Numbers

accepts single digit numbers only

ToPostFix does not deal with unary operators

(x-(-y)-z)

Should be x ~y - z - not x y - - z -, where ~ is unary - (it fails with error).


Robert Sedgewick has a decent book how this is done. Also Java examples are available:

share|improve this answer
    
The code you've linked doesn't take into account precedence though, no? – rookie Jun 19 '15 at 13:58
    
Code example I gave is trivial. – Margus Jun 19 '15 at 14:06

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.