Building Expression Evaluator with Expression Trees in C# – Part 1

Introduction

This is first part of a series about writing expression evaluator in C#. Even though there are several of them available already I decided to write my own as this one is based on expression trees and is very simple. The evaluator supports numeric expressions as well as expression with parameters.

In this part we will build a simple expression evaluator which will support expressions like this: “2+2″, “2.7+3.2″, “1.7+2.9+14.24+6.58″, “84+15+4-4*3*9+24+4-54/3-5-7+47″, “25/3+1.34*2.56+1.49+2.36/1.48″, etc. It will however not be able to parse expression with parentheses or with variables. These features will be implemented in the coming posts.

Getting Started

As the evaluator is based on expression trees it you should have a basic understanding of how they work. I will not cover them here as there are many good articles available out there.

The first thing we need to do is to make the following to tests pass:

    [TestFixture]
    class ExpressionEvaluatorTests
    {
        private BasicExpressionEvaluator evaluator;
        private Random generator;

        [TestFixtureSetUp]
        public void SetUp()
        {
            evaluator = new BasicExpressionEvaluator();
            generator = new Random();
        }

        [Test]
        public void Empty_String_Is_Zero()
        {
            Assert.That(evaluator.Evaluate("  "), Is.EqualTo(0));
        }

        [Test]
        public void Decimal_Is_Treated_As_Decimal()
        {
            var left = generator.Next(1, 100);
            var right = generator.Next(1, 100);
            decimal input = left + right / 100m;

            Assert.That(evaluator.Evaluate(input.ToString()), Is.EqualTo(input));
        }
    }

The tests are very simple and the necessary code to make them pass is quite straightforward too:

    public class BasicExpressionEvaluator
    {
        public decimal Evaluate(string expression)
        {
            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }
    }

Let’s add support for adding to decimal numbers. Here is the test we need to pass:

        [Test]
        public void Can_Add_Two_Decimal_Numbers()
        {
            Assert.That(evaluator.Evaluate("2.7+3.2"), Is.EqualTo(2.7m + 3.2m));
        }

Implementation is again quite easy. All we have to do it parse the operands and construct corresponding Expression by calling Expression.Add Method.

        public decimal Evaluate(string expression)
        {
            if (expression.Contains('+'))
            {
                var parts = expression.Split('+');
                Expression sum = Expression.Add(
                       Expression.Constant(decimal.Parse(parts[0])),
                       Expression.Constant(decimal.Parse(parts[1])));
                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }

We can also extend it and implement adding multiple numbers:

        public decimal Evaluate(string expression)
        {
            if (expression.Contains('+'))
            {
                var parts = expression.Split('+');

                Expression sum = Expression.Constant(decimal.Parse(parts[0]));

                for (int i = 1; i < parts.Length; i++)
                {
                    sum = Expression.Add(sum, Expression.Constant(decimal.Parse(parts[i])));
                }

                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }

We can take advantage of Enumerable.Aggregate method and refactor the above snippet like this:

        public decimal Evaluate(string expression)
        {
            if (expression.Contains('+'))
            {
                var parts = expression.Split('+');

                var sum = parts.Aggregate(Expression.Constant(0m) as Expression,
                                     (current, next) =>
                                     Expression.Add(current, Expression.Constant(decimal.Parse(next))));

                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }

Next step is to support subtraction. First we will process expressions like “5-2″ and “15.2-2.3-4.8-0.58″. As with additions we will split the string and subtract the terms:

        public decimal Evaluate(string expression)
        {
            if (expression.Contains('+'))
            {
                var parts = expression.Split('+');

                var sum = parts.Aggregate(Expression.Constant(0m) as Expression,
                                     (current, next) => Expression.Add(current, Expression.Constant(decimal.Parse(next))));

                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            if (expression.Contains('-'))
            {
                var parts = expression.Split('-');

                Expression sum = Expression.Constant(decimal.Parse(parts[0]));

                for (int i = 1; i < parts.Length; i++)
                {
                    sum = Expression.Subtract(sum, Expression.Constant(decimal.Parse(parts[i])));
                }

                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }

We can refactor it again by using the Aggregate method again. However with subtraction we cannot start subtracting from zero as we did with addition. Instead we have to start subtracting from the first term:

        public decimal Evaluate(string expression)
        {
            if (expression.Contains('+'))
            {
                var parts = expression.Split('+');

                var sum = parts.Aggregate(Expression.Constant(0m) as Expression,
                                     (current, next) => Expression.Add(current, Expression.Constant(decimal.Parse(next))));

                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            if (expression.Contains('-'))
            {
                var parts = expression.Split('-');
                Expression sum = Expression.Constant(decimal.Parse(parts[0]));

                sum = parts.Skip(1).Aggregate(sum, (current, next) =>
                                 Expression.Subtract(current, Expression.Constant(decimal.Parse(next))));

                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }

Now, let’s mix additions and subtractions in one expression. We want the following test to pass:

        [Test]
        public void Can_Add_And_Subtract_Multiple_Numbers()
        {
            Assert.That(evaluator.Evaluate("15+8-4-2+7"),
                       Is.EqualTo(15 + 8 - 4 - 2 + 7));

            Assert.That(evaluator.Evaluate("17.89-2.47+7.16"),
                       Is.EqualTo(17.89m - 2.47m + 7.16m));
        }

Currently the test fails with FormatException being thrown as we are trying to parse strings. To avoid the exception and pass the test instead of parsing the terms we will recursively evaluate the term and parse the result. This is how it looks:

        public decimal Evaluate(string expression)
        {
            if (expression.Contains('+'))
            {
                var parts = expression.Split('+');

                var sum = parts.Aggregate(Expression.Constant(0m) as Expression,
                                     (current, next) =>
                                    Expression.Add(current,
                                    Expression.Constant(Evaluate(next))));

                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            if (expression.Contains('-'))
            {
                var parts = expression.Split('-');
                Expression sum = Expression.Constant(decimal.Parse(parts[0]));

                sum = parts.Skip(1).Aggregate(sum, (current, next) =>
                         Expression.Subtract(current, Expression.Constant(decimal.Parse(next))));

                var lambda = Expression.Lambda<Func<decimal>>(sum);
                var compiled = lambda.Compile();
                return compiled();
            }

            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }

Notice that the only change is in the part of the code which processes additions.

Adding multiplication and division

Now we are ready to add support for multiplication and division. Evaluating expressions which contain only multiplication or division is easy so I will jump to expression which contain all operations. As we did with addition and subtraction we need to recursively evaluate the operands before adding or subtracting them. Also, as multiplication and division have higher priority than addition and subtraction we first have to split the input string by ‘+’ and ‘-’ and than split the result by ‘*’ and ‘/’. This way we will first perform higher priority operations first and than move up the stack to perform operations with less priority. For example splitting 2+5*3 by ‘+’ will result in 2 and 5*3 which we will split by ‘*’ and as a result calculate 5*3 before performing addition. With this in mind this is how the corresponding test looks like:

        public decimal Evaluate(string expression)
        {
            if (expression.Contains('+'))
            {
                var parts = expression.Split('+');
                Expression result = Expression.Constant(Evaluate(parts[0]));

                result = parts.Skip(1).Aggregate(result, (current, next) => Expression.Add(current, Expression.Constant(Evaluate(next))));
                var lambda = Expression.Lambda<Func<decimal>>(result);
                var compiled = lambda.Compile();
                return compiled();
            }

            if (expression.Contains('-'))
            {
                var parts = expression.Split('-');
                Expression result = Expression.Constant(Evaluate(parts[0]));

                result = parts.Skip(1).Aggregate(result, (current, next) => Expression.Subtract(current, Expression.Constant(Evaluate(next))));
                var lambda = Expression.Lambda<Func<decimal>>(result);
                var compiled = lambda.Compile();
                return compiled();
            }

            if (expression.Contains('*'))
            {
                var parts = expression.Split('*');
                Expression result = Expression.Constant(Evaluate(parts[0]));

                result = parts.Skip(1).Aggregate(result, (current, next) => Expression.Multiply(current, Expression.Constant(Evaluate(next))));
                var lambda = Expression.Lambda<Func<decimal>>(result);
                var compiled = lambda.Compile();
                return compiled();
            }

            if (expression.Contains('/'))
            {
                var parts = expression.Split('/');
                Expression result = Expression.Constant(Evaluate(parts[0]));

                result = parts.Skip(1).Aggregate(result, (current, next) => Expression.Divide(current, Expression.Constant(Evaluate(next))));
                var lambda = Expression.Lambda<Func<decimal>>(result);
                var compiled = lambda.Compile();
                return compiled();
            }

            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }

You can notice that all if block have very similar code: The first term is evaluated and other terms are aggregated based on the current operation. We can take advantage of this and refactor the above code in one small snippet. First, we will introduce a dictionary which maps operation to corresponding expression tree building method. Using it we can replace all if statements with one foreach loop:

        private Dictionary<char, Func<Expression, Expression, Expression>> operations =
            new Dictionary<char, Func<Expression, Expression, Expression>>
                {
                    { '+', (current, next) => Expression.Add(current, next) },
                    { '-', (current, next) => Expression.Subtract(current, next) },
                    { '*', (current, next) => Expression.Multiply(current, next) },
                    { '/', (current, next) => Expression.Divide(current, next) }
                };

        public decimal Evaluate(string expression)
        {
            foreach (var operation in operations)
            {
                if (expression.Contains(operation.Key))
                {
                    var parts = expression.Split(operation.Key);
                    Expression result = Expression.Constant(Evaluate(parts[0]));

                    result = parts.Skip(1).Aggregate(result,
                                                     (current, next) =>
                                                     operation.Value(current, Expression.Constant(Evaluate(next))));

                    var lambda = Expression.Lambda<Func<decimal>>(result);
                    var compiled = lambda.Compile();
                    return compiled();
                }
            }

            decimal value = 0;
            decimal.TryParse(expression, out value);
            return value;
        }

The above function finds the first operation in the given expression and starts recursively evaluating the terms. This in turn will find other operations and evaluate them again recursively until all the operations are performed.

Using the above method you can evaluate any expression as long as it does not contain variables or parentheses. This is also confirmed by the following test results:

test suite of simple expression evaluator

Conclusion

As you can see we can build a very simple expression evaluator very easily using very little lines of code. In the following posts we will add support for expression with parentheses and expressions containing variables. Stay tuned!

Kick It on DotNetKicks.com
Share

, ,

  • Paul Van Bladel

    Splendid !  Will you cover the complete System.Math ? Really great usage of expression trees.

    • http://www.aboutmycode.com/ Giorgi Dalakishvili

      Hi Paul,

      Thank you for your comment. Currently I don’t plan to add support for all members of System.Math but the final source code should be easily extendable.

  • RS

    Hi,
    very nice post!

    The only issue I found is a peculiarity in the operator precedence between multiplication and division. In your code, division is the last operation to be handled, which means that it is always “preferred” to multiplication. The commutative law of multiplication and division however states that the order of evaluation must not play a role when evaluating an expression.

    One can hence observe the internal workings of the evaluator for instance by evaluating “3*2/3″: mathematically, the result should be exactly “2″; however, because of the division winning over multiplication (and the limited precision of the decimal data type), the observed result is “2,0000000000000000000000000001″.

    If the evaluator used left-to-right precedence for division and multiplication, the result would be correct even in this case…

    • http://www.aboutmycode.com/ Giorgi Dalakishvili

      You are correct about division being done before multiplication. That is one of the weaknesses of this approach (including no support for parenthesis and variables) and will be “fixed” in the following parts.

  • hp

    There may occur a bug with digit separator (. or ,) and test data will be not correct.

  • Maarten

    Nice post.

    One issue I have though is that you create an expression tree for each single ‘action’ during your expression evaluation. The expression ’2+3+4′ results in 2 expression trees being evaluated. It would be better if you convert the whole textual expression in an expression tree, which you than compile and execute.

    As it is at the moment, you evaluate an expression tree which is the adding/subtracting/multiplying/dividing of two constant, so using expression tree seems a bit overkill to me. Also performance wise it is much better to just do just ‘return Evaluate(part[0] + Evaluate(part[1]). You already have the building blocks for this with the dictionary operations, which takes two expressions and returns a new expression. Build a big expression, and evaluate it afterwards.

    • http://www.aboutmycode.com/ Giorgi Dalakishvili

      Great comment Maarten. You are correct about performance implications of compiling the expression many twice. Compiling a large expression once is faster and I already plan to use that approach in next posts.

  • Pingback: Building Expression Evaluator with Expression Trees in C# – Part 2 - About My Code

  • Pingback: Building Expression Evaluator with Expression Trees in C# – Table of Contents

More in c#, expression trees
Building Expression Evaluator with Expression Trees in C# – Table of Contents

This is table of contents for Building Expression Evaluator with Expression Trees in C# series. We are going to build...

AppDomain.AssemblyResolve Event Tips

AppDomain.AssemblyResolve event can be used to load referenced assemblies at runtime. This can be useful in multiple scenarios: you can...