Building Expression Evaluator with Expression Trees in C# – Improving the Api

This post is part of a series about building a simple mathematical expression evaluator. For previous posts and full source code see Table of Contents.

Introduction

In previous part we added support for variables but it has a big disadvantage: values of the variables are bound by position and not by name. This means that you should pass values for the variables in the same order as they appear in the expression. For example the following test will fail:

public void Can_Process_Multiple_Variables()
{
    var a = 6;
    var b = 4.32m;
    var c = 24.15m;
    Assert.That(engine.Evaluate("(c+b)*a", a, b, c), Is.EqualTo((c + b) * a));
}

On the other hand binding by name has following advantages: If you change the expression you do not have to worry about keeping order of the variables you pass in sync. So in this post we are going to change expression evaluator to bind parameters by name. We will also add a new overload for better performance when we need to evaluate same expression with different parameter values.

Binding Variables by Name

The easiest way to pass variables with their name is to pass a dictionary of variables with variable name acting as a key. While this will work it is not the most elegant and readable solution. A better solution will be to pass an anonymous object with properties pointing to the variable values. So our first task is to make the following test pass:

public void Can_Process_Multiple_Variables()
{
    var a = 6;
    var b = 4.32m;
    var c = 24.15m;
    Assert.That(engine.Evaluate("(c+b)*a", new { a, b, c }), Is.EqualTo((c + b) * a));
}

Passing Parameters with Anonymous Objects

The first thing we need to do when an object containing variables is passed is to enumerate its properties and build a dictionary of variable name and value. Once we have the dictionary we pass it to the private Enumerate method which we implemented in previous post. The code is pretty straightforward:

public decimal Evaluate(string expression, object argument = null)
{
    if (argument == null)
    {
        return Evaluate(expression, new Dictionary<string, decimal>());
    }

    var argumentType = argument.GetType();

    var properties = argumentType.GetProperties(BindingFlags.Instance | BindingFlags.Public)
                        .Where(p => p.CanRead && IsNumeric(p.PropertyType));

    var arguments = properties.ToDictionary(property => property.Name,
                property => Convert.ToDecimal(property.GetValue(argument, null)));

    return Evaluate(expression, arguments);
}

private bool IsNumeric(Type type)
{
    switch (Type.GetTypeCode(type))
    {
        case TypeCode.SByte:
        case TypeCode.Byte:
        case TypeCode.Int16:
        case TypeCode.UInt16:
        case TypeCode.Int32:
        case TypeCode.UInt32:
        case TypeCode.Int64:
        case TypeCode.UInt64:
        case TypeCode.Single:
        case TypeCode.Double:
        case TypeCode.Decimal:
            return true;
    }
    return false;
}

The Enumerate method has not changed much. The most important change is that it needs to reorder the variables in the same order as they appear in the expression.

private decimal Evaluate(string expression, Dictionary<string, decimal> arguments)
{
    if (string.IsNullOrWhiteSpace(expression))
    {
        return 0;
    }

    if (arguments == null)
    {
        arguments = new Dictionary<string, decimal>();
    }

    var arrayParameter = Expression.Parameter(typeof(decimal[]), "args");

    parameters.Clear();
    operatorStack.Clear();
    expressionStack.Clear();

    using (var reader = new StringReader(expression))
    {
        int peek;
        while ((peek = reader.Peek()) > -1)
        {
            var next = (char)peek;

            if (char.IsDigit(next))
            {
                expressionStack.Push(ReadOperand(reader));
                continue;
            }

            if (char.IsLetter(next))
            {
                expressionStack.Push(ReadParameter(reader, arrayParameter));
                continue;
            }

            if (Operation.IsDefined(next))
            {
                var currentOperation = ReadOperation(reader);

                EvaluateWhile(() => operatorStack.Count > 0 && operatorStack.Peek() != '(' &&
                    currentOperation.Precedence <= ((Operation)operatorStack.Peek()).Precedence);

                operatorStack.Push(next);
                continue;
            }

            if (next == '(')
            {
                reader.Read();
                operatorStack.Push('(');
                continue;
            }

            if (next == ')')
            {
                reader.Read();
                EvaluateWhile(() => operatorStack.Count > 0 && operatorStack.Peek() != '(');
                operatorStack.Pop();
                continue;
            }

            if (next == ' ')
            {
                reader.Read();
            }
            else
            {
                throw new ArgumentException(string.Format("Encountered invalid character {0}", next),
                    "expression");
            }
        }
    }

    EvaluateWhile(() => operatorStack.Count > 0);

    var lambda = Expression.Lambda<Func<decimal[], decimal>>(expressionStack.Pop(), arrayParameter);
    var compiled = lambda.Compile();

    var values = parameters.Select(parameter => arguments[parameter]).ToArray();

    return compiled(values);
}

This is all we need to add support for binding arguments by name to our expression evaluator. We will now add another overload for passing parameters to the Evaluate method.

Passing Parameters with Named Arguments

Passing arguments by anonymous objects works but we can do better: We can use named arguments together with dynamic keyword to allow specifying arguments in the following way:

[Test]
public void Can_Pass_Named_Variables()
{
    dynamic dynamicEngine = new ExpressionEvaluator();

    var a = 6;
    var b = 4.5m;
    var c = 2.6m;

    Assert.That(dynamicEngine.Evaluate("(c+b)*a", a: 6, b: 4.5, c: 2.6),
                Is.EqualTo((c + b) * a));
}

As it is based on features introduced in .Net 4 this overload will only be available for users using .Net 4 or upper.

To implement this feature we need to inherit ExpressionEvaluator class from DynamicObject and override TryInvokeMember method. This method will get invoked whenever a method is called which should be resolved at runtime. Parameter names as well as values are passed to this method. This allows us to build a dictionary of arguments and pass them to the Enumerate method showed in this post. This is how we do it:

public override bool TryInvokeMember(InvokeMemberBinder binder, object[] args, out object result)
{
    if ("Evaluate" != binder.Name)
    {
        return base.TryInvokeMember(binder, args, out result);
    }

    if (!(args[0] is string))
    {
        throw new ArgumentException("No expression specified for parsing");
    }

    //args will contain expression and arguments,
    //ArgumentNames will contain only argument names
    if (args.Length != binder.CallInfo.ArgumentNames.Count + 1)
    {
        throw new ArgumentException("Argument names missing.");
    }

    var arguments = new Dictionary<string, decimal>();

    for (int i = 0; i < binder.CallInfo.ArgumentNames.Count; i++)
    {
        if (IsNumeric(args[i + 1].GetType()))
        {
            arguments.Add(binder.CallInfo.ArgumentNames[i], Convert.ToDecimal(args[i + 1]));
        }
    }

    result = Evaluate((string)args[0], arguments);

    return true;
}

Now we can specify parameter names in a more elegant way without having to construct an anonymous object.

Improving Expression Evaluator Performance

Let’s say you have an expression which you want to evaluate multiple times but with different parameters. Currently the Evaluate method will parse and compile the expression tree for every evaluation even thought the expression is same every time. As the compilation of expression trees is an expensive operation it will be better if the library allowed to compile the expression only once and execute it multiple times with different values of variables.

To support this scenario we will add a new method Compile which takes an expression and returns a delegate which you can execute multiple times. Here is an example of what we want to do:

[Test]
public void Can_Invoke_Expression_Multiple_Times()
{
    var a = 6m;
    var b = 3.9m;
    var c = 4.9m;

    var compiled = engine.Compile("(a+b)/(a+c)");
    Assert.That(compiled(new { a, b, c }), Is.EqualTo((a + b) / (a + c)));

    a = 5.4m;
    b = -2.4m;
    c = 7.5m;

    Assert.That(compiled(new { a, b, c }), Is.EqualTo((a + b) / (a + c)));
}

The Compile method is the same as the Evaluate method with the only difference being that it returns the compiled delegate instead of invoking it. This means we can extract a new method and use it from both methods. I will not show the full source code here as it is does not add any new parsing logic and the code is available at github.

Conclusion

In this post we improved expression evaluator and added support for binding variables by name. We also added a new overload for specifying variables as named arguments. Finally we added a new method returning compiled delegate to avoid multiple compilations.

In the next post we will add support for unary negation. Do not miss it

Avatar
Giorgi Dalakishvili
World-Class Software Engineer

Related