4
\$\begingroup\$

I am trying to solve the 909th challenge of Project Euler. It is basically about replacing specific patterns in a string until no pattern is found. A given string like "S(S)(S(S))(S(Z))(A)(0)" is called an L_expression and the rules for replacement are:

  • $$A(u)\longrightarrow u+1$$
  • $$Z(u)(v)\longrightarrow v$$
  • $$S(u)(v)(w)\longrightarrow v(u(v)(w))$$

At first I thought regex would do the job, and it did for the two given examples, "S(Z)(A)(0)" and "S(S)(S(S))(S(Z))(A)(0)". But for the challenge itself, it turned out the iterations are far too many. So in the end, I wrote a C program and tried to make it as fast and optimized as possible:

#include <string.h>
#include <stdio.h>

static int find_patterns(const char* input, char* output, size_t size)
{
    size_t last_x = 0, last_y = 0;
    for (size_t p, i = 0; (p = i) < size; ++i)
    {
        char c = input[p++], next = input[p++];
        if ((c < 'A' || next != '(') && (c != '+' || next < '0' || next > '9'))
        {
            continue; // look for either "A(", "S(", "Z(", "+d"
        }
        if (c == 'S')
        {
            int depth = 1, args = 0, lu = 0, lv = 0, lw = 0;
            char *u = input + p, *v = NULL, *w = NULL, *y = output + last_y;
            do {
                switch (input[p])
                {
                case '(':
                    ++depth;
                    break;
                case ')':
                    if (--depth > 0) break;
                    if (depth < 0 || (args < 2 && input[p + 1] != '(')) {
                        p = size; // exit the loop
                        break;
                    }
                    switch (++args)
                    {
                    case 1:
                        lu = p - i - 2;
                        v = input + p + 2;
                        break;
                    case 2:
                        lv = p - i - 4 - lu;
                        w = input + p + 2;
                        break;
                    case 3: // found pattern S(..)(..)(..)
                        lw = p - i - 6 - lu - lv;
                        memcpy(y, input + last_x, i - last_x);
                        y += i - last_x;
                        memcpy(y, v, lv); // replace with v(u(v)(w))
                        y += lv;
                        memcpy(y, u - 1, lu + 1);
                        y += lu + 1;
                        memcpy(y, v - 1, lv + lw + 4);
                        y += lv + lw + 5;
                        y[-1] = ')';
                        last_x = p + 1;
                        last_y = y - output;
                        i = p;
                        p = size;
                    }
                    break;
                }
            } while (p++ < size);
        }
        else if (c == 'Z')
        {
            int depth = 1, args = 0, lu = 0, lv = 0;
            char *u = input + p, *v = NULL, *y = output + last_y;
            do {
                switch (input[p])
                {
                case '(':
                    ++depth;
                    break;
                case ')':
                    if (--depth > 0) break;
                    if (depth < 0 || (!args && input[p + 1] != '(')) {
                        p = size; // exit the loop
                        break;
                    }
                    switch (++args)
                    {
                    case 1:
                        lu = p - i - 2;
                        v = input + p + 2;
                        break;
                    case 2: // found pattern Z(..)(v)
                        lv = p - i - 4 - lu;
                        memcpy(y, input + last_x, i - last_x);
                        y += i - last_x;
                        memcpy(y, v, lv); // replace with v
                        y += lv;
                        last_x = p + 1;
                        last_y = y - output;
                        i = p;
                        p = size;
                    }
                    break;
                }
            } while (p++ < size);
        }
        else if (c == 'A')
        {
            int depth = 1, lu = 0;
            char *u = input + p, *y = output + last_y;
            do {
                switch (input[p])
                {
                case '(':
                    ++depth;
                    break;
                case ')':
                    if (--depth > 0) break;
                    if (depth < 0) {
                        p = size; // exit the loop
                        break;
                    }
                    lu = p - i - 2; // found pattern A(u)
                    memcpy(y, input + last_x, i - last_x);
                    y += i - last_x;
                    memcpy(y, u, lu); // replace with u+1
                    y += lu + 2;
                    y[-2] = '+';
                    y[-1] = '1';
                    last_x = p + 1;
                    last_y = y - output;
                    i = p;
                    p = size;
                    break;
                }
            } while (p++ < size);
        }
        else // c == '+': look for u+v where both are numbers
        {
            size_t u = 0, v = 0, d = 1, lu;
            for (p = i; p--; d *= 10) {
                c = input[p] - '0';
                if (c >= 0 && c < 10) u += d * c;
                else break;
            }
            for (lu = i - p - 1, p = i + 1; 1; ++p) {
                c = input[p] - '0';
                if (c >= 0 && c < 10) v = v * 10 + c;
                else break;
            }
            if (lu && v) // both u and v are non-negative numbers
            {
                char str[20], *y = output + last_y;
                for (u += v, d = 0; u; u /= 10) { // convert u+v to string
                    str[d++] = u % 10 + '0';
                }
                for (str[d--] = v = 0; v < d; ) {
                    u = str[v];
                    str[v++] = str[d];
                    str[d--] = u;
                }
                memcpy(y, input + last_x, i - last_x - lu);
                y += i - last_x - lu;
                memcpy(y, str, lu = strlen(str));
                y += lu;
                last_x = p;
                last_y = y - output;
                i = p - 1;
            }
        }
    }
    if (last_x < size) {
        memcpy(output + last_y, input + last_x, size - last_x);
    }
    output[size - last_x + last_y] = 0;
    return last_x > 0;
}

#define MAX_ITERATIONS 1E6 // (size_t)(-2)

int main()
{
    const char
        *test1 = "S(Z)(A)(0)",
        *test2 = "S(S)(S(S))(S(Z))(A)(0)",
        *test3 = "S(S)(S(S))(S(S))(S(Z))(A)(0)";
    char str[2][0x5000] = { { 0 } };
    size_t iterations, i, max_size = 0;
    strcpy(*str, test3);

    for (iterations = i = 0; ++iterations < MAX_ITERATIONS; i = 1 - i)
    {
        size_t size = strlen(str[i]);
        if (size > max_size) {
            max_size = size;
        }
        if (!find_patterns(str[i], str[1 - i], size)) {
            break;
        }
    }
    printf("result of L-actions:\n%s\nmax size: %d, #iterations: %d\n",
        str[1 - i], max_size, iterations);
    return 0;
}

The problem is, the code still takes ages to complete like 10^8 iterations and even so, we still won't reach anywhere near the solution. I have tried configuring gcc/clang flags for speed optimizations and don't know what else I should do.

\$\endgroup\$
1
  • 2
    \$\begingroup\$ For most Project Euler problems, a “straight forward” algorithm is far too slow, no matter how optimized the implementation is. The challenge of those problems is to find a better algorithm. \$\endgroup\$
    – Martin R
    Commented yesterday

2 Answers 2

9
\$\begingroup\$

Simplify error checking

        if ((c < 'A' || next != '(') && (c != '+' || next < '0' || next > '9'))
        {
            continue; // look for either "A(", "S(", "Z(", "+d"
        }

This doesn't help you. You're doing a bunch of extra checks for invalid input. If you found it, you shouldn't continue, you should exit. However, this doesn't check for all invalid input, e.g. B, C, [, or a. An easier way would be to remove that block and replace

        else // c == '+': look for u+v where both are numbers

with

        else if (c == '+')

and add a new clause at the end

        else 
        {
            // this should never happen,
            // the current character wasn't A, S, Z, or +
            exit(-1);
        }

Then you'd check if c was A, S, Z, or +. If not, you'd stop. Either the input was malformed or you processed it incorrectly. Either way, no reason to continue.

You'd do something similar for next, which must be ( for A, S, or Z and a digit for +.

This would be better for two reasons:

  1. It checks for all invalid values of c.
  2. It checks for the valid values first.

Your code does two checks every time before it starts looking for valid values. However, invalid values should not happen with proper input and processing. Move that check last rather than doing it first. You'll save one to two comparisons each iteration of that loop in the happy path.

Write before you unroll

One of the reasons why this is hard to read is that you unrolled your helper functions into the code you're showing us. Consider the following

switch (input[i]) {
    case 'A':
        verify('(', input[i + 1]);
        int end = find_matching_closing_parenthesis(input, i + 1);
        replace(i, end, i + 2, end - 1, '+1');
        break;

That would be much easier for us to read than what you have, although presumably it's what you're actually doing.

Also, switch statements may be optimized more than an if/else. They can be implemented with a jump table which does fewer calculations to determine the next line of code to run.

Write that code first. Get it working. Then, if you need slightly faster code, you can unroll those function calls. However, please realize that the compiler can also inline function calls. It's not at all obvious that it will be faster to unroll them in the code. It might be. We don't know what the compiler is doing. But this is something that you should absolutely profile before you try to optimize. This is also true of my switch suggestion. Profile it both ways.

You're doing micro-optimizations, but you're not showing us that they're helping. It's not strange for the compiler optimizations to be superior to your manual optimizations. That's why you profile, to tell the difference.

Consider tokenization

            size_t u = 0, v = 0, d = 1, lu;
            for (p = i; p--; d *= 10) {
                c = input[p] - '0';
                if (c >= 0 && c < 10) u += d * c;
                else break;
            }
            for (lu = i - p - 1, p = i + 1; 1; ++p) {
                c = input[p] - '0';
                if (c >= 0 && c < 10) v = v * 10 + c;
                else break;
            }
            if (lu && v) // both u and v are non-negative numbers
            {
                char str[20], *y = output + last_y;
                for (u += v, d = 0; u; u /= 10) { // convert u+v to string
                    str[d++] = u % 10 + '0';

Here, you convert strings to numbers. And then you convert numbers to strings. Why? You want to work with numbers (why you convert the input string to numbers). There is no reason to change the numbers back to strings until you're ready to output them. You have to input as strings, but you don't have to convert back to strings while processing.

Instead, tokenize the entire thing. Convert all the numbers to integer values mod \$10^9\$ (you are supposed to be returning the answer mod \$10^9\$). Operate on them that way. Only convert back to a string at the very end. Use printf. Again, profile that, but I would expect the built-in to have optimizations over your version. Note that \$10^9\$ fits in thirty bits, so one thirty-one bit signed (or thirty-two bit unsigned, since there's no subtraction) integer is sufficient.

It might make sense to get rid of the parentheses as part of the tokenization and read into a tree structure instead.

Micro-optimizing

Also

                c = input[p] - '0';
                if (c >= 0 && c < 10) u += d * c;
                else break;

That might be faster as

                if (input[p] < '0' || input[p] > '9') {
                    break;
                }
                u += d * (input[p] - '0');

If not faster, it's more readable.

There's also a built-in isdigit that would tell you if it's a digit. Again, possibly faster. Profile.

Don't brute force Euler

Your solution is basically about iterating as fast as you can. However, Euler is almost always about making observations about how to short cut calculations.

This is a branch of math called combinator calculus (not to be confused with combinatorics or combinatorials). You'll also see it referred to as SKI. SKI defines three functions: a substitution S (corresponds to S in the Euler problem); a combinator K (corresponds to Z in the Euler problem); an identity I, which can also be written SK.

The main characteristic of I is that it takes its argument and returns it. So I(x) is just x. The Euler problem doesn't define I. You are supposed to compose it. Then you scan the input for it. Rather than iterating multiple times, each time you find I in the input, you can replace it with its argument. That reduces your iterations.

The problem is likely to find the set of optimizations like this that you can check. Rather than iterating to the solution, they expect you to substitute to the solution.

Project Euler wants you to make mathematical insights, not computing insights. If you start veering off into optimizing computation, you're probably on the wrong track.

I'm not familiar with this branch of mathematics, so I can't help you with more specific insight. I'm simply confident that computation is not the problem here. Understanding how the math works is. I only made suggestions for your computation to help with other problems. This one probably needs a better algorithm, not better computation.

\$\endgroup\$
1
  • \$\begingroup\$ Thank you, those are truly valuable advices. Though I just figured out how to dramatically reduce the number of iterations. I'll add an answer later. \$\endgroup\$
    – polfosol
    Commented yesterday
7
\$\begingroup\$

I took a look over your code and my main problem is that I fail to understand what your code does, thus I cannot give hints on the performance problem.
I am also not familiar with this particular challenge, so my challenge related answers may not fit to the nature of the problem.

  • It is difficult to identify what the variables stand for and thus what they should do.
  • The many nesting levels make it difficult to see what starts and ends where.
  • The big and multiple nested loops (together with the prior two problems) make it difficult to identify how the string is processed in which order.
  • The multiple definitions and initializations in the same code line can be quite confusing while this brings no benefit to the functionality or performance.

I do give credit that you implemented an iterative approach instead of a recursive one, which is probably a good approach even though I don't yet see if the recursion depth would be a problem or not.

  • Add more comments about the many not obvious parts (e.g. which characters fall into the category c < 'A', does that include or exclude the numbers and brackets?; where does lw = p - i - 6 - lu - lv point to and so on)
  • As the entire code is in the same module it is quite easy for the compiler to do some obvious optimizations like function inlining. I recommend to put the three different replacement rules into separate functions to improve readability and let the compiler put it back together.
  • I recommend to give all the single-character variable names a more meaningful name unless this name is part of the general challenge description (like u, v and w)

So I am aware that my answer does not answer your direct question, but following my proposals might help others to help with your performance question.
Don't underestimate the benefits of maintainable code. Reading and understanding code is a part of maintaining.

\$\endgroup\$
1
  • 2
    \$\begingroup\$ I know you are embarrassed that you can't offer more advice, but I genuinely believe that "this code is hard to understand" is extremely valuable feedback (even though it hurts when I receive it!). If polfosol takes that criticism and posts a new question with named functions, improved identifiers and (where all else fails) explanatory comments, then a more informed review is likely to be the result. \$\endgroup\$ Commented yesterday

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.