My Mathematical Regression

(blog.dahl.dev)

103 points | by aleda145 3 days ago

18 comments

  • jp57 1 hour ago
    I think one of the saddest things is that the kind of person who would recognize, "we can solve this seemingly complicated problem by just applying this formula", would often have trouble even getting recognized in many corporate environments.

    I managed a guy like that. He was capable of very complex thinking, but he wasn't in love with complexity, he was in love with simplicity. His solutions tended to be of the form, "we can ignore all these things, and just focus on X, and it will provide all the value." He'd notice something and simplify it and the benefit to the company would be measured in multiples of his salary.

    Every manager who'd ever directly managed him knew what a treasure he was, but it was often hard for us to convince others of the value of his solutions because they were so simple, and people were convinced that hard problems must have complex solutions. (or else they would have solved them, right?)

    He eventually got bored. He retired and joined a seminary.

    • xg15 1 hour ago
      I imagine this is where the reputation of a good manager comes in and the ability to say to their boss "hey, we should keep this guy... just trust me on this."
  • mb7733 2 hours ago
    An intuitive motivation for the solution in the article (2n choose n): for an n*n grid you will take 2n steps, n "over" and n "down". All that matters is the order of the steps. So if you think of there being 2n "slots", you have to pick n to be "over", and the rest are forced to be "down". So it's 2n choose n indeed.

    You can also think of it another way, without using the formula for combinations, and only the fact that there are n! permutations of n objects. We can think of this as a permutation of 2n items, made up of two groups of n identical items each. Using (2n)! will overcount, due to the fact that each of the "over" steps is identical, and similarly for the "down" group. We have to cut down our answer by dividing out all of the repeated sequences. There will be n! redundancies for all the ways we can permute the "over" group, and the same for the "down" group. So this results in (2n)! / (n! * n!), which is exactly equal to 2n choose n. See [1], which explains permutations with repetition in general. [Note: We pretty much re-derived the formula for combinations!]

    [1] https://brilliant.org/wiki/permutations-with-repetition/
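
    The counting argument above can be sanity-checked for small n. This is only a minimal Python sketch, and the function names are made up for illustration: it enumerates every ordering of n "over" and n "down" steps by brute force and compares the count with (2n)! / (n! * n!).

```python
from itertools import permutations
from math import comb, factorial


def count_paths_brute_force(n: int) -> int:
    """Count distinct orderings of n 'R' (over) and n 'D' (down) steps."""
    steps = "R" * n + "D" * n
    # permutations() treats identical letters as distinct, so collecting the
    # results into a set drops the n! * n! duplicates of each sequence.
    return len(set(permutations(steps)))


def count_paths_formula(n: int) -> int:
    """(2n)! / (n! * n!), i.e. 2n choose n."""
    return factorial(2 * n) // (factorial(n) * factorial(n))


# Brute force is only feasible for tiny grids, but it agrees with the formula.
for n in range(1, 5):
    assert count_paths_brute_force(n) == count_paths_formula(n) == comb(2 * n, n)

print(count_paths_formula(20))  # the 20x20 grid; prints 137846528820
```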

  • kccqzy 3 minutes ago
    I was sad in a different way. I immediately realized that this could be solved by dynamic programming by computing the recurrence F(x,y)=F(x-1,y)+F(x,y-1) with the base case F(0,0)=1 and F(x,y)=0 if x<0 or y<0. The problem is that I immediately jumped to generating functions as a tool to solve this. I defined G(u,v)=\sum_x \sum_y F(x,y) u^x v^y. After maybe ten minutes of manipulation I arrived at the closed form for G(u,v)=1/(1-u-v). At this point I recognized its series expansion and its coefficients are just given by the binomial function.

    I feel sad because I had forgotten the simple and intuitive construction of choosing “go down” and “go right” directions. When a person learns more advanced mathematics, it is often the case that the person just applies such advanced mathematics by rote without realizing that a solution can be found with more elementary mathematics and more creativity. It reminded me of the time in middle school before derivatives were taught, when my teacher reminded me that using derivatives to solve a problem would receive no credit.
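
    For reference, the recurrence in this comment translates almost directly into memoized code. A minimal Python sketch, assuming F(x, y) counts paths from (0, 0) to (x, y) as the comment implies; the check against `math.comb` mirrors the series expansion of 1/(1-u-v).

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def F(x: int, y: int) -> int:
    """Number of lattice paths from (0, 0) to (x, y) using unit steps."""
    if x < 0 or y < 0:
        return 0
    if x == 0 and y == 0:
        return 1
    return F(x - 1, y) + F(x, y - 1)


# The coefficient of u^x v^y in 1 / (1 - u - v) is (x + y choose x).
assert all(F(x, y) == comb(x + y, x) for x in range(10) for y in range(10))

print(F(20, 20))  # prints 137846528820
```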

  • andredurao 4 minutes ago
    That Project Euler problem was my first encounter with memoization. At the time, it felt like a magical solution, so I ended up solving it using the central column of Pascal's triangle, which was easier for me to understand back then.

    I also tried a weird idea involving popcount, but it didn't scale. My approach was to represent each possible path with 0s (don't turn) and 1s (turn), testing the same number of 0s and 1s. However, even with popcount running in O(1) with hardware support, the total number of possible paths made the idea impractical :)
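
    A rough Python sketch of the popcount idea, assuming a slightly simpler encoding than the one described (1 = "down", 0 = "right" rather than turn / don't turn). It makes the scaling problem visible: the loop visits all 2^(2n) bit patterns, which is about 10^12 for the 20x20 grid.

```python
def count_paths_popcount(n: int) -> int:
    """Count 2n-bit masks with exactly n set bits (1 = down, 0 = right)."""
    total_masks = 1 << (2 * n)  # every possible sequence of 2n binary choices
    # bin(mask).count("1") is a portable popcount; a mask is a valid path
    # only when exactly half of its 2n steps go down.
    return sum(1 for mask in range(total_masks) if bin(mask).count("1") == n)


for n in range(1, 9):
    print(n, count_paths_popcount(n), f"masks tested: {1 << (2 * n)}")
```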

  • dhosek 41 minutes ago
    There’s a bit of hand-waving in the jump to the 2n choose n solution, which I suppose is fine, but my ex–math teacher brain really wants to have a proper proof or at least solid reasoning rather than “it follows the pattern” based on three observations.

    But I am reminded of how during my engagement 24 years ago, my future father-in-law raised an issue of being able to determine whether they were getting the full amount of sandpaper on large rolls that they were paying for. I was able to simplify the question a bit to one that treated the rolls as if they were simple concentric rolls of a specified thickness and from there could turn it into the good old Gaussian sum formula times 2π to get the length. The engineers working for the company came up with the same solution, but instead of using n(n-1)/2 they did the summation with multiple rows in Excel.
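
    The concentric-circle simplification can be sketched in a few lines of Python. The parameter names and the sample numbers are assumptions for illustration, not figures from the story: layer k is modelled as a circle of radius core_radius + k * thickness, so the total length is 2π times an arithmetic series.

```python
import math


def roll_length_closed_form(core_radius: float, thickness: float, layers: int) -> float:
    """Length of `layers` concentric wraps, using the n(n-1)/2 sum formula."""
    return 2 * math.pi * (layers * core_radius + thickness * layers * (layers - 1) / 2)


def roll_length_summed(core_radius: float, thickness: float, layers: int) -> float:
    """Same length, added up one wrap at a time (the spreadsheet approach)."""
    return sum(2 * math.pi * (core_radius + k * thickness) for k in range(layers))


# Hypothetical roll: 40 mm core radius, 1 mm paper, 100 wraps.
closed = roll_length_closed_form(40.0, 1.0, 100)
summed = roll_length_summed(40.0, 1.0, 100)
assert math.isclose(closed, summed)
print(f"{closed / 1000:.2f} m")
```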

    • shenberg 0 minutes ago
      You always go either right or down, so 40 steps in total; choose 20 to be down (or 20 to be right).
  • the_red_mist 1 hour ago
    Tbh your student reasoning is still dangerous... the pattern might not have generalized nicely. See Moser's circle problem.

    You needed to justify viewing this as "arranging down vs right movements", as another comment outlines.
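
    To make the warning concrete, here is a short Python sketch of Moser's circle problem using its known closed form, C(n, 4) + C(n, 2) + 1 regions for n points in general position. The first five terms look exactly like powers of two, and then the pattern breaks.

```python
from math import comb


def moser_regions(n: int) -> int:
    """Maximum number of regions when n points on a circle are joined by chords."""
    return comb(n, 4) + comb(n, 2) + 1


terms = [moser_regions(n) for n in range(1, 8)]
powers = [2 ** (n - 1) for n in range(1, 8)]

print(terms)   # [1, 2, 4, 8, 16, 31, 57]
print(powers)  # [1, 2, 4, 8, 16, 32, 64]
```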

  • d_silin 48 minutes ago
    Nobody forces you to use AI.

    It has become sort of junk food for the brain. Temptations and ads for it everywhere.

    • bluefirebrand 41 minutes ago
      If you have performance based metrics about your AI usage then you are essentially being forced to use AI (or become unemployed)

      Plenty of people are experiencing this nowadays

      The idea that no one is being forced to use AI is nonsense

      • pocksuppet 19 minutes ago
        That was in the first calendar quarter this year. In the second calendar quarter CEOs started saying AI is too fucking expensive and let's stop doing it.
      • d_silin 7 minutes ago
        There is no metric a reasonably intelligent person can't sabotage or subvert...

        ...for example, you can write a script to burn tokens and write the code yourself.

    • lezojeda 36 minutes ago
      [dead]
  • purple-leafy 1 hour ago
    Heh, this grid image is all too familiar to me right now.

    I’m building a grid based game and engine, and I have a game replay format which is not video.

    I hit a massive wall with compression, trying to compress unit pathing, and was trying to solve a similar problem.

    Given an NxN grid, and the 4 cardinal directions (NSEW) you can move in, plus an extra action that makes you move 2 cells instead of 1, and considering you can move 4 cells per second…

    What’s the smallest worst-case raw compression artefact you can output for 1 player for a 1 minute game?

    It’s an extremely fun problem to solve. I tried:

    - encoding changes into bits eg using 2 bits for direction

    - movement pattern batching (ie batching 2 moves into 3 bits)

    - crowd patterns and movement prediction

    - treating movement as a “projectile” and deriving intermediate states

    And all sorts of other wild crap that I will write up about on game launch
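
    As a baseline for the worst case, a simple counting bound can be sketched in Python. This rests on assumptions the comment does not state: one independent action per tick, 4 ticks per second, and either 8 actions (4 directions x 1 or 2 cells) or 9 (the same plus standing still). If every sequence is possible, no lossless format can beat this number of bits for all of them; prediction only helps when some sequences are more likely than others.

```python
def worst_case_bits(actions_per_tick: int, ticks: int) -> int:
    """Bits needed to give every possible action sequence a distinct code."""
    sequences = actions_per_tick ** ticks  # exact, thanks to big integers
    # (sequences - 1).bit_length() is ceil(log2(sequences)) for sequences > 1.
    return (sequences - 1).bit_length()


TICKS = 4 * 60  # assumed: 4 moves per second for 1 minute

for actions in (8, 9):  # without / with an assumed "stand still" action
    bits = worst_case_bits(actions, TICKS)
    print(f"{actions} actions: {bits} bits = {(bits + 7) // 8} bytes")
```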

    • tux3 1 hour ago
      What a lot of games do is run a strictly deterministic simulation in lockstep. Then you don't save the path of every unit, you save one move command for the whole group. Then the game replays inputs, and the pathing algorithm should give the same result if there are no desyncs.
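
      A toy Python sketch of that idea, with everything in it (the `Command` type, the x-first pathing rule, the tick loop) invented for illustration rather than taken from any real engine. The replay file is just the command list; positions are recomputed by running the same deterministic simulation again.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    tick: int      # simulation tick on which the order is issued
    unit: int      # id of the unit receiving the order
    target: tuple  # (x, y) cell the unit should walk to


def step_toward(pos, target):
    """Deterministic pathing stand-in: move one cell, x axis first."""
    x, y = pos
    tx, ty = target
    if x != tx:
        return (x + (1 if tx > x else -1), y)
    if y != ty:
        return (x, y + (1 if ty > y else -1))
    return pos


def simulate(start, commands, ticks):
    """Run the simulation from `start`, applying commands on their ticks."""
    positions = dict(start)
    targets = dict(start)  # units initially want to stay where they are
    by_tick = {}
    for c in commands:
        by_tick.setdefault(c.tick, []).append(c)
    for tick in range(ticks):
        for c in by_tick.get(tick, []):
            targets[c.unit] = c.target
        for unit in sorted(positions):  # fixed update order avoids desyncs
            positions[unit] = step_toward(positions[unit], targets[unit])
    return positions


start = {0: (0, 0), 1: (5, 5)}
replay_file = [Command(0, 0, (3, 2)), Command(2, 1, (5, 0))]

live = simulate(start, replay_file, ticks=10)
replayed = simulate(start, replay_file, ticks=10)
assert live == replayed  # same inputs, same result, no stored paths
print(replayed)
```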
      • purple-leafy 1 hour ago
        Yes you are definitely onto something! Love to see more people talking about deterministic games.

        My game is strictly deterministic, so I get bot movement for free - but the player has agency so I need to capture their deviations

        That’s the tricky part! Right now I do capture input (actually just deviations) and can replay whole games, but I think I’m at the limits in terms of compression - talking bytes here not KB

  • krackers 22 minutes ago
    "Compute the first few terms and plug into OEIS" is very high on the reward:effort scale
  • utopiah 39 minutes ago
    Comment as a song https://mariedavidson.bandcamp.com/album/work-it-soulwax-rem...

    There is no easy way out, you have to rest but you simply can't stop. Your body will rot, your mind too.

    PS: song isn't an ode to the grind culture or how to slave away in an office, as lyrics say "you’ve got to work for yourself - Love yourself, feed yourself".

  • floppyd 1 hour ago
    Noticing a pattern and just extending it without proving why it works is not really a solution. You can prove it by induction without really "understanding" it, but that would still be a proof, same as just counting on a computer.
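
    For completeness, one way the induction could go, written as a short LaTeX sketch. Here P(x, y) is notation introduced only for this sketch: the number of right/down paths to the corner x steps right and y steps down.

```latex
\textbf{Claim.} $P(x, y) = \binom{x + y}{x}$ for all $x, y \ge 0$.

\textbf{Base case.} Along an edge there is exactly one path, so
$P(x, 0) = 1 = \binom{x}{x}$ and $P(0, y) = 1 = \binom{y}{0}$.

\textbf{Inductive step.} For $x, y \ge 1$, the last step into $(x, y)$ comes
either from the left or from above, so
\begin{align*}
P(x, y) &= P(x - 1, y) + P(x, y - 1) \\
        &= \binom{x + y - 1}{x - 1} + \binom{x + y - 1}{x} \\
        &= \binom{x + y}{x} \qquad \text{(Pascal's rule).}
\end{align*}
In particular, $P(n, n) = \binom{2n}{n}$.
```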
  • aesthesia 52 minutes ago
    I've had the same experience looking back at solutions to old problem sets and wondering how I ever came up with them.
  • Mainan_Tagonist 1 hour ago
    The more I think about math these days, the more I see it as a muscle one must constantly train to achieve its potential.

    Give it too long a rest and you have to go back at full blast for weeks on end to hope to ever achieve past performance.

    I am very bad at math and have always been in awe of those who can do it well.

  • cocoto 1 hour ago
    Even if you don’t know or remember the basics of combinatorics, you can solve the problem with basic dynamic programming: start with the unit grid and then expand it.
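
    One possible reading of "start with the unit grid and then expand it", as a bottom-up Python sketch that keeps a single row of corner counts and adds one row of cells per pass. The function name is made up for illustration.

```python
def count_paths_bottom_up(n: int) -> int:
    """Paths through an n x n grid, built up one row of cells at a time."""
    # row[j] = number of paths to corner j on the current horizontal line.
    row = [1] * (n + 1)  # top edge: only one way, all "right" moves
    for _ in range(n):  # extend the grid downward by one row of cells
        for j in range(1, n + 1):
            # Paths arriving from above (old row[j]) plus from the left.
            row[j] += row[j - 1]
    return row[n]


print([count_paths_bottom_up(n) for n in range(1, 6)])  # [2, 6, 20, 70, 252]
```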
  • rabbitlord 45 minutes ago
    More like mathematical depression.
  • aeve890 2 hours ago
    Ha. When I found that problem I drew the grids and paths from the example, left for a coffee, and when I came back I just looked at the drawings at an angle and thought "well this is just Pascal's triangle". And the solution was obvious.
  • ogogmad 2 hours ago

      me@localhost:~> bc
      d=1; for(i=21; i < 41; i++){d *= i;}; print d; print "\n";
      335367096786357081410764800000
      n = 1; for(i = 1; i < 21; i++){n *= i;}; print n; print "\n";
      2432902008176640000
      d/n;
      137846528820
    
    I couldn't start Python for some reason, so I went 1337 and used BC, which comes preinstalled in every Unix-like OS. BC has a surprising advantage here since 40!/20! cannot be represented as a 64-bit integer since its value exceeds 2^64. That said, BC's stdlib does not provide the factorial function* - so I had to resort to using for-loops instead.

    * - What it does contain is sine, cosine, exponential, log, arctan, and Bessel J (?!?!?!?!)

    • qsort 1 hour ago
      You don't need space for 40!/20!, for example:

        let ans = 1
        for (let i=1; i<21; ++i) {
          ans *= (41 - i)  // next numerator factor: 40, 39, ..., 21
          ans /= i         // exact: ans equals i * C(40, i) at this point
        }
      
      The same idea can be trivially tweaked to compute any binomial coefficient without ever storing an integer greater than the final result.
      • ogogmad 1 hour ago
        Good point. But what if `i` does not divide `ans` evenly? I suppose you could use floats and then round.
        • qsort 1 hour ago
          It always divides it evenly, that's why it works.

          After the i-th iteration of the for loop, ans will contain n!/((n-i)!i!) which is exactly \binom{n}{i}, an integer.

          Technically "ans" can grow above the final result in my example, but even that could be fixed if one really wants (e.g. i must divide either ans or n-i, you play a bit with divmod to figure out which division you do first.)
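
          A hedged Python sketch of one way to do that fix. It uses a gcd to decide how to split the division, which is my own variant and not necessarily the divmod approach meant above. The function name is hypothetical.

```python
from math import comb, gcd


def binomial_bounded(n: int, k: int) -> int:
    """n choose k with no intermediate value larger than the result."""
    if not 0 <= k <= n:
        raise ValueError("expected 0 <= k <= n")
    k = min(k, n - k)  # symmetry keeps the running values increasing
    ans = 1
    for i in range(1, k + 1):
        m = n - i + 1  # next numerator factor
        g = gcd(m, i)
        # i/g is coprime to m/g and divides ans * (m/g), so it divides ans.
        ans //= i // g  # divide first: ans only shrinks here
        ans *= m // g   # now ans == C(n, i)
    return ans


print(binomial_bounded(40, 20))  # prints 137846528820
```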

    • aesthesia 55 minutes ago
      Just noting that Python natively handles integers larger than the machine word size since version 2.2, so this would have worked in Python as well.
      • ogogmad 30 minutes ago
        I think BC's features are equivalent to:

        - Python's native integer handling, which already has no size limit.

        - PLUS part of the Decimal module in Python's stdlib: BC's floats are DECIMAL by default, not binary.

        - PLUS an implementation of Bessel's J function, while neglecting Bessel's K.

        - Some features for base conversion using `ibase` and `obase`. So I suppose you can calculate in base 60! Unless, that is, you use POSIX BC, which does not support anything higher than base 16.

  • jasonmp85 1 hour ago
    [dead]