Ludwig - On Mathematics vs Programming, and Syntax and Semantics

This comes from a DM I received on x.com from Thrilla, who sent me a couple of brilliant questions. After making him wait a couple of weeks while I dragged my feet on answering them, I ended up recording myself answering them out loud and used SuperWhisper to capture the transcript. My girlfriend joined my monologue a couple of times, which I have clearly indicated where it happens.

To what extent can programming be treated like mathematics, a fully formal system with precise semantics?

Well, I mean, first of all, I don’t know that mathematics has a fully precise semantics, right? I think you would have to go down to ZFC, you know, Zermelo-Fraenkel set theory with the axiom of choice, or some other extension of your axioms or something like that. I don’t think the semantics are completely precise. I think that it’s very abstract and rarely compiled down to its set-theoretic roots.

So that’s one thing: yes, mathematics has a fully formal symbolic system and very clear rules of logical deduction and stuff like that. But I don’t think it means that it’s fully precise in semantics.

For programming, I mean, it’s similar in that the formal symbolic way to write down programs is definitely precise syntactically, but precise semantics? It’s hard to formally verify that. And then you get into things like hardware and physical reality or implementations of these programs running on some CPU and some architecture, and having a compiler in between, and faulty transistors and chip defects and billions of lines of code in between your code and the code that will actually run.

Maybe there’s a note to make on how semantics arise from syntax. That would be interesting. And another note on programs as proofs and proofs as programs.

There is also maybe something to note on math versus physics, where pure mathematics is closer to having precise semantics but applied mathematics, especially the mathematics of physics, has much less clear semantics as physical reality is a very complex system of many complex systems.

Where do the constraints imposed by hardware, memory models, and machine execution break the analogy?

They break the analogy in the same places that the equivalent analogies break in the real world when modeling physical systems with mathematics: we do not have formal and 100% representative models of every physical system, and at best we have a series of approximations that allow us to make some predictions. This is very similar to the software-hardware contract, in that the overwhelming majority of hardware is not formally understood, but rather is built on tons of abstractions stacked on top of each other that give us a solid sense of “what will happen” and of the various failure modes, which gives us a soft set of assumptions we can use to predict the behavior of programs.
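One small, concrete place where the analogy bends is arithmetic itself. The sketch below assumes Python floats are IEEE 754 doubles, which is the case for CPython on mainstream platforms but is an assumption about the machine, not a law of mathematics. Over the real numbers, addition is associative and 0.1 + 0.2 equals 0.3; on the hardware representation, neither holds exactly.

```python
import math

# Over the reals, (a + b) + c == a + (b + c) and 0.1 + 0.2 == 0.3.
# With IEEE 754 doubles (assumed here), each operation rounds to the
# nearest representable value, so the grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

exactly_equal = (0.1 + 0.2 == 0.3)            # False: 0.1 + 0.2 rounds slightly high
associative = (left == right)                 # False: rounding depends on grouping
close_enough = math.isclose(0.1 + 0.2, 0.3)   # True: equal within a relative tolerance

print(exactly_equal, associative, close_enough)
```

The last line is the “soft set of assumptions” in miniature: the exact identity is gone, but a tolerance-based comparison is good enough to keep working.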

How should this distinction shape the way we learn programming versus learning mathematics?

For pure mathematics, yes, you have precise notation and you also have precise meaning because it’s abstract. We know everything about this abstract system because it’s “made up”. We’ve built it in a certain way that we can fully understand it, and it isn’t “dirtied” by the complexities that would be invited if it was to be physically constructed.

Most of the mathematics in the real world is trying to model systems that are substantially more complex. There’s way more going on in them, right? They’re orders of magnitude more complex, simply because there are many more systems composing the systems we observe or are interested in. The line we draw around what counts as a system is very fine. And so there the semantics are not fully precise. We don’t know exactly what they mean. Take the mathematics that we write to describe the real world and make predictions about it, like if I throw this ball and it moves with that amount of kinetic energy, then it will accelerate for that amount of time, then start decelerating and gravity will be applied in that way, all that stuff. Those equations are like encapsulations, they’re like pointers to some other real thing that we don’t really have, right? We don’t actually have it. We just have these models or approximations of these physical things that we see, but the math is not “the thing in itself” that we see either. Do you know what I mean? When you write down the gravitational constant, that’s not what gravity is. It’s just a model that we can plug into equations to predict how gravity would behave. And so that’s where mathematics also lacks “precise semantics”, in a way.

The biggest parallel between programming and mathematics, or more specifically the mathematics of physics, is that anything that’s implemented in a real physical substrate is never going to have precise semantics, because there is some sense of computational irreducibility, or too much complexity for us to formally understand every single moving part of the system. We just have these loose little predictive models that we try to put on it to be able to navigate this complex space, and we build things by stacking abstractions, because there is too much complexity for us to not use abstractions.

And in programming, it’s kind of the same. You start with electrons and silicon wafers and you have to do this whole process of lithography. And then you have chip design, microarchitecture, and then the ISA, which is this abstract specification of what the chip promises to do. Then there are tons of abstractions on top: the kernel, system calls, the runtime, the compiler. And when your code runs, it’s actually executing directly on the CPU, these machine instructions firing on the hardware, but you never think about it that way. You think you’re calling a function or adding two numbers. And the “CPU” you have in your head is really this abstract concept, when actually there’s a physical chip with transistors switching on and off billions of times a second, beaming electrons around, reading and writing to memory, which is itself just capacitors holding tiny amounts of charge. And so the point is, this is already on its own a very complex system, right.

Because programs end up running on physical systems, the semantics are not fully precise. And the mathematics of physics also runs on the real system, the physical world, and so it’s not fully precise because we just don’t have the full understanding, top-down, of everything.

It’s potentially possible in programming or computer science that you would have a full formal model of everything underneath, down to the lowest level of abstraction that we could formally define, which would be right above the electrons and electricity, I guess. And you could have a CPU that’s completely formally defined, and you could have an operating system and everything in between that is completely formally verified. And so you would increase that semantic precision quite a bit, in that you would have much stronger expectations of what can possibly happen. But even then, because it’s physically implemented, there’s some layer where, if you keep going down through the system, it’s not going to be fully understood. Like with quantum mechanics, we could never formally define it in such a way that we know exactly what the string of symbols we’re putting down is actually going to do, where we fully understand the semantics, the potential meaning of what this does. We will never fully understand the entire range of semantics that can come from the symbols.

And in pure mathematics, the only reason you can have that full understanding is that even though the objects can be infinite-dimensional Hilbert spaces, they are still not physical. And so they are pure in that way. In a way they are, like, infinitely complex, but all of their complexity comes from completely clear axioms. There is less complexity because in the pure space of mathematics, mathematical structures are not made of electrons. They’re not made of quarks. They don’t obey gravitational forces. They don’t have mass. They’re not made of cells, they’re not made of any biological matter, and so this removes a lot of the complexity. You know what I mean?

I would make a distinction once more between pure, abstract systems and systems which might be physically implemented, and note that as soon as a system is physically implemented, it is never going to be fully understood. However, you can build your abstractions in such a way, much like we did for computers, that you have a very fine understanding of it and are able to rely on it in the physical world. Computers were built like this, and the hardware-software contract might not be perfect but it is “good enough”, in that we are assured a certain amount of semantics, which is substantial enough for us to operate with them daily in many ways. Kinda like Newtonian physics is a good enough contract, so to speak.

As far as how all of this shapes the way one would learn programming vs mathematics, I think actually they overlap a lot. They are both fully formal symbolic systems that, when physically implemented, might not have fully understood semantics. They are both ways to approximate, sometimes to whatever degree of accuracy is necessary, the behaviors that would be observed if the computations laid out were executed.

“Approximate” is a word with bad connotations for people, because they think approximations are inherently imprecise, right? But you could approximate something to any desired degree of accuracy. A good example is pi. 3.14 is an approximation of the real value of pi. Now we could compute one more digit, that would be 1, and another, 5, 9, 2, etc. But the point is, pi is computationally approximated. If we needed more accuracy for the digits of pi, we could keep going. I could print a billion digits of pi, I could print a hundred billion digits of pi. As a matter of fact, we could get all the computers in the world to print digits of pi for the rest of humanity’s existence, or until the heat death of the universe, and we would keep approximating it better and better and better.
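To make “we could keep going” concrete, here is a minimal sketch that produces as many decimal digits of pi as you ask for. It uses Machin’s formula, pi = 16·arctan(1/5) − 4·arctan(1/239), with plain integer arithmetic; that choice, the function names, and the ten guard digits are my own assumptions for illustration, and there are much faster methods for very large digit counts.

```python
def arctan_inverse(x: int, unity: int) -> int:
    """arctan(1/x) scaled by `unity`, via the series 1/x - 1/(3x^3) + 1/(5x^5) - ..."""
    total = term = unity // x
    x_squared = x * x
    divisor = 1
    sign = 1
    while term:
        term //= x_squared          # next odd power of 1/x
        divisor += 2
        sign = -sign                # the series alternates
        total += sign * (term // divisor)
    return total


def pi_digits(digits: int) -> str:
    """Pi truncated to `digits` decimal places, as a string."""
    guard = 10                      # extra working digits to absorb truncation error
    unity = 10 ** (digits + guard)
    # Machin's formula: pi = 4 * (4*arctan(1/5) - arctan(1/239))
    pi_scaled = 4 * (4 * arctan_inverse(5, unity) - arctan_inverse(239, unity))
    text = str(pi_scaled // 10 ** guard)
    return text[0] + "." + text[1:]


# Each request costs more compute and buys a strictly better approximation.
for n in (2, 5, 10, 50):
    print(n, pi_digits(n))
```

Asking for more digits only changes the argument; the only limits are time and memory.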

And so the idea of approximation is not just this loose vibe of like, okay, well, I think it’s going to go like that. Some things you can even approximate to any degree of accuracy that you would wish. And this is very often done in statistics and probability and things like that, where if you have a poll and it gives you a certain degree of confidence, what you could do is increase the sample size and get to an arbitrarily tight bound, but you won’t know the answer to whatever political election until it happens.
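The polling point can be sketched with the textbook margin-of-error formula for a proportion. The assumptions here are mine and they are strong ones: a simple random sample, the normal approximation, and z = 1.96 for roughly 95% confidence. The function names are hypothetical.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(proportion * (1.0 - proportion) / sample_size)


def sample_size_for(margin: float, proportion: float = 0.5, z: float = 1.96) -> int:
    """Smallest sample size whose margin of error is at most `margin`."""
    return math.ceil(z * z * proportion * (1.0 - proportion) / (margin * margin))


# The bound tightens like 1/sqrt(n): four times the sample halves the margin.
for n in (1_000, 4_000, 16_000):
    print(n, round(margin_of_error(n), 4))

print(sample_size_for(0.03))  # respondents needed for a margin of +/- 3 points
```

The bound only describes sampling error. However tight it gets, it is still a statement about the poll, not the outcome of the election.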

For pi, you will never finish writing down the actual value: the digits never terminate. Pi is irrational, so its decimal expansion is infinite and never settles into a repeating pattern. It’s proven that there’s an infinite number of digits, but certainly we can keep computing the next digit. And so that, in a way, is very different from what I think people hear when I say approximation. People think it’s imperfect in such a way that it’s very likely that it could be flawed or something, whereas instead what I’m saying is that it can be so accurate that it doesn’t really matter whether we know the exact value or not.

My girlfriend hears me yapping to myself and asks if pi is used in equations.

Yes, many many.

So all of these equations are not fully accurate?

Yes, all of these equations are not fully accurate, you’re right. They use approximations, I mean they use up to n digits of pi. Lots of things in planes use pi in their equations all the time, for example.

And it still works?

Yeah, because in a way, the more complex the system is, the harder it is to fully represent, but also in a way, the more leeway there is for being inaccurate.

Take a sniper sitting on a roof, aiming at a target 800 meters away. We take the wind into account, we take the humidity, we take the exact elevation to get a sense of the atmospheric pressure, or things like that, and then we take the velocity of the bullet and the weight of the bullet, and also its composition for how it will affect drag, and the target is not moving. We can actually compute a trajectory for the bullet, but we are not computing the exact trajectory of the bullet. We actually have no idea. Because the wind, when we say 17 kilometers per hour, is an approximation of the actual wind. And the constant we used for gravity is an approximation of the actual gravitational forces that’ll be applied. And so on.

And yet, snipers can make shots a kilometer away, and we basically know where the bullet will fall, right? And that’s because the world is so complex that approximations to some degree of accuracy are more than enough. Yes, you could keep going and maybe we would know to the nanometer exactly where the bullet might land, but it doesn’t really matter to us, because we’re quite large and live in large 3D environments where the differences are going to be very mild, and a few variables weigh way more than others, like gravity. As we can observe, we just don’t need that level of precision.
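A deliberately crude sketch of why a rounded constant is “good enough” here. It ignores drag and wind entirely, which a real ballistic solution cannot do, and the 850 m/s muzzle velocity is a hypothetical number I picked for illustration. The only question it answers is how far the impact point moves when gravity is rounded from the standard value 9.80665 m/s² to 9.8.

```python
def bullet_drop(distance_m: float, speed_m_s: float, gravity_m_s2: float) -> float:
    """Vertical drop of a flat-fired projectile, ignoring drag and wind (an assumption)."""
    flight_time = distance_m / speed_m_s
    return 0.5 * gravity_m_s2 * flight_time ** 2


DISTANCE = 800.0  # metres, the range used in the text
SPEED = 850.0     # m/s, hypothetical muzzle velocity

drop_standard = bullet_drop(DISTANCE, SPEED, 9.80665)  # standard gravity
drop_rounded = bullet_drop(DISTANCE, SPEED, 9.8)       # the rounded constant
difference_mm = abs(drop_standard - drop_rounded) * 1000.0

print(f"drop: {drop_standard:.3f} m, shift from rounding g: {difference_mm:.1f} mm")
```

Under these assumptions the bullet drops a bit over four meters, and rounding the constant moves the impact by only a few millimeters, far less than the size of the target.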

And so this is actually a great question that you asked, because it goes back to the difference between, well, how programming feels very different from mathematics, because programs end up running on these imperfect machines and mathematics is this perfect abstract thing. But two things: again, a lot of mathematics is very much about describing the real world, so certainly once something is implemented in the world, any mathematics made to describe it is potentially imprecise or just approximate in nature. And the second thing is, maybe reality doesn’t need a fully formalized understanding of the semantics. Maybe you just need enough that you can operate. The more complex the system is, the more you can get away with using approximations. So that’s one thing.

How then would you say learning math helps you become a better programmer? Would you say the value in learning math as a programmer resides in its application rather than the way we fundamentally think about concepts in both fields?

I mean, how would you say learning math helps you become a better programmer? I think math helps you become a better thinker overall. That’s what I think. That’s really it. I mean, computer science is a subset of mathematics. It’s a bunch of mathematics. It’s a little bit like the difference between being a materials scientist, finding new chemical recipes for making concrete, and knowing how to lay concrete.

My girlfriend says: So you don’t need to know quantum physics to make a website.

Yeah, yeah, yeah, yeah. To make a good website. Yeah, but the distinction is that mathematics is the bedrock for that stuff and a lot of other stuff, and there are some guys who are working on the bedrock, so to say, and others laying foundations on top of that bedrock. Obviously you need to know what the bedrock is and some of its properties, but you don’t need to know everything about the bedrock. You don’t need to know the mineral composition of the bedrock or whatever.

Somebody working on new concrete mixtures that are going to allow for bigger buildings or things like that doesn’t really have the same job as an architect. And then the guys who are going to lay the concrete and make the bricks and build the building, they also don’t need to know that.

However, it’s not quite… Mathematics fundamentally is very close to programming, at the same time, because in programming you are writing down some set of symbols in some finite alphabet with some clear grammar. And from there you will run computations in that this code will be compiled using some rules and it will produce outputs. And so that’s fundamentally mathematical.

And there’s some, you know, some of the things that we’re doing as humans, there are some overall general senses of how to reason that way, I think, that you do get from both. However, you don’t need mathematics for that, in that even the things that you do in mathematics for this is not mathematics either. It’s overall human reasoning, I think, right? Which has many, many other avenues. It’s just that mathematics is obviously the most prevalent one.

I think, looking back at the question, it depends on your programming job. It depends on what you want to do. You know, if you’re going to become a high-frequency trader or a quantitative analyst, you’re going to need to know a lot about statistics and probability, and stochastic calculus and things like that. So you would need mathematics, because in the programming you’re doing, you are building statistical models to predict the movement of money on markets, right? So then you would need some specific parts of mathematical knowledge.

If you’re going to work as a video game developer working on video game engines, you’re going to need quaternions and physical concepts like gravity and momentum. And you’re going to need to be able to rotate objects in 3D space, for which you need these weird four-dimensional number systems that extend the complex numbers. So there you would need linear algebra or things like that.
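For a feel of what that looks like in practice, here is a minimal sketch of rotating a 3D vector with a unit quaternion, using the standard formula v' = q v q*, where q* is the conjugate of q. The function names and the (w, x, y, z) tuple layout are my own choices; a real engine would use its own math library.

```python
import math


def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def rotate(vector, axis, angle):
    """Rotate `vector` by `angle` radians around `axis` (right-hand rule)."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    scale = math.sin(angle / 2) / norm          # normalizes the axis as well
    q = (math.cos(angle / 2), ax * scale, ay * scale, az * scale)
    q_conj = (q[0], -q[1], -q[2], -q[3])
    # Embed the vector as a pure quaternion (0, x, y, z), then compute q v q*.
    _, x, y, z = q_mul(q_mul(q, (0.0, *vector)), q_conj)
    return (x, y, z)


# A quarter turn around the z axis sends the x axis onto the y axis,
# up to floating-point rounding.
print(rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))
```

The half angle inside the quaternion is the part that usually surprises people: the vector gets multiplied on both sides, so each side contributes half of the rotation.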

If you’re working in machine learning, you’re going to need linear algebra, because we’re working in vector spaces of n dimensions and we’re applying linear transformations, represented as matrices, to vectors, and things like that.

So in some domains, you need some fields of mathematics to be able to do the domain. That’s for sure.

Overall though, even if you’re a web programmer working on HTML and JavaScript, there are few other fields that will teach you the systematic way of thinking and decomposing problems the way mathematics does. It’s going to be harder to get it somewhere else. You know, maybe you can get it from puzzle video games, maybe you can get it from this and that. But in mathematics, you’re consistently having to decompose your problem. You know, you get given a problem and then, you know, maybe the problem is too hard. So you ask yourself, is there a smaller version of that problem I can fix, right, that I can solve? Can I put this problem in some other words, in my own words, right? Can I use this trick on it? What if I took an example of that problem? So the problem is generalized, and I’m going to plug some values in, right, just to see what would happen. What are the properties and invariants that pop out?
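Here is a tiny sketch of that “plug some values in and see what pops out” habit. The problem itself, the sum of the first n odd numbers, is my own example and not one from the conversation.

```python
def sum_of_first_odds(n: int) -> int:
    """1 + 3 + 5 + ... up to the n-th odd number."""
    return sum(2 * k + 1 for k in range(n))


# Step 1: try small cases of the general problem and just look at them.
observations = [(n, sum_of_first_odds(n)) for n in range(1, 8)]
print(observations)

# Step 2: a pattern pops out, every total looks like a perfect square.
# Checking it on the examples is evidence, not a proof.
conjecture_holds = all(total == n * n for n, total in observations)
print(conjecture_holds)
```

The examples only suggest the invariant. Proving it for every n is the mathematical step (induction works: adding the next odd number 2n + 1 to n² gives (n + 1)²), and that gap between “it passed my test cases” and “it is true” is exactly what carries over to programming.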

And so these kinds of fundamental ways of thinking are what mathematics just constantly makes you do. And I think that’s very useful for programming, because there’s a much shorter bridge between mathematics and programming than between, you know, materials science and being a construction worker. So that’s one thing. Fundamentally, you know, the concepts in both fields do apply in the abstract reasoning way.

There is also obviously a large field of mathematics that includes computer science and that is full of discrete mathematics and combinatorics. Algorithms are mathematics, not just computer science. You know, very often in programming, maybe not so much in frontend, but again, there are many fields where you might be programming and you might need to apply algorithms, and you certainly get a more abstract and general sense of the nature of the computation, and of the time and resources that it might cost, by formally thinking about them. It’s not all you need, because you need to know the hardware architecture, the network topology, etc., but again, programming is also kind of wide, in that you kind of go back to the construction worker. There’s so much of programming that can be done purely from being on the construction site, hanging out with the guys and being told to lay the bricks like that and using empirical evidence, which is your learned experience from laying bricks and working in the field, where you pick up the habits and you look around and you’re like, “Oh shit, they do it like that.” There’s a lot of that too. It really depends on the kind of programming that you want to do.

It’s a little bit like if there was one word to describe all the positions in medicine. There are many kinds of positions in medicine, and a brain surgeon, an orthopedic surgeon, and the guy who concocts anesthesia mixtures work in very different domains, so they might not require the same skills. “Programmer” is a little like that. There are a lot of different types of programmers. So yeah, certainly I think math can only help you become a better programmer. Maybe not always with the actual math you learn, but in the ways you learn to reason.

I’m trying to understand whether the role of notation and formal definitions in math has a direct counterpart in programming, or whether programming fundamentally requires a different kind of reasoning because its “truths” depend on physical execution rather than axiomatic rules.

Okay. His last question is, “I’m trying to understand whether the role of notation and formal definitions in math has a direct counterpart in programming, or whether programming fundamentally requires a different kind of reasoning because its truths depend on physical execution rather than axiomatic rules.”

Yeah. So I think I answered that already, in the way that I made some clear distinctions between mathematics about pure and abstract things versus applied mathematics that describes the real world. We talked about things like the fact that any system that is physically implemented is substantially more complex, and we have some kind of restriction on how much we can observe and understand of the full system. Because of these restrictions, we are forced to build models and approximations of these systems. Some of these models and approximations we can spend more compute on to make them more accurate. And so they can kind of either converge to the real thing or be made arbitrarily accurate, right? Like they’re so close to it that they’re indiscernible, and that’s good enough.

I made an argument that computers have fundamentally been built like this for the last 75 years, from the transistors all the way up to the Python interpreter. The layers are stacked on top of each other with very neat abstractions that allow us to, you know, have very, very, very strong beliefs about what we’re doing and what will happen, as the behavior of programs is somewhat predictable. So that’s one thing.

Now the role of notation and formal definitions in mathematics. I mean, there is programming language theory, which does a lot of that. There’s type theory, homotopy type theory. There are a lot of formal definitions of what can happen with types and type checkers and type systems. There is the Curry-Howard correspondence, which shows that a type-checked program is a proof of the proposition expressed by its type signature. And these are very precise, but then obviously their outcomes are physically implemented and so they get “less precise” in a way.
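A few lines of Lean make the Curry-Howard point tangible. This is a minimal sketch assuming Lean 4 syntax; the three statements are standard textbook examples I chose for illustration. In each one the statement being proved is the type, and the proof is nothing more than a small program of that type.

```lean
-- Modus ponens is function application:
-- from a proof of A → B and a proof of A, build a proof of B.
example (A B : Prop) (f : A → B) (a : A) : B :=
  f a

-- Transitivity of implication is function composition.
example (A B C : Prop) (f : A → B) (g : B → C) : A → C :=
  fun a => g (f a)

-- A proof of a conjunction is a pair of proofs, so swapping the
-- conjunction is swapping the components of the pair.
example (A B : Prop) (h : A ∧ B) : B ∧ A :=
  ⟨h.2, h.1⟩
```

If any of these programs failed to type-check, the corresponding proof would be rejected. That part is fully precise; what the checker itself runs on is where the physical caveats come back in.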

But again, there is this contract between software and hardware, this real faith and trust. The software layer says, hey, here’s my program, it has these symbols, so it must do this. And then the hardware is like, hey, we understand these things, and when you tell us this, we will try to do as close to that as we can, and we agree on what “as close” has to be. That makes them very close to each other.

This is only in the world where we distinguish between the kinds of mathematics, the ones that are about physical things and the ones that are not. In the world of pure mathematics, not only are the notation and definitions formal, but the semantics can be fully formalized, because they are not dirtied by having to be in the real world. And so when we say that the dodecahedron has a symmetry group of order 60, or 120 with reflections, and that we can do this to it and there’s that many orbits and stabilizers and we can do this and that, it is this ideal dodecahedron. It’s not a real dodecahedron. Now, the good news with this mathematical object is that there are some things that transcribe perfectly. For example, if I give you a dodecahedron, which can be made physical, it will have exactly this number of symmetries or something like that. But the thing is, now the dodecahedron is made of things. And these things are also all very complex. And these things are also made of things, and so on. And then at some point we just can’t even zoom in that much. And so now we just have these loose things and systems we don’t really understand. But our models are good enough: if I flip the dodecahedron, if I throw it around, I know things about it in the way that I can know what it will do, to some level of accuracy. That’s enough for us to operate with it. But when you’re working on the symmetries of the dodecahedron on paper, it is perfect. And there’s no embodiment or physical representation of it, and that simplifies things. So yeah, that’s one thing. I think that’s good enough.
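The 60 is a nice example of how exact the paper version is. Here is a short worked count using the orbit-stabilizer theorem, where G is the rotation group of the dodecahedron acting on its faces, vertices, or edges; the notation is mine.

```latex
|G| = |\mathrm{Orb}(x)| \cdot |\mathrm{Stab}(x)|
    = \underbrace{12 \cdot 5}_{\text{faces}}
    = \underbrace{20 \cdot 3}_{\text{vertices}}
    = \underbrace{30 \cdot 2}_{\text{edges}}
    = 60
```

Any face can be carried to any of the 12 faces, and 5 rotations (counting the identity) keep a given pentagonal face in place; the same count through the 20 vertices or the 30 edges gives the same answer. Allowing reflections doubles it to 120.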