Over the years I've gradually gained understanding and ability with Thereom provers. There's a lot of Computer Science and Math knowledge baked into how they work which made it a slow process of understanding them, but I can do a lot of simple proofs and Coq and now Lean of late.
I started with simple Logic proofs, and implications. And the gradually got into natural number proofs and slammed into some stupid walls. You have to make sure you organize your inductions so the inductive hypothesis remains as general as possible.
I went pretty far through the first Software foundations book, and learned how to construct inductive types, use rewriting and some proof by contradiction techniques.
Eventually, I saw a discussion on the prediction market Manifold about when a Blank Slate AI would be able to prove the Infinitude of Primes.
Conceptually, my thought was, give me your finite list of primes. I'll take the biggest one on the list, take the factorial, and add 1, and now I must have a prime factor of that massive number which cannot be any of the numbers on the list.
I decided to solve the infinitude of primes myself in Coq, mostly starting from scratch and not using a library.
I had already proven stuff about addition and multiplication of natural numbers, so I brought that in, but I ended up writing a lot of proofs about inequalities and subtraction.
Over 6 months, I claw my way forward slowly, building up my base of lemmas. But sometimes I had to go back and add another lemma I needed for a proof.
My first wall was the divmod function, which I recreated the same definition that the Coq standard library used. It basically does both division and multiplication by keeping track of a sort of dual of the remainder and moving one step at a time.
A subtraction based approach has problems with proving termination, which is a topic I slammed into later, so I'll wait to talk about that.
I learned through this whole 6 months that having ways to decompose operations is so important, and it very hard to decompose the divmod, for a long time.
I gradually managed to work my way through, and I made a big advance when I went back to the Standard library for inspiration and found divmod_spec:
le u y -> let (q',u') := divmod x y q u in
plus x (plus (times (S y) q) (sub y u)) = plus (times (S y) q') (sub y u') /\ le u' y.
Setting up an ax + b situation which the results of divmod was great, and also learning that I could "remember" the two values of a divmod call and work with them enabled me to complete the divmod lemmas I thought I would need.
Next was simple the process of defining the product of a list of items, defining the factorial, and then showing that n! + 1 has to have a prime factor greater than n.
Unfortunately, the intuitive argument that n! + 1 has a prime factor greater than n relies on a descending repeated division process. A nightmare for termination.
Instead I tried to define a decompose fuction. Writing it was difficult in itself, and proving it's termination and how to set up the syntax was hard, but running proofs with decompose was a nightmare.
The Program Fixpoint operation I used to define Decompose spewed all kinds of extra proof terms all over my proofs, and they were all completely unfamiliar concepts to me.
I tried to fight this battle, but for a month or two I was stuck.
I was trying to figure out another way to prove the infinitude of primes idea when I came up with this lemma.
If you have a property of the natural numbers, it either is always false, or else there exists some n which is the minimal or first time the property is true.
We can use this with the property of : does u divide a natural number n, to get the smallest divisor of a number. If a divisor of a number can be factored, then it is not the smallest divisor, either of it's factors would be a smaller divisor.
So we can get a prime divisor for any number n by getting the smallest divisor (other than 1). And we can prove that this divisor of n! + 1 cannot be <= n.
There were still a few battles and chains of additional lemmas, but the final proof in Coq was done soon after that.
Months after finishing the proof in Coq, I kept seeing impressive new efforts in formalizing math in Lean. I tried Lean, but I had troubles setting it up in VS code. After trying several times, I managed to get it working.
I decided to translate my proof of the infinitude of primes into Lean, in order to learn that prover. I am interested in exploring Lean's real numbers and analysis capabilities eventually.
I had a lot of stumbles in the process of understanding Lean, and had another issue with the Lean environment crashing before I got it straightened out.
Several lemmas in the Coq version were left out as the standard library already had the same lemmas for me. I used the existing operations for +, * and -, but I'm using the Coq version of divmod, to avoid a ton of rewriting, but I may reconsider for now.
I've completed all the proofs for add, mul, inequalities, and subtraction and now I'm starting on divmod.
Differences from Lean and Coq:
- In many cases in Lean, you don't need to finish the obvious last line of the proof. if you rewrite a term, and you get an obvious equality, or a case where assumption would apply, Lean does it automatically.
- Seems to work for rw and simp mostly.
- In Coq, reflexivitity is stronger than simpl for equalities proofs, in Lean, simp is stronger.
- simp completes almost all proofs that it can in Lean, in Coq you usually have to simp and then apply a tactic to finish the proof.
- No Qed / Defined in Lean, and no end in matches.
- Lean handles operators for + , *, - with no fiddling about with scopes.
- Lean handles numbers without trying to decide on Peano or Binary number representations.
- Lean has a proper Real number type that doesn't require deep study to use.
- Coq has fold in addition to unfold. I just learned a way to re-"fold" in Lean but it wasn't obvious.
- Unfolding / refolding expressions in general seems harder in Lean, although I've learned a few more tools to deal with it.
- if you have a more general statement as an inductive hypothesis, you have to specialize it with a have statement before applying it, in Lean. In Coq it will usually just auto-specialize it.
- Coq uses in, Lean uses at.
- Apply at was only available with the Mathlib, I use that a lot. You can do have H2 : result := by apply blah but that's more tedious. Now that I use mathlib exclusively it's not a problem.
- Lean usually replaces succ n with n + 1, which drove me crazy until I understood that it still could be used in proofs that refer to the successor.
- Lean normal syntax for a proof is def lemma (params : type), and these params are automatically introduced, and need to be reverted if you want your inductive hypothesis to be most useful.
- reverting teams seems pretty easy and it's nice to not have to intro when you don't need to revert.
- Having all the proofs in the Nat scope available with context search is so helpful. This context based search is very helpful in general.
- Massive Math library in Lean is nice to have, and I'll eventually explore more of it once I finish formalizing the infinitude of primes again.
- It appears like termination criteria is easier to work with in Lean. in another project I was toying with I was able to establish well-founded criteria pretty simply.
- I'm not sure if there's a way to rewrite all references of a term, like in Coq's rewrite ?eq
- I hated Lean's use of symbols not on the keyboard at first, but I've found the backslash codes for them to be not bad.
- induction and cases explicitly forcing you to organize proofs better is nice. Instead of just chaotically solving sub-goals without noting it in the proof script.
No comments:
Post a Comment