That's what's covered by the "assuming you have formalized the statement correctly" parenthetical.
Given a formal statement of what you want, Lean can validate that the steps in a (tedious) machine-readable purported proof are valid and imply the result from accepted axioms. This is not AI, but a tiny, well reviewed kernel that only accepts correct formal logic arguments.
So, if you have a formal statement that you've verified to represent what you are interested in by some other means, Lean can tell you whether the proof created by genAI is correct. Basically, there is a nigh infallible checker that won't accept incorrect hallucinations.
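As a toy illustration (not the actual problem from the article): everything between the colon and `:=` is the statement a human must vet by other means; the term after `:=` is the proof the kernel checks.

```lean
-- The statement (before `:=`) is what a human validates;
-- the proof term (after `:=`) is what Lean's kernel mechanically checks.
theorem add_comm_toy (a b : Nat) : a + b = b + a := Nat.add_comm a b
```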
Yeah, people dramatically overestimate the difficulty of getting one's definitions correct for most problems, especially when you are doing an end-to-end proof rather than just axiomatizing some system. The definitions are still worth looking at carefully, especially for AI-generated proofs, where you don't get the immediate feedback you get as a human when something you expected to be hard goes through easily. But contrary to what seems to be popular belief here, they are generally much easier to verify than the corresponding proof. (In the case of formally verified software, the analogy is verifying that the spec is what you want vs. verifying that the program matches the spec; the former is generally much easier.)
Everyone has a different perspective, based on their math background. From the OP's perspective, the formalization of this problem statement was apparently worth talking about. On the other hand, for you it's just a homework problem that belongs in an intro class.
Let's just be generous and try to accept these differences.
My comment was exactly about the required math background though. Anyone who's completed an intro to proof class would find that kind of statement easy to formalize because they would have had to write similar statements for homework. That provides some context: everyone who's interested in computer theorem provers probably has some experience with proofs, so formalizing that statement should be easy for them. i.e. (for this kind of problem) it's not really "the hard part" for people who are seriously working on this stuff.
Are you an expert? Not gatekeeping here but I have no intuition for what is easy or hard to formalise. A lot of very simply stated graph theoretical results are apparently extremely hard to formalise.
I can't speak for ndriscoll, but I am a university math professor with extensive experience teaching these sorts of topics, and I agree with their comment in full.
You are right that some (other) statements are harder to formalize than they look. The Four Color Theorem from graph theory is an example. Generally speaking, discrete math, inequalities, for all/there exists, etc. are all easy to formalize. Anything involving geometry or topology is liable to be harder. For example, the Jordan curve theorem states that "any plane simple closed curve divides the plane into two regions, the interior and the exterior". As anyone who has slogged through an intro topology book knows, statements like this take more work to make precise (and still more to prove).
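For instance, a pure for-all/there-exists statement like "there are arbitrarily large primes" is a one-liner to state in Lean (a Mathlib-style sketch; `sorry` stands in for the omitted proof):

```lean
-- Stating the theorem is easy: just quantifiers over ℕ.
-- Proving it is a separate matter; `sorry` leaves that hole open.
theorem arbitrarily_large_primes : ∀ n : ℕ, ∃ p, n ≤ p ∧ Nat.Prime p := by
  sorry
```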
When someone takes the time to explain undergrad-level concepts in a comment, responding with "are you an expert?" is a level of skepticism that's bordering on hostile. The person you're responding to is correct, it's rare that the theorem statement itself is particularly hard to formalize. Whatever you read likely refers to the difficulty of formalizing a proof.
To be fair, the comment did not explain any concept that I can see, or why this statement is simple. It gave the statement and said it was simple to formalise. It does seem simple enough to me (basic arithmetic statement with a few variables and a bunch of quantifiers) but if somebody has no expertise/intuition, I think it is a fair question, without any hostile intent assumed.
> it's rare that the theorem statement itself is particularly hard to formalize
That's very dependent on the problem area. For example, there's a gap between the high-school explanation of the central limit theorem and its actual formalization. And when dealing with Turing machines, you'll sometimes say that something grows, e.g., Omega(n), when what actually happens is that it grows like that only along some subsequence of inputs. Generally, for complexity theory, plain-language explanations can be very vague, because of how insensitive the theory is to small changes; you need to operate at a higher level of abstraction to have a chance of explaining a proof in reasonable time.
Yes, if the theorem statement itself is "hard to formalize" even given our current tools, formal foundations etc. for this task, this suggests that the underlying math itself is still half-baked in some sense, and could be improved to better capture the concepts we're interested in. Much of analysis-heavy math is in that boat at present, compared to algebra.
Lol it’s weird seeing high school redditors saying gatekeeping and are you an expert in the same thread as university professors, all talking about the same topic. But I guess that’s HN for you.
As the sibling comment says, in my experience, the difficulty of formalizing the problem varies greatly between different areas. Probability theory is another notorious field in which modelling a problem correctly can be very subtle and difficult.
On the other hand, many problems in number theory and discrete structures tend to be rather simple to formalize. If you want to take your own look at that, I'd recommend checking out the Lean games[1]. I'd say that after the Natural Numbers Game, you most likely know enough Lean to write that problem down, with the "sufficiently small" part being the trickiest.
Voters: please reconsider your ups and downs. I think the “Are you an expert” question triggered a lot of downvotes when it was in fact asked in good faith to judge the person’s perspective of easy and hard.
And I would say there is no way to ask that question in good faith. (Tedious proof by cases left as an exercise for readers.)
The correct question would have been: does anyone else agree with the statement?
In this particular case, the amount of knowledge needed (of, e.g., the Lean language, math, and the Erdős problems) means any credible statement about the difficulty requires an expert.
It doesn't require an expert though. That was kind of my point. If you've taken an intro proof class (so the intro class for math majors/the topic), and if you've fiddled with Lean a bit (e.g. played some of the Natural Numbers Game another commenter linked), you'll know it's easy (source: I did math in my undergrad, and have fiddled with Lean a bit). Honestly I expect intro proof classes will start to be centered around something like Lean soonish if some aren't already incorporating it, and we'll see math majors more explicitly making the connection between program and proof.
Like if someone were incredulous that we could reasonably analyze running time and memory usage of something like merge sort and I said that's a standard example in an intro algorithms course, presumably people would be like "yeah it is".
They can read the statement, and the definitions that the statement references. If everything it references is in a well-tread part of the Lean library, you can have pretty high confidence in a few minutes of going over the syntax.
Isn't that kind of a general problem with proofs, even when they're written by humans? There's nothing stopping someone from accidentally writing their own Lean proof that has slightly different semantics than an English version of the same proof, or even for their English proof to subtly miss something important or make an incorrect logical leap. This seems like a bit of a double standard, although maybe there's nuance here I'm missing.
One nuance you are missing is that the discussion is about formalizing the statement (the theorem), not the proof. The latter is what the article is about, but that doesn't suffice if you can't trust that the statement is also correctly formalized.
These are the two main problems:
1. Formalizing a theorem.
2. Finding a formal proof.
Part 2 is where AI could help as proof search is full of heuristics. That's also how humans find proofs and is one of the main skills of a mathematician. The formal proof can then be machine checked with well known and mature techniques not involving AI.
Part 1 is the part that's missing and will always be hard. It's also the issue with formal verification of programs for which correctness criteria are often very complex and it's easy to mess up the formalization, so that even if you trust the proof, you can't trust that it proves the right thing.
To borrow some definitions from systems engineering, this question is one of validation rather than verification. Verification is performed by Lean and its enforcement of spec syntax and logic. But validation is the question of whether the Lean spec encodes a true representation of the problem statement (was the right thing specced?). Validation at the highest levels is probably an irreplaceable human activity.
Also, on the verification side, there's a window of failure: Lean itself could have a hidden bug. And with automated systems that seek correctness, the risk is slightly elevated that some missed crack of a bug gets exploited in the dev-check-dev loop run by the AI.
I can read and understand e.g. Python, but I have seen subtle bugs that were hard to spot in code generated by AI. At least the last time I tried coding agents (mid 2025), it was often easier to write the code myself than to play "spot the bug" with whatever was generated. I don't know anything about Lean, so I was wondering if there were similar pitfalls here.
In this case the human-written statement of the theorem is small. Can there be bugs? Absolutely! But it's essentially a few lines of code's worth of thinking.
The lean proof checker then checks to make sure the proof actually proves the statement.
In this case an AI is generating the proof, but if it "compiles" it's correct. The only thing humans need to check is the statement to be proven.
(I don't know anything about this project, but I've played around with Lean and used other proof checkers more seriously.)
This is also an issue with non-AI and non-Lean proofs. Andrew Wiles' initial Fermat's Last Theorem proof initially had an error in it. That was spotted by peer review, fixed, and an updated proof was submitted.
I don't disagree with you, but on the other hand, I feel the same way about code written by other humans. That's not because they're necessarily worse at writing code than me, but because for code I've written myself, I've already spent time thinking about it, so I don't have to start from scratch when re-reading it. It's also not like I think I write fewer bugs than my coworkers; it's just easier for me to catch my own up front as I'm coding than it is to catch theirs in code review.

The two main differences are that I can have a more meaningful conversation with my coworkers about their approach, what bugs they might think are worth looking out for, etc., compared to an LLM (which in my experience will claim completely incorrect things about the code it wrote far more often than any engineer I've worked with, even junior ones; the humans I've worked with have uniformly been able to report how confident they are that what they've produced matches what they were tasked with, without insane exaggerations), and that an LLM can produce a much higher volume of plausible-enough-looking code in a given unit of time than most humans I've worked with.

It's not obvious to me that these would be particularly severe problems in generating proofs though; unless the proof is so large that it would be infeasible to read through in a reasonable amount of time, I would expect mathematicians to be able to make up for the lower-quality conversations with the "author" by devoting more time to reading and thinking, or by having someone else also read through the proof and talk it through with them. If anything, the timeline for writing up a paper about the results might be better for some mathematicians than the equivalent amount of time most engineers have to spend reviewing code before the pressure to get it merged and move on to the next thing.
(I'm aware that there is certainly pressure to get things published in academia, but I don't have firsthand experience, so I've tried to be intentional in how I've worded this to clarify that I want to avoid any assumptions about what the norms would be, but given the relatively wide range of time pressure that engineers might experience across the industry as a whole, I'd expect that at least some mathematicians might have some flexibility to spend extra time reading through an LLM-written proof, especially if it might be time they'd otherwise have to spend trying to come up with the proof themselves).
As I understand it, Lean is not a general-purpose programming language; it's a DSL focused on formal logic verification. Bugs in a DSL are generally easier to identify and fix.
It seems one side of this argument desperately needs AI to have failed, and the other side is just saying that it probably worked but it is not as important as presented, that it is actually just a very cool working methodology going forward.
Compiling isn’t sufficient because it doesn’t tell you if the program matches the specification. A program that always says the temperature is 80 F will compile but is a terrible solution to what is the temperature outside at this location right now.
A bit of nitpicking, but according to books like AIMA, that is indeed AI. The first chapter classifies even any control system as AI.
For the reasons stated in that first chapter, I totally agree with the authors.
The whole system is AI. That part is a verifier in a "suggestion/instinct -> verifier" chain, like those used in neurosymbolic systems for automated driving, for example.
1. you write a proof in English that there is an infinite number of primes.
2. the llm writes 2+2=4 in lean.
3. lean confirms that this is correct and it's impossible that this proof is wrong.
You missed a whole section - a person creates a Lean formalization of #1 and Lean promptly says the AI proof is wrong because it doesn’t prove that formal problem statement.
The question is in the person (or AI) creating the formal problem statement - how do you know it represents the problem the proof is supposed to be for? And the answer is that for people in the field, in this case, formalizing the problem and verifying the formalization is easy. It is like generating a public key versus factoring it.
I thought that's what I was trying to express between steps 1 and 2 above, but I may have failed to get it across. My understanding is that the danger is that the LLM will create a proof that is correct but isn't about what the person thinks they're proving?
Soo, it can definitively tell you that 42 is correct Answer to the Ultimate Question of Life, The Universe, and Everything. It just can't tell you if you're asking the right question.
Calling Lean "AI" is quite a stretch. Though I'm also in the camp that dislikes the inflationary use of "AI" for LLMs, so I have sympathies for your viewpoint.
So basically anything we don’t know how to write an algorithm for? I see where you’re coming from - but at the same time it’s actually an AI meme and smells of permanently moving goalposts.
XOR is not a great choice here. Consider that 2 copies of a row give the same result as 0 copies (or 4, 6, etc.). And even without multiple copies of rows, you can force any hash you'd like by observing what happens when you insert more random rows and finding a subcollection that flips exactly the bits you want.
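To make the duplicate-row failure concrete, a quick sketch (hypothetical `xor_table_hash`; per-row hashes are SHA-256, folded together with XOR):

```python
import hashlib

def row_hash(row: bytes) -> int:
    return int.from_bytes(hashlib.sha256(row).digest(), "big")

def xor_table_hash(rows) -> int:
    # Fold the per-row hashes together with XOR (the scheme being critiqued).
    h = 0
    for r in rows:
        h ^= row_hash(r)
    return h

# Two copies of a row cancel out, so the table hash can't see them:
assert xor_table_hash([b"a", b"b"]) == xor_table_hash([b"a", b"b", b"x", b"x"])
```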
What you probably want to look at is homomorphic hashing. This is usually implemented by hashing each row to an element of an appropriate abelian group and then using the group operation to combine them.
With suitable choice of group, this hash can have cryptographic strength. Some interesting choices here are lattices (LtHash), elliptic curves (ECMH), multiplicative groups (MuHash).
It depends on whether you are doing something security critical with the result.
Maybe you have a trusted table hash but only a user-supplied version of the table. Before you use that data for security sensitive queries, you should verify it hasn't been modified.
Basically, if you ever have to contend with a malicious adversary, things are more interesting, as usual. If not, addition is likely fine (though 2^k copies of a row now leave the k lowest bits unchanged).
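A sketch of that last caveat (same toy setup as above, but folding with addition mod 2^64): inserting 2^k copies of a row changes the sum by 2^k times the row's hash, so the low k bits of the table hash never move.

```python
import hashlib

M = 1 << 64  # accumulator width

def row_hash(row: bytes) -> int:
    return int.from_bytes(hashlib.sha256(row).digest()[:8], "big")

def add_table_hash(rows) -> int:
    # Fold per-row hashes with addition mod 2^64 instead of XOR.
    return sum(row_hash(r) for r in rows) % M

base = add_table_hash([b"a", b"b"])
padded = add_table_hash([b"a", b"b"] + [b"evil"] * (1 << 8))
# 2^8 copies shift the change up by 8 bits: the low 8 bits are unchanged.
assert base % (1 << 8) == padded % (1 << 8)
```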
Sometimes these are officially sanctioned and even required! Most notably, logos and other embellishments are a form of "trap street", where the idea is to have something to point to that trivially proves someone is using your mask design if it comes to it.
It was funny to notice my mental knee-jerk reaction of "ought to be enough for anyone" (referring to 640K RAM). I've never used args with env until last week in the first place and even then it was a (mental) debate of solving it in code (wrapping print() in a function that also flushes stdout) or adding the argument instead, so I guess that's why that was my initial thought. But of course that doesn't mean there can't be legitimate uses of long argument lists. Java code comes to mind (although I do question the sanity of many things Java).
My understanding is that this was the original purpose, but it's probably unnecessary now, given that all the relevant process steps are encased in tools that are not exposed to the general clean room environment in operation (they have to maintain a higher grade of clean room internally).
However, I suspect it's probably going to continue being propagated forward due to Copy Exactly which has historically meant that Intel documents and copies the smallest details in all their fabs just in case they are necessary. In the end, if it ain't broke...
Chesterton's Fence [0] should overrule fear of cargo culting, especially when the fence (or in this case color) was clearly put up by people who really knew what they were doing at the time.
Unless there is a significant cost to it, and unless the benefit is understood, there is no reason that they should be faulted as ignorant cargo cultists simply for keeping it in place.
> For decades, Intel’s cleanrooms have been lit like darkrooms, bathed in a deep, low yellow. “That’s an anachronism,” says Mark Bohr, a small, serious man who has spent his entire 38-year career making chips, and who’s now Intel’s top manufacturing scientist. “Nobody’s had the courage to change it.”
It's possible that some level of unresolved empirical "magic" is involved in the process, which doesn't diminish the value of the end product. Why deprive yourself of hard-learned lessons even if you don't understand them all?
If I wanted to keep the blur aesthetic, I'd probably do a full removal, run an inpainting algorithm to replace the removed region with something less jarring in context and then blur the result. The inpainting algorithm can be fairly low quality and still get acceptable results since it won't be seen directly.
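A pure-Python sketch of that pipeline, on a toy grayscale image represented as lists (real code would use an image library; the crude neighbor-mean inpaint here is exactly the kind of low-quality fill that's acceptable because the result gets blurred):

```python
def crude_inpaint(img, mask):
    # Fill masked pixels with the mean of already-known neighbors,
    # sweeping inward until everything is filled. Quality can be low
    # since the region will be blurred afterwards anyway.
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    todo = {(y, x) for y in range(h) for x in range(w) if mask[y][x]}
    while todo:
        filled = set()
        for y, x in todo:
            nbrs = [out[y + dy][x + dx]
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                    if 0 <= y + dy < h and 0 <= x + dx < w
                    and (y + dy, x + dx) not in todo]
            if nbrs:
                out[y][x] = sum(nbrs) / len(nbrs)
                filled.add((y, x))
        if not filled:  # fully-masked image: nothing to propagate from
            break
        todo -= filled
    return out

def box_blur(img):
    # 3x3 mean filter applied to the whole image.
    h, w = len(img), len(img[0])
    def mean_at(y, x):
        nbrs = [img[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if 0 <= y + dy < h and 0 <= x + dx < w]
        return sum(nbrs) / len(nbrs)
    return [[mean_at(y, x) for x in range(w)] for y in range(h)]
```

Usage would be `box_blur(crude_inpaint(img, mask))`: inpaint the removed region first, then blur the whole frame so the patch blends in.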
Basically, it's what (I think) submodules should have been. It creates a vendored copy with metadata about what commit it came from. Normal operations like clone and commit (even touching a mix of main-repo and subrepo files, which subtree struggles with) are unaffected; they're normal files from git's point of view.
Pull, push, branching all work as expected on main project (most devs don't even need to know there's a subrepo). If you want to pull/push vendored changes from subrepo, there's a new command, but that's it.
As Izkata mentions, the superproject has _all_ the files for a given commit without any need for users to have access to other repos, additional submodule init commands, etc.
Basically, after a subrepo clone, you've copied the file tree for the subrepo and can make commits on it in the superproject to your heart's content (branching, etc). This is basically a fork/mirror, but adds a single metadata file to track the last upstream commit you pushed/pulled from to allow reconciling later. So with git subrepo, you make commits in the superproject first and can choose (at some later point, if at all) to merge with the subrepo upstream. This is arguably consistent with the git model writ large (make local commits, later choose if/how to integrate those with upstream). Importantly, people that clone your superproject repo don't have to know anything about subrepo or special commands to send changes back to you.
For submodules, changes flow in the other direction. If you want to make a change to the child repo, you must 1) commit in the submodule, 2) push the submodule commit to its upstream, 3) make a commit in the super project that changes the commit the submodule is pointing to. When someone pulls, switches branches, etc in the superproject, they need to do a submodule update with various failure modes or else they end up with empty/out-of-date content.
Subtrees are a bit similar to subrepos, but in practice you still need to be aware of their boundaries since you can't mix subtree/superproject modifications in the same commit. Moreover, you need to use a special merge strategy rather than git default merge/rebase which subrepo uses.
I can't answer in general, but I do see one possible misunderstanding: "vendored" means the code is copied into the parent repo.
If a submodule's original source goes offline, a new clone of the parent repo won't be able to retrieve it, since it just stores a reference to where to clone from. If subrepo really vendors the code, that'll never be a problem since a full copy is committed directly to the parent repo.
There's no need to store all previous states to detect a cycle: it can be done using just twice the original memory.
You can model the state transitions as a linked list and use Floyd's classical tortoise and hare algorithm to (eventually) determine a cycle exists. Initialize "tortoise" and "hare" as two copies of initial state and advance hare by two instructions and tortoise by one each iteration. A cycle is reported if the two states become equal again.
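A sketch of that scheme, assuming a hypothetical `step` function that advances the machine by one instruction:

```python
def floyd_meets(step, start):
    """Tortoise-and-hare cycle detection: the hare advances two steps per
    tortoise step; for any deterministic finite-state system the two must
    eventually coincide, proving a cycle exists. Only two states are kept
    in memory (twice the original footprint)."""
    tortoise = step(start)
    hare = step(step(start))
    while tortoise != hare:
        tortoise = step(tortoise)
        hare = step(step(hare))
    return tortoise  # a state lying on the cycle

# Toy "machine": a small deterministic state-update rule.
f = lambda x: (x * x + 1) % 251
on_cycle = floyd_meets(f, 3)
```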
It's true your solution will work, but you're essentially trading space complexity for time complexity. With 2^524288 states, instead of requiring more RAM than there are atoms in the universe, you'll now need more seconds of compute time than remain before the heat death of the universe.
Note that the time taken is linear with respect to the original execution.
A cycle of n instructions starting at instruction n0 will be detected in between n0 and n0+n iterations (i.e. <= 3*(n0+n) underlying instructions) since n0 iterations gets both into the cycle and the offset between them will become 0 some time in the next n iterations.
n could be very large, of course (e.g, using all of memory as a giant counter so the cycle length is huge), but the cycle detection is not really making your problem worse.
Thank you. This is the sort of thing I was alluding to when I suggested it was harder than just doing a simple replacement on the standard halting argument.
In this case, I'll back it down to suggesting there's probably some way to prove it can't be done without some unreasonable amount of at least one of time and space.
I've had success using \import{subdir/}{BasenameInSubdir} from the import package.
The basic (and unfortunate in a xkcd/1479 way) issue appears to be that the arXiv compile server doesn't allow writing to subdirectories. A subdir \include implicitly requires write access for the aux file.
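For reference, a minimal sketch of that workaround (`subdir/` and `BasenameInSubdir` are placeholder names):

```latex
\usepackage{import}  % in the preamble

% Unlike a subdir \include, this needs no aux file written into subdir/,
% and relative paths inside the imported file resolve against subdir/:
\import{subdir/}{BasenameInSubdir}  % reads subdir/BasenameInSubdir.tex
```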
I've had arXiv automatically block bare uploads of TeX-derived PDFs (presumably identified through PDF metadata). For these, it required that the source be uploaded and compiled on their server.
On balance, it's probably best to require source as arXiv does, but this can create interesting issues from time to time. Since the source is downloadable, researchers can inadvertently end up sharing partial results or snark that was commented out in the TeX source.