<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://mikedodds.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://mikedodds.github.io/" rel="alternate" type="text/html" /><updated>2025-10-08T02:20:20+00:00</updated><id>https://mikedodds.github.io/feed.xml</id><title type="html">Mike Dodds</title><subtitle></subtitle><author><name>Mike Dodds</name><email>miked@galois.com</email></author><entry><title type="html">Experimenting with ACL2 and Claude Code</title><link href="https://mikedodds.github.io/posts/2025/10/claude-code-acl2-experiments" rel="alternate" type="text/html" title="Experimenting with ACL2 and Claude Code" /><published>2025-10-07T00:00:00+00:00</published><updated>2025-10-07T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2025/10/claude-code-acl2-experiments</id><content type="html" xml:base="https://mikedodds.github.io/posts/2025/10/claude-code-acl2-experiments"><![CDATA[<p><strong>TL;DR:</strong> Using only prompting with Claude Code, I created:</p>
<ul>
  <li><a href="https://github.com/septract/acl2-swf-experiments">50+ ACL2 theorem proofs</a> translated from <a href="https://softwarefoundations.cis.upenn.edu/">Software Foundations</a></li>
  <li><a href="https://github.com/septract/acl2-mcp">An MCP server for ACL2</a> with stateful solver sessions</li>
</ul>

<p>Total time: ~3-4 hours. I wrote zero lines of code by hand.</p>

<hr />

<p>AI models are getting more powerful, but most of their training data is in widely used languages like Python. That’s a potential issue for formal methods, where most languages are poorly represented in training corpora. I’d <a href="https://www.galois.com/articles/claude-can-sometimes-prove-it">already experimented with Lean</a> and gotten interesting results, but after a discussion with some colleagues about lower-representation interactive theorem provers (ITPs), I got curious about <a href="https://www.cs.utexas.edu/users/moore/acl2/">ACL2</a>. It has less representation than Lean and quite a different idiom for writing proofs, and I’d never used it before.</p>

<p>I played around with Claude Code for a few hours and got <a href="https://github.com/septract/acl2-swf-experiments">50+ theorems to go through</a>, progressing from basic arithmetic properties through induction, list operations, and finally polymorphic higher-order functions. The examples were lifted from <a href="https://softwarefoundations.cis.upenn.edu/">Software Foundations</a> book 1, selected and translated by Claude Code through prompting alone.</p>

<p>If you want to play with ACL2 yourself, you might find the <a href="https://github.com/septract/acl2-swf-experiments/blob/main/CLAUDE.md">CLAUDE.md</a> and <a href="https://github.com/septract/acl2-swf-experiments/blob/main/notes/acl2-quick-reference.md">notes/acl2-quick-reference.md</a> files useful (also written by Claude based on reading web resources).</p>

<h2 id="building-an-mcp-server-for-acl2">Building an MCP Server for ACL2</h2>

<p>As I worked on more complex proofs, I realized Claude Code would benefit from better tooling for iterative theorem development. So I had it build <a href="https://github.com/septract/acl2-mcp">an MCP server for ACL2</a> in Python, again entirely through prompting. The server provides 15 different tools for interacting with ACL2, but its key feature is persistent session management with checkpoint/rollback support. This lets you save the prover state before attempting a risky proof step, then roll back if it doesn’t work out. I built this in three phases:</p>

<ol>
  <li>I asked Claude Code to build a beta version</li>
  <li>I used it with Claude Code in a separate session to write ACL2 proofs, and asked Claude Code for feedback on missing features</li>
  <li>I told Claude Code to address that feedback</li>
</ol>

<p>I also asked Claude Code for security audits at two points, which resulted in a lot of extra input validation. The server is still unsafe, but really only because ACL2 isn’t containerized. That’d be reasonably easy to do, but it feels like overkill for this experiment.</p>
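
<p>To make the checkpoint/rollback idea concrete, here is a minimal Python sketch. It is a toy model and not the actual acl2-mcp implementation: <code class="language-plaintext highlighter-rouge">ToySession</code> and its method names are hypothetical, and the “prover state” is just a list of submitted forms rather than a live ACL2 process.</p>

<pre><code class="language-python">from dataclasses import dataclass, field


@dataclass
class ToySession:
    """Hypothetical stand-in for a prover session; not the acl2-mcp API."""

    events: list = field(default_factory=list)       # forms accepted so far
    checkpoints: dict = field(default_factory=dict)  # name to history length

    def submit(self, form):
        # A real server would send the form to ACL2 and inspect the result.
        self.events.append(form)

    def checkpoint(self, name):
        # Record how many events existed when the checkpoint was taken.
        self.checkpoints[name] = len(self.events)

    def rollback(self, name):
        # Drop every event submitted after the named checkpoint.
        keep = self.checkpoints[name]
        del self.events[keep:]
        # Discard checkpoints that pointed past the restored state.
        self.checkpoints = {
            n: k for n, k in self.checkpoints.items() if keep >= k
        }


session = ToySession()
session.submit("(defun double (x) (* 2 x))")
session.checkpoint("before-risky-step")
session.submit("(defthm risky-lemma ...)")  # placeholder for a risky attempt
session.rollback("before-risky-step")
print(session.events)
</code></pre>

<p>Storing a checkpoint as a position in the event history keeps the sketch small. A real server would presumably also have to undo those events inside the running prover, which this toy version does not attempt.</p>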

<h2 id="what-this-means">What This Means</h2>

<p>The fact that Claude Code could handle ACL2, a language it has relatively little training data for and whose idioms are quite different from Lean’s, suggests that the representation gap might matter less than we’d expect for current frontier models. At least for introductory-level theorems, the models seem able to pick up the syntax and proof patterns through in-context learning.</p>

<p>Of course, this was mostly straightforward material from an introductory textbook. The harder question is whether these techniques scale to research-level mathematics or industrial verification. But it’s encouraging to find that the barrier to entry is now “spend a few hours prompting” rather than “spend weeks learning a new proof assistant.”</p>

<p>The most interesting moment was watching Claude Code struggle with the <code class="language-plaintext highlighter-rouge">fold-product-append</code> theorem. When I asked it to reflect on the experience afterward, here’s what it said:</p>

<blockquote>
  <p>“This was genuinely challenging—it’s not just about finding the right lemmas, but understanding how ACL2’s rewriter works under the hood. The theorem proves that <code class="language-plaintext highlighter-rouge">fold-product (append l1 l2) = (fold-product l1) * (fold-product l2)</code>, which requires showing that multiplication can be rearranged across nested structures. The key insight was that ACL2’s commutativity rules were being too aggressive, normalizing terms before the arithmetic reasoning could apply. The solution required selective theory control: globally disabling <code class="language-plaintext highlighter-rouge">commutativity-of-*</code> during the main induction, then re-enabling both commutativity and associativity at the specific subgoal (Subgoal *1/3’’) where they’re needed. This demonstrates something interesting about interactive theorem proving—sometimes the prover knows the facts it needs, but you have to carefully orchestrate when those facts are applied. The proof failed with multiple simpler strategies (direct hints, helper lemmas, manual instantiation) before this approach worked.”</p>
</blockquote>

<p>I think that’s a pretty interesting reflection on what makes proof engineering tricky.</p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="ai" /><category term="formal-methods" /><category term="interactive-theorem-proving" /><summary type="html"><![CDATA[TL;DR: Using only prompting with Claude Code, I created: 50+ ACL2 theorem proofs translated from Software Foundations An MCP server for ACL2 with stateful solver sessions]]></summary></entry><entry><title type="html">Claude Can (Sometimes) Prove It</title><link href="https://mikedodds.github.io/posts/2025/09/claude-can-sometimes-prove-it" rel="alternate" type="text/html" title="Claude Can (Sometimes) Prove It" /><published>2025-09-16T00:00:00+00:00</published><updated>2025-09-16T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2025/09/claude-can-sometimes-prove-it</id><content type="html" xml:base="https://mikedodds.github.io/posts/2025/09/claude-can-sometimes-prove-it"><![CDATA[<p>Galois blog post: <a href="https://www.galois.com/articles/claude-can-sometimes-prove-it">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" /><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry><entry><title type="html">Specifications Don’t Exist</title><link href="https://mikedodds.github.io/posts/2025/06/specifications-dont-exist" rel="alternate" type="text/html" title="Specifications Don’t Exist" /><published>2025-06-16T00:00:00+00:00</published><updated>2025-06-16T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2025/06/specifications-dont-exist</id><content type="html" xml:base="https://mikedodds.github.io/posts/2025/06/specifications-dont-exist"><![CDATA[<p>Galois blog post: <a href="https://www.galois.com/articles/specifications-dont-exist">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" 
/><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry><entry><title type="html">What Works (and Doesn’t) Selling Formal Methods</title><link href="https://mikedodds.github.io/posts/2025/05/what-works-and-doesnt-selling-formal-methods" rel="alternate" type="text/html" title="What Works (and Doesn’t) Selling Formal Methods" /><published>2025-05-08T00:00:00+00:00</published><updated>2025-05-08T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2025/05/what-works-and-doesnt-selling-formal-methods</id><content type="html" xml:base="https://mikedodds.github.io/posts/2025/05/what-works-and-doesnt-selling-formal-methods"><![CDATA[<p>Galois blog post: <a href="https://www.galois.com/articles/what-works-and-doesnt-selling-formal-methods">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" /><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry><entry><title type="html">o3, Frontier Math, and the Future of Mathematics (Galois blog)</title><link href="https://mikedodds.github.io/posts/2024/11/o3-frontier-math-and-the-future-of-mathematics" rel="alternate" type="text/html" title="o3, Frontier Math, and the Future of Mathematics (Galois blog)" /><published>2025-01-29T00:00:00+00:00</published><updated>2025-01-29T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2024/11/o3-frontier-math-and-the-future-of-mathematics</id><content type="html" xml:base="https://mikedodds.github.io/posts/2024/11/o3-frontier-math-and-the-future-of-mathematics"><![CDATA[<p>Galois blog post: <a href="https://www.galois.com/articles/o3-frontier-math-and-the-future-of-mathematics/">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" /><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry><entry><title type="html">Function Argument Nullability Using an LLM (Galois blog)</title><link 
href="https://mikedodds.github.io/posts/2024/11/function-argument-nullability-using-an-llm" rel="alternate" type="text/html" title="Function Argument Nullability Using an LLM (Galois blog)" /><published>2024-11-20T00:00:00+00:00</published><updated>2024-11-20T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2024/11/function-argument-nullability-using-an-llm</id><content type="html" xml:base="https://mikedodds.github.io/posts/2024/11/function-argument-nullability-using-an-llm"><![CDATA[<p>Galois blog post: <a href="https://galois.com/articles/function-argument-nullability-using-an-llm/">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" /><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry><entry><title type="html">Generative AI for Specifications (Galois blog)</title><link href="https://mikedodds.github.io/posts/2024/06/generative-ai-for-specifications" rel="alternate" type="text/html" title="Generative AI for Specifications (Galois blog)" /><published>2024-06-04T00:00:00+00:00</published><updated>2024-06-04T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2024/06/generative-ai-for-specifications</id><content type="html" xml:base="https://mikedodds.github.io/posts/2024/06/generative-ai-for-specifications"><![CDATA[<p>Galois blog post: <a href="https://galois.com/articles/generative-ai-for-specifications/">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" /><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry><entry><title type="html">Galois / Twisp: Avoiding Foolishness in Distributed Systems (Galois blog)</title><link href="https://mikedodds.github.io/posts/2024/02/avoiding-foolishness-distributed-systems" rel="alternate" type="text/html" title="Galois / Twisp: Avoiding Foolishness in Distributed Systems (Galois blog)" 
/><published>2024-02-22T00:00:00+00:00</published><updated>2024-02-22T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2024/02/avoiding-foolishness-distributed-systems</id><content type="html" xml:base="https://mikedodds.github.io/posts/2024/02/avoiding-foolishness-distributed-systems"><![CDATA[<p>Galois blog post: <a href="https://galois.com/articles/galois-twisp-avoiding-foolishness-in-distributed-systems/">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" /><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry><entry><title type="html">The Impact of Provable Security: AWS and Supranational (Galois blog)</title><link href="https://mikedodds.github.io/posts/2023/09/the-impact-of-provable-security" rel="alternate" type="text/html" title="The Impact of Provable Security: AWS and Supranational (Galois blog)" /><published>2023-09-19T00:00:00+00:00</published><updated>2023-09-19T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2023/09/the-impact-of-provable-security</id><content type="html" xml:base="https://mikedodds.github.io/posts/2023/09/the-impact-of-provable-security"><![CDATA[<p>Galois blog post: <a href="https://galois.com/articles/the-impact-of-provable-security-aws-and-supranational/">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" /><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry><entry><title type="html">Building a Concurrency Verifier Using Crucible (Galois blog)</title><link href="https://mikedodds.github.io/posts/2021/06/building-a-concurrency-verifier" rel="alternate" type="text/html" title="Building a Concurrency Verifier Using Crucible (Galois blog)" /><published>2021-06-18T00:00:00+00:00</published><updated>2021-06-18T00:00:00+00:00</updated><id>https://mikedodds.github.io/posts/2021/06/building-a-concurrency-verifier</id><content type="html" 
xml:base="https://mikedodds.github.io/posts/2021/06/building-a-concurrency-verifier"><![CDATA[<p>Galois blog post: <a href="https://www.galois.com/articles/building-a-concurrency-verifier-using-crucible">Link</a></p>]]></content><author><name>Mike Dodds</name><email>miked@galois.com</email></author><category term="galois" /><summary type="html"><![CDATA[Galois blog post: Link]]></summary></entry></feed>