Reduced Binary Seed Bootstrap
5 points by rain1 to bootstrapping programming freesoftware on Jan 29, 2019 | 5 comments

Although this is good work, I feel there might also need to be a very C-like route for this. Most C programmers neither know nor will learn Lisp-like languages. They'll just be trusting Schemers (hey, that sounds nice!) the way they were trusting C or C++ coders. Projects like SmallerC make a simple compiler for a subset of C that they might be able to follow. I still think there's room for simple, C-like languages or subsets that can compile an early version of GCC. One idea for maximizing simplicity was an interpreter for a C-like language with just expressions, while loops, function calls, pointers, I/O, and modules. It would be written in a mix of literate pseudocode, source code in a language with strong verification tooling, and assembly optimized for readability.
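To give a feel for how small such an interpreter could be, here is a rough Python sketch covering only expressions, assignment, `while`, and output. Everything in it is an assumption made for illustration: the tuple-based program format, the names `eval_expr` and `run`, and the choice of Python itself. Function calls, pointers, and modules are left out, and the real thing would be written in the pseudocode/verified-language/assembly mix described above.

```python
def eval_expr(expr, env):
    """Evaluate an expression: an int literal, a variable name, or (op, left, right)."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]  # variable lookup
    op, left, right = expr
    a = eval_expr(left, env)
    b = eval_expr(right, env)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    if op == "<":
        return int(a < b)  # C-style truth value: 1 or 0
    raise ValueError("unknown operator: " + op)


def run(stmts, env, out):
    """Execute a list of statements, appending printed values to `out`."""
    for stmt in stmts:
        kind = stmt[0]
        if kind == "set":  # ("set", name, expr)
            env[stmt[1]] = eval_expr(stmt[2], env)
        elif kind == "while":  # ("while", cond, body)
            while eval_expr(stmt[1], env):
                run(stmt[2], env, out)
        elif kind == "print":  # ("print", expr)
            out.append(eval_expr(stmt[1], env))
        else:
            raise ValueError("unknown statement: " + kind)
    return env


# Hypothetical example program: factorial of 5 with a while loop.
FACTORIAL = [
    ("set", "n", 5),
    ("set", "acc", 1),
    ("while", ("<", 0, "n"), [
        ("set", "acc", ("*", "acc", "n")),
        ("set", "n", ("-", "n", 1)),
    ]),
    ("print", "acc"),
]
```

Calling `run(FACTORIAL, {}, out)` with `out = []` leaves `out` as `[120]`. The point is only that the whole evaluator fits on one screen, which is what makes it reviewable.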

From there, add a source-to-source transpiler built with any technology you want, since it's untrusted. It converts legacy C into the simpler language (the interpreted one or one like SmallerC), with a human eyeballing the output to make sure the commands match. It preserves as much structure and as many comments as it can. If it needs to change structure, it makes the changes one at a time per file, in the style of diffs, so they can be reviewed in isolation. As always, equivalence testing is done with automated test generators that check the outputs are the same between the original and final versions. Note the interpreter can be turned into a simple compiler by just outputting, during program execution, the assembly that's executed.
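The equivalence-testing step might look roughly like the sketch below. It is not tied to any real tool: `original_sum_to` stands in for a legacy routine, `translated_sum_to` for the transpiler's output, and `check_equivalence` is a hypothetical harness that feeds both the same randomly generated inputs.

```python
import random


def original_sum_to(n):
    """Stand-in for the legacy code: sum of 0..n."""
    total = 0
    for i in range(n + 1):
        total += i
    return total


def translated_sum_to(n):
    """Stand-in for the transpiled code, restricted to assignment and while."""
    total = 0
    i = 0
    while i <= n:
        total = total + i
        i = i + 1
    return total


def buggy_sum_to(n):
    """A deliberately wrong translation (off-by-one) to show a failure."""
    total = 0
    i = 0
    while i < n:
        total = total + i
        i = i + 1
    return total


def check_equivalence(f, g, gen_input, trials=1000, seed=0):
    """Return the first generated input where f and g disagree, or None."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    for _ in range(trials):
        x = gen_input(rng)
        if f(x) != g(x):
            return x
    return None


def small_ints(rng):
    """Hypothetical input generator: integers in 0..200."""
    return rng.randint(0, 200)
```

`check_equivalence(original_sum_to, translated_sum_to, small_ints)` returns `None`, while swapping in `buggy_sum_to` returns a counterexample input. Passing tests only give initial confidence; the diff review is still what the human relies on.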

So, they have to learn just a simple interpreter, run the equivalence tests for initial confidence, and then eyeball the before and after diffs. They can also pick their battle: SmallerC, TCC, GCC... works on anything that's C. Thoughts?


3 points by rain1 on Jan 29, 2019

Yes, specifically what I have been thinking about a lot recently is:

* Implement forth in assembly.

* Implement a LALR (or similar) parser generator in forth.

* Implement, in forth, a compiler for a c-light language that targets forth.

The compiler can be one of those very basic ones that fuses everything from parsing to codegen into a single pass. The resulting code will run on top of the forth platform. Then we can compile a c-light compiler written in c-light, standard library and everything.
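As a rough illustration of the single-pass idea, here is a Python sketch that emits forth-style postfix words directly from a recursive-descent parser, with no AST in between. It only handles integer `+`, `-`, `*`, and parentheses, and the names `compile_expr` and `run_forth` are made up for the example; the plan above would have this written in forth itself.

```python
import re


def compile_expr(src):
    """Translate an infix expression to postfix words in one pass."""
    tokens = re.findall(r"\d+|[-+*()]", src)
    pos = 0
    out = []  # emitted words, in execution order

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok.isdigit():
            out.append(tok)  # a literal pushes itself
        else:
            raise SyntaxError("unexpected token: " + tok)

    def term():
        factor()
        while peek() == "*":
            advance()
            factor()
            out.append("*")  # emit the operator after both operands

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            out.append(op)

    expr()
    if peek() is not None:
        raise SyntaxError("trailing input: " + peek())
    return " ".join(out)


def run_forth(code):
    """Tiny stack evaluator for the emitted words, used only for checking."""
    stack = []
    for word in code.split():
        if word.isdigit():
            stack.append(int(word))
            continue
        b = stack.pop()
        a = stack.pop()
        if word == "+":
            stack.append(a + b)
        elif word == "-":
            stack.append(a - b)
        elif word == "*":
            stack.append(a * b)
        else:
            raise ValueError("unknown word: " + word)
    return stack.pop()
```

For example, `compile_expr("1 + 2 * (3 - 4)")` gives `"1 2 3 4 - * +"`, and `run_forth` on that string gives `-1`. Because each grammar rule emits its output as soon as it finishes parsing, there is nothing to hold in memory beyond the parser's own call stack.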

The problem is I don't know forth! I can't decide whether I should study it and eventually work on this or just hope somebody else tries it out.


That sounds good. We even have Forth on Miraheze for that reason. There was also a certifying compiler from Java to Forth for Open Firmware. I ended up avoiding a Forth-based solution for the same reason as Scheme. I think you already see it:

"The problem is I don't know forth! I can't decide whether I should study it and eventually work on this or just hope somebody else tries it out."

As with Scheme, I think most C programmers will say the same thing about Forth. Its primitives, being stack operations, are closer to their understanding. I'd rather just stick with what they know. The approach closest to it was to do term rewriting, metaprogramming, or something similar on ASTs for a C-like language. You'd be doing what you do in Scheme or Forth, but disguised as not Scheme or Forth. It has to feel like what they know, with minimal extra stuff.


Ah, this explains much of the discussion at #bootstrappable.


See also this page [1] where a lot of early stuff was discussed plus piles of links to help bootstrappers.

