- Building Cakelisp itself
- Building a project using Cakelisp
- Tooling support
- Why Lisp?
- Technical overview
- Similar applications/languages
This is a Lisp-like language where I can have my cake and eat it too. I wanted to do this after my LanguageTests experiment revealed just how wacky Common Lisp implementations are with regard to performance. I was inspired by Naughty Dog's use of GOAL, GOOL, and Racket/Scheme (on their modern titles).
The goal is a metaprogrammable, hot-reloadable, non-garbage-collected language ideal for high performance, iteratively-developed programs (especially games).
It is a transpiler which generates C/C++ from a Lisp dialect.
The metaprogramming capabilities of Lisp: True full-power macro support and compile-time code execution
The performance of C: No heavyweight runtime, boxing/unboxing overhead, etc.
"Real" types: Types are identical to C types, e.g. int is 32 bits with no sign bit or anything like other Lisp implementations do
No garbage collection: I can handle my own memory. I primarily work on games, which make garbage collection pauses unacceptable. I also think garbage collectors add more complexity than manual management
Hot reloading: It should be possible to make modifications to functions and structures at runtime to quickly iterate
Truly seamless C and C++ interoperability: No bindings, no wrappers: C/C++ types and functions are as easy to declare and call as they are in C/C++. In order to support this, I've decided to ignore type deduction when possible and instead rely on the C compiler/linker to relay typing errors. Cakelisp will blindly generate what look like C/C++ function calls without knowing if that function actually exists, because the C/C++ compiler will tell us what the answer is
Output is human-readable C/C++ source and header files. This is so if I decide it was unsuccessful, or only useful in some scenarios (e.g. generating serialization wrappers), I can still use the output code from hand-written C/C++ code
Many of these come naturally from using C as the backend. Eventually it would be cool to not have to generate C (e.g. generate LLVM bytecode instead), but that can be a project for another time.
Building Cakelisp itself
sudo apt install jam
Run jam in
4 is the number of cores to use while compiling).
You can also use the
It shouldn't be hard to build Cakelisp using your favorite build system. Simply build all the .cpp files in src and link them into an executable. Leave out Main.cpp and you can embed Cakelisp in a static or dynamic library!
Currently, Cakelisp has no dependencies other than:
C++ STL and runtime: These are normally included in your toolset
Child-process creation: On Linux, unistd.h. On Windows,
Dynamic loading: On Linux, libdl. On Windows,
File modification times: On Linux,
C++ compiler toolchain: Cakelisp needs a C++ compiler and linker to support compile-time code execution, which is used for macros and generators
I'm going to try to keep it very lightweight. It should make it straightforward to port Cakelisp to other platforms.
Note that your project does not have to include or link any of these unless you use hot-reloading, which requires dynamic loading. This means projects using Cakelisp are just as portable as any C/C++ project - there's no runtime to port (except hot-reloading, which is optional).
Building a project using Cakelisp
Building is expected to have two phases:
Run Cakelisp on .cake files, which creates C/C++ header and source files. Cakelisp has a Python-style module system which will automatically evaluate and generate the output of imported Cakelisp files as necessary
Build generated files using a conventional build system. Whatever you use currently should likely work already (I use Jam)
One advantage of this setup is that you could decide to abandon Cakelisp and still have useful C/C++ code left over. It also means you don't need to add special support to your build system for
C or C++?
Cakelisp itself is written in C++. Macros and generators must generate C++ code to interact with the evaluator.
However, you have more options for your project's generated code:
Only C: Generate pure C. Error if any generators which require C++ features are invoked
Only C++: Assume all code is compiled with a C++ compiler, even if a Cakelisp module does not use any C++ features
Mixed C/C++, warn on promotion: Try to generate pure C, but if a C++ feature is used, automatically change the file extension to indicate it requires a C++ compiler (.cpp) and print a warning so the build system can be updated
I may also add declarations which allow you to constrain generation to a single module, if e.g. you want your project to be only C except for when you must interact with external C++ code.
Generators keep track of when they require C++ support and will add that requirement to the generator output as necessary.
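As a rough illustration of the three output modes above, here is a minimal sketch of how a per-module language decision could be made from the requirements reported by generators. Every name in it (Language, OutputMode, GeneratorOutput, decideModuleLanguage) is hypothetical and chosen for this example only; it is not Cakelisp's actual implementation.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// All names here are hypothetical; this is not Cakelisp's actual data model.
enum class Language { C, Cpp };
enum class OutputMode { OnlyC, OnlyCpp, MixedWarnOnPromotion };

// One piece of generated code, tagged with the language it needs
struct GeneratorOutput {
    std::string source;
    Language requirement;
};

struct ModuleLanguageDecision {
    bool ok;                // false if the mode forbids what the module needs
    std::string extension;  // ".c" or ".cpp"
    bool promoted;          // true if a C module was promoted to C++
};

ModuleLanguageDecision decideModuleLanguage(const std::vector<GeneratorOutput>& outputs,
                                            OutputMode mode) {
    // A module needs C++ as soon as any single generator output needs it
    bool needsCpp = false;
    for (const GeneratorOutput& output : outputs) {
        if (output.requirement == Language::Cpp)
            needsCpp = true;
    }

    switch (mode) {
    case OutputMode::OnlyC:
        // Error if any generator required C++ features
        return {!needsCpp, ".c", false};
    case OutputMode::OnlyCpp:
        // Always compile as C++, even if no C++ features were used
        return {true, ".cpp", false};
    case OutputMode::MixedWarnOnPromotion:
        if (needsCpp) {
            std::fprintf(stderr, "warning: module promoted to C++; update the build system\n");
            return {true, ".cpp", true};
        }
        return {true, ".c", false};
    }
    return {false, "", false};
}
```

Calling decideModuleLanguage with a list containing even one Language::Cpp output yields ".cpp" in the mixed mode and a failure in the Only C mode.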
Hot-reloading won't work with features like templates or class member functions. This is partially a constraint imposed by dynamic loading, which has to be able to find the symbol. C++ name mangling makes that much more complicated, and compiler-dependent.
I'm personally fine with this limitation because I would like to move more towards an Only C environment anyway. This might be evident when reading Cakelisp's source code: I don't use class, define new templates, or define struct/class member functions, but I do rely on some C++ standard library containers and
.cake files in
(add-to-list 'auto-mode-alist '("\\.cake?\\'" . lisp-mode))
A build system will work fine with Cakelisp, because Cakelisp outputs C/C++ source/header files. Note that Cakelisp is expected to be run before your regular build system runs, or in a stage where Cakelisp can create and add files to the build. This is because Cakelisp handles its own modules such that adding support to an existing build system would be challenging.
See doc/Debugging.org. Cakelisp doesn't really have an interpreter. Cakelisp always generates C/C++ code to do meaningful work. This means the Cakelisp transpiler, macros, generators, and final code output can be debugged using a regular C/C++ debugger like GDB, LLDB, or Visual Studio Debugger.
Mapping files will make it possible to step through code in the Cakelisp language (i.e. not in the generated language). This is similar to how debuggers allow you to step through code in C files, when under the hood it's actually stepping through machine code. It will require building support into your editor in order to properly jump to the right Cakelisp file and line (among other things).
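To make the idea of a mapping concrete, here is a minimal sketch of looking up the Cakelisp location for a line in a generated file. The in-memory layout (SourceLocation, LineMapping) and the rule that an unmapped generated line belongs to the closest mapped line before it are assumptions made for this example; the real mapping file format may differ.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical types for illustration; not Cakelisp's real mapping format.
struct SourceLocation {
    std::string file;
    int line;
};

struct LineMapping {
    int generatedLine;        // Line number in the generated C/C++ file
    SourceLocation cakelisp;  // Cakelisp location which caused that line
};

// Find the Cakelisp location for a generated line. `mappings` must be sorted
// by generatedLine. Assumption: a generated line without its own entry belongs
// to the closest entry before it.
const SourceLocation* findCakelispLocation(const std::vector<LineMapping>& mappings,
                                           int generatedLine) {
    auto firstAfter = std::upper_bound(
        mappings.begin(), mappings.end(), generatedLine,
        [](int line, const LineMapping& mapping) { return line < mapping.generatedLine; });
    if (firstAfter == mappings.begin())
        return nullptr;  // No entry at or before this line
    return &(firstAfter - 1)->cakelisp;
}

// Rewrite a compiler diagnostic so it points at the Cakelisp source.
std::string remapDiagnostic(const std::vector<LineMapping>& mappings, int generatedLine,
                            const std::string& message) {
    const SourceLocation* location = findCakelispLocation(mappings, generatedLine);
    if (!location)
        return "<generated>:" + std::to_string(generatedLine) + ": " + message;
    return location->file + ":" + std::to_string(location->line) + ": " + message;
}
```

An editor plugin or error reporter could use the same lookup in both directions: generated line to Cakelisp line for diagnostics, and the reverse for setting breakpoints.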
The primary benefit of using a Lisp S-expression-style dialect is its ease of extensibility. The tokenizer is extremely simple, and parsing S-expressions is also simple. This consistent syntax makes it easy to write macros, which generate more S-expressions.
Additionally, S-expressions are good for representing data, which means writing domain-specific languages is easier, because you can have the built-in tokenizer do most of the work.
It's also a reaction to the high difficulty of parsing C and especially C++, which requires something like libclang to sanely parse.
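To show how little code an S-expression tokenizer needs, here is a minimal sketch in C++. It is not Cakelisp's tokenizer: the Token layout, the set of token types, and the use of ; for line comments are assumptions made for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token representation; Cakelisp's own may differ.
enum class TokenType { OpenParen, CloseParen, Symbol, String };

struct Token {
    TokenType type;
    std::string contents;
};

std::vector<Token> tokenize(const std::string& text) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        char c = text[i];
        if (std::isspace(static_cast<unsigned char>(c))) {
            ++i;
        } else if (c == ';') {
            // Assumed comment syntax: skip to the end of the line
            while (i < text.size() && text[i] != '\n')
                ++i;
        } else if (c == '(') {
            tokens.push_back({TokenType::OpenParen, "("});
            ++i;
        } else if (c == ')') {
            tokens.push_back({TokenType::CloseParen, ")"});
            ++i;
        } else if (c == '"') {
            // String: scan to the closing quote, stepping over escaped characters
            std::size_t end = i + 1;
            while (end < text.size() && text[end] != '"') {
                if (text[end] == '\\' && end + 1 < text.size())
                    ++end;
                ++end;
            }
            tokens.push_back({TokenType::String, text.substr(i + 1, end - i - 1)});
            i = end + 1;
        } else {
            // Symbol: everything up to whitespace or a parenthesis
            std::size_t end = i;
            while (end < text.size() && !std::isspace(static_cast<unsigned char>(text[end])) &&
                   text[end] != '(' && text[end] != ')')
                ++end;
            tokens.push_back({TokenType::Symbol, text.substr(i, end - i)});
            i = end;
        }
    }
    return tokens;
}
```

Because every construct is either a parenthesis, a string, or a symbol, the same tokenizer serves code, macro output, and data for domain-specific languages alike.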
In very broad phases, this is what Cakelisp does/is:
Tokenizer and evaluator written in C++
Export evaluated output to C/C++
Compile generated C/C++
Compile-time execution: generators and macros
Cakelisp itself is extended via "generators", which are functions which take Cakelisp tokens and output C/C++ source code. Because generators are written in C++, generators can also be written in Cakelisp! Cakelisp will compile the generators in a module into a dynamic library, then load that library before continuing parsing the module.
Macros are similar to generators, only they output Cakelisp tokens instead of C/C++ code. Macro definitions also get compiled to C/C++, using the same generators which compile regular Cakelisp functions. Macros in Cakelisp are much more powerful than C's preprocessor macros, which can only do simple text templating. For example, you could write a Cakelisp macro which generates functions conditionally based on the types of members in a struct.
The only thing the evaluator meaningfully does is call C/C++ functions based on the original or macro-generated Cakelisp tokens. There is no interpreter - compile-time code must be compiled before it can be executed.
.cake file into Token array
Iterate through token array, looking for macro/generator definitions
If there are macro/generator definitions, generate code for those definitions, compile it, load it via dynamic linking, then add it to the environment's macro/generator table. Base-level generators are written in C++ to bootstrap the language
Iterate through token array, looking for macro/generator invocations
Run macro/generator as requested by invocation
Return to step 2 in case generators created generators
Once no generators are invoked, output the generator operations
From generator operations, create C/C++ header and source files, as well as line mapping files. Mapping files will record C source location to Cakelisp source location pairs, so debuggers, C compiler errors etc. all map back to the Cakelisp that caused that line
Compile generated C/C++ files. If there are warnings or errors, use the mapping file to associate them back to the original Cakelisp lines that caused that code to be output
This is somewhat inaccurate. The pipeline is a bit more complicated:
For each file (module) imported or included in the Cakelisp command
Tokenize and evaluate the module, making note of all unknown references (any function invocation not already in the environment)
After all modules are evaluated, resolve references
Resolving references involves multiple stages:
Determine which definitions (macros, generators, and functions) need to be built
For each required definition, determine if it can be built (if all its references are loaded)
Build all required definitions which can be built, guessing whether unknown references are C/C++ function calls
For all definitions which are built successfully, resolve references to those definitions (evaluate knowing now what the reference is; macros, generators, and C/C++ function invocations all have different paths)
Return to step 1 because definitions and references to them can create new definitions which resolve other references
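The stages above amount to a loop that repeats until no more progress is made. Here is a heavily simplified sketch of that loop. The Definition struct and resolveBuildOrder are hypothetical names, "building" is reduced to recording a name, and the sketch does not model definitions creating new definitions. Any reference that is not a known definition is guessed to be a C/C++ function call.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a macro, generator, or function definition
struct Definition {
    std::string name;
    std::vector<std::string> references;  // Names invoked from its body
};

// Returns the order in which definitions could be built. Definitions which
// never become buildable (e.g. mutual dependencies) are left out.
std::vector<std::string> resolveBuildOrder(const std::vector<Definition>& definitions) {
    std::set<std::string> known;
    for (const Definition& definition : definitions)
        known.insert(definition.name);

    std::set<std::string> built;
    std::vector<std::string> order;
    bool madeProgress = true;
    while (madeProgress) {  // "Return to step 1" until nothing changes
        madeProgress = false;
        for (const Definition& definition : definitions) {
            if (built.count(definition.name))
                continue;
            bool canBuild = true;
            for (const std::string& reference : definition.references) {
                if (reference == definition.name)
                    continue;  // Simplification: ignore self-references
                // A known but unbuilt definition blocks the build. Anything
                // unknown is guessed to be a C/C++ function call.
                if (known.count(reference) && !built.count(reference)) {
                    canBuild = false;
                    break;
                }
            }
            if (canBuild) {
                built.insert(definition.name);
                order.push_back(definition.name);
                madeProgress = true;
            }
        }
    }
    return order;
}
```

In this model a wrong guess costs nothing, whereas in the real pipeline it would show up as a failed compilation of the definition that made the call.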
The "guessing" part of the resolving references stage is something I think is unique to Cakelisp. In order to avoid requiring bindings, Cakelisp must guess as to whether an invocation is a valid C/C++ function call. When the guess is incorrect, Cakelisp will not try to compile the referent definition again until something about the environment changes in a way that increases the chances of that definition compiling successfully. I call this "speculative compilation".
The drawback to speculative compilation is costly failed compilations, but they can be minimized if hints are added. Additionally, it is only necessary during clean builds - partial builds will use definitions which have already been compiled. In this way, compile-time code execution can be imagined as extensions to the Cakelisp transpiler, written inline with "shipping" code.
In Naughty Dog's Uncharted (and possibly other titles), Scheme is used to generate C structure definitions (and do various other things). See Jason Gregory's Game Engine Architecture, p. 257. See also: Dan Liebgold - Racket on the Playstation 3? It's Not What you Think!
Some Lisp-family languages with active development which transpile to C:
Chicken scheme: Transpiles to C. Has heavyweight C function bindings, garbage collection
ECL: Embeddable Common Lisp
Ferret: Lisp compiled down to C++, with optional garbage collection runtime
The following I believe have little or no activity, implying they are no longer supported:
Dale: "Lisp-flavoured C". Hasn't been touched in over two years
Bone Lisp: Lisp with no GC. Creator has abandoned it, but it still gets some attention
Thinlisp: No GC option available. Write your stuff in CL using the cushy SBCL environment, then compile down to C for good performance
Compared to C-mera
The most similar thing to Cakelisp is C-mera. I was not aware of it until after I got a good ways into the project. I will be forging ahead with my own version, which has the following features C-mera lacks (to my limited knowledge):
Automatic header file generation
Powerful mapping file for debugging, error reporting, etc. on the source code, not just the generated code
Scope-aware generators. You can make the same generator work in multiple contexts (at module vs. body vs. expression scopes)
Intended to support more than "just" code generation, e.g. code to support hot-reloading and runtime type information will be created
I will likely add some global environment that will be modifiable by any modules in the project. This is useful for things like automatic "command" function generation with project-wide scope
Features C-mera has that Cakelisp doesn't:
Access to Common Lisp macros, which is a huge swath of useful code generators
Support for generating other languages. At this point, the C/C++ output is hardcoded, and would be a bit painful to change
Multiple contributors and years of refinement
It's done, and has proven itself useful
Almost definitely has a cleaner implementation
Implementation language pros and cons
Cakelisp is written in C/C++ while C-mera is written in Common Lisp.
This is good and bad: the advantages of writing it in C/C++ are:
It is fast; no garbage collection pauses etc. to deal with. This might not actually be the case if intermediate compilation and loading of generators and macros ends up being slow
C++ is what I'm most familiar with; it would've taken me much longer in Common Lisp simply because I'm inexperienced in it
Cakelisp does not depend on a runtime (except for the C runtime), which means it would be possible to integrate the Cakelisp compiler into the project being compiled itself. This could be pretty handy for in-process self-modification thanks to the hot-reloading features
Macros and generators can be written in the same language being generated (and in Cakelisp, of course, because Cakelisp itself can load its own generated code to expand itself)
The bad things:
There's no macro-writing library to draw from (macros which help write macros)
As previously mentioned, macros and generators need to be converted to C/C++ and compiled by an external compiler to be executed, whereas Common Lisp would make this whole process much easier by natively supporting macro code generation and evaluation