Bookmarks

Inference in Agda

Agda is a wonderful language, and its unification engines are exemplary: practical, improving over time, and predictably well-behaved.

An MLIR Dialect for Distributed Heterogeneous Computing

Welcome to the home page of the 46th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2025)! PLDI is the premier forum in the field of programming languages and programming systems research, covering the areas of design, implementation, theory, applications, and performance. PLDI 2025 will be held in person at the Westin Josun Seoul in Seoul, South Korea. The main conference runs Wednesday, 18 June through Friday, 20 June, with workshops and tutorials on Monday, 16 June and Tuesday, 17 June. Nuno Lopes has kindly written a PLDI 2025 travel guide.

User Guide for NVPTX Back-end

To support GPU programming, the NVPTX back-end supports a subset of LLVM IR along with a defined set of conventions used to represent GPU programming concepts.

Notes/Primer on Clang Compiler Frontend (1) : Introduction and Architecture

These are my notes on chapters 1 & 2 of Clang Compiler Frontend by Ivan Murashko. The book teaches the fundamentals of LLVM to C++ engineers interested in compilers, with an eye toward improving code quality and the day-to-day development process. (I've referenced this book extensively, and many of the snippets here are from it.)

Template Haskell

Intuitively, Template Haskell provides language features that let us convert back and forth between concrete syntax, i.e. the Haskell code a programmer writes, and abstract syntax trees that programs can inspect and construct.

A friendly introduction to machine learning compilers and optimizers

[Twitter thread, Hacker News discussion]

tt-mlir documentation

The following document provides an overview of the TT-MLIR project, with a focus on the technical specifications of an MLIR-based compiler stack. So what exactly is an MLIR-based compiler stack?

Tutorials

Multi-Level IR Compiler Framework

Yizhou Shan's Home Page

This paper has a really nice Intro, pay close attention to how they lay out the storyline.

Tilde, my LLVM alternative

I'm Yasser and I've made it my mission to produce an alternative to LLVM, the current king of compiler backend libraries.

A WebAssembly compiler that fits in a tweet

Starting with a 192-byte one-liner that implements a Reverse Polish Notation arithmetic compiler, we'll work backward to transform it into readable JavaScript by removing one code golf trick at a time.
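The stack discipline behind such a compiler can be sketched in a few lines of Python (a hedged illustration, not the article's JavaScript): RPN maps directly onto a stack machine, which is also the execution model WebAssembly exposes.

```python
# Minimal RPN evaluator: operands push, operators pop two and push one.
# This mirrors how an RPN compiler can emit stack-machine (Wasm) opcodes.
def eval_rpn(tokens):
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    stack = []
    for tok in tokens:
        if tok in ops:
            b = stack.pop()
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))
    return stack.pop()

print(eval_rpn("3 4 + 2 *".split()))  # prints 14.0
```

Compiling instead of evaluating is the same walk: where the evaluator applies an operator, a compiler would emit the corresponding stack instruction.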

Why Futhark?

A high-performance and high-level purely functional data-parallel array programming language that can execute on the GPU and CPU.

The Double-E Infix Expression Parsing Method

Topic in Programming Models

The categorical abstract machine

Cartesian closed categories have been shown by several authors to provide the right framework for the model theory of the λ-calculus.

How LLVM Optimizes a Function

In some compilers the IR format remains fixed throughout the optimization pipeline, in others the format or semantics change.

Tell the Compiler What You Know

Compilers often work magic, uncovering hidden properties of your program in order to optimize it aggressively.

Compiler Optimization in a Language you Can Understand

In this article, I'll explain compiler optimizations through a series of examples, focusing on what compilers do.

How Target-Independent is Your IR?

An esoteric exploration of the target independence of compiler IRs.

`zig cc`: a Powerful Drop-In Replacement for GCC/Clang

If you have heard of Zig before, you may know it as a promising new programming language which is ambitiously trying to overthrow C as the de-facto systems language.

Resources for Amateur Compiler Writers

I know entire swaths of the literature are left out, but this is a page for amateur compiler writers. Anything that I did not find practical is not listed here.

How to Compile Your Language

The guide also covers how to create a platform-specific executable with the help of the LLVM compiler infrastructure, which all of the previously mentioned languages use for the same purpose.

bytecode interpreters for tiny computers

I've previously come to the conclusion that there's little reason for using bytecode in the modern world, except in order to get more compact code, for which it can be very effective.
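A switch-style dispatch loop of the kind the post weighs can be sketched as follows (opcodes and encoding are invented for illustration):

```python
# A tiny stack-based bytecode interpreter with a dispatch loop.
# Compactness is the draw: each operation is one small integer.
PUSH, ADD, MUL, HALT = range(4)

def run(code):
    stack, pc = [], 0
    while True:
        op = code[pc]; pc += 1
        if op == PUSH:                    # PUSH carries an inline operand
            stack.append(code[pc]); pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop(); stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop(); stack.append(a * b)
        elif op == HALT:
            return stack.pop()

print(run([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]))  # prints 20
```

Nine bytecode cells encode `(2 + 3) * 4`; the density relative to native code is the trade-off the post examines.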

Google’s Fully Homomorphic Encryption Compiler — A Primer

Back in May of 2022 I transferred teams at Google to work on Fully Homomorphic Encryption (newsletter announcement). Since then I’ve been working on a variety of projects in the space, includ…

Will I be able to access proprietary platform APIs (e.g. Android / iOS)?

The kind of binary format being considered for WebAssembly can be natively decoded much faster than JavaScript can be parsed (experiments show more than 20× faster).

The future of Clang-based tooling

By Peter Goodman Clang is a marvelous compiler; it’s a compiler’s compiler! But it isn’t a toolsmith’s compiler. As a toolsmith, my ideal compiler would be an open book, allowing me to get to…

QBE vs LLVM

QBE and LLVM are both compiler backends, but QBE is a smaller, more accessible project aimed at amateur language designers. While LLVM is feature-rich and complex, QBE focuses on simplicity and efficiency, making it easier to use for quick projects. QBE provides straightforward operations and a cleaner intermediate language, reducing the complexity often found in LLVM.

Compiler Backend

The QBE compiler backend is designed to be a compact yet high-performance C embeddable backend that prioritizes correctness, simplicity, and user-friendliness. It compiles on various x64 operating systems and boasts features like IEEE floating point support, SSA-based intermediate language, and quick compilation times. While currently limited to x64 platforms, plans include ARM support and further enhancements. The backend has been successfully utilized in various projects, showcasing its adaptability and effectiveness in compiler development.

Implementing interactive languages

Implementing an interactive language requires considering both compile-time and run-time performance. Traditional switch-based bytecode interpreters are easy to implement but have lower run-time performance compared to optimizing compilers. A sweet spot in performance can be found by aiming for combined compile-time and run-time performance within a certain range. Various options for implementing fast interpreters, existing compilers like LLVM and Cranelift, custom compilers, and using WebAssembly as a backend are discussed. The idea of having two backends for a language to support quick startup and aggressive optimization is also explored. There are still many unknowns and further research is needed to determine the feasibility and performance of different approaches.

How a Zig IDE Could Work (Feb 10, 2023)

The author discusses how to build an Integrated Development Environment (IDE) for the Zig programming language, which has unique features like a simple syntax but also complex compile-time evaluation. The IDE needs to handle incomplete code and provide immediate feedback while managing rapid code changes. The post explores various strategies for efficiently processing code, such as using abstract interpretation and optimizing compilation to focus only on necessary parts of the codebase.

Too Fast, Too Megamorphic: what influences method call performance in Java?

The performance of method calls in Java can be improved through techniques like inlining and using inline caches. Monomorphic calls, where only one method can be invoked, are the fastest, while bimorphic and megamorphic calls are slower due to increased lookup costs. The study highlights that simply adding the "final" keyword or overriding methods does not significantly enhance performance.

The Black Magic of (Java) Method Dispatch

The content shows code execution percentages for different operations within a program. It includes instructions for handling different coders, with comparisons and jumps based on coder values. The code includes sections like the main entry point, epilogue, handling other coders, and specific coder cases like Coder1 and Coder2.

Resources for Building Programming Languages

The article shares resources for learning how to create programming languages, focusing on Rust and C. It highlights the book "Crafting Interpreters," which provides practical insights into building interpreters using different programming approaches. The author also discusses their personal experience building a language and the tools they've found helpful, like LLVM and Cranelift.

Running the “Reflections on Trusting Trust” Compiler (October 25, 2023)

The text discusses how to modify a C compiler to insert a backdoor into a program without leaving traces in the source code. It explains that the backdoor can be detected because the compiler's size increases each time it compiles itself. Finally, it highlights the importance of using trusted compilers to prevent hidden backdoors in modern software development.

CompilerTalkFinal

The content discusses various compilers and their features, including Clang, GCC, V8, CakeML, Chez Scheme, and more. It also touches on the history of interpreters and compilers, with examples like ENIAC and the first compiler developed by Grace Hopper. Different approaches to compilation and interpretation are highlighted, showcasing the evolution of compiler technology.

Graydon Hoare: 21 compilers and 3 orders of magnitude in 60 minutes

Graydon Hoare's talk explains different approaches to building compilers, from traditional giants to more efficient variants. He highlights the importance of using compiler-friendly languages and theory-driven meta-languages. The presentation covers key concepts like sophisticated partial evaluation and implementing compilers directly by hand.

Baby Steps to a C Compiler

Writing a simple compiler can help you understand how computers work. Start with a minimal project that compiles a small subset of a language, and then gradually add more features. This approach makes learning about compilers and programming enjoyable and rewarding.

Crafting an Interpreter in Zig - part 1

The author is learning Zig by implementing an interpreter for the Lox programming language, inspired by the book "Crafting Interpreters." They are documenting their journey, focusing on interesting aspects of Zig and how it differs from C. So far, they have enjoyed the process, particularly the simplicity and power of Zig's generic programming.

Aro - a C compiler

Aro is a C compiler created as an alternative to Zig's compiler. It includes the aro module for the compiler and a language-agnostic aro_backend module for translating code into machine code. Aro uses self-hosted backends from the Zig compiler for optimization.

Programming languages resources

This page is a collection of the author's favorite resources for people getting started writing programming languages. The resources cover various aspects such as compilers, runtimes, runtime optimization, pointer tagging, JIT compilers, assembler libraries, and interesting tools. The author also mentions topics they want to write about in the future and papers they want to read. The page is meant to be a helpful reference for those interested in programming language implementation.

compiler_construction

Building a compiler can be straightforward by breaking the development into small steps and using Scheme as the implementation language. The tutorial focuses on translating a subset of Scheme to assembly code, with a step-by-step approach to achieve a fully working compiler. Testing and refining the compiler incrementally leads to a powerful tool capable of compiling an interactive evaluator.
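The tutorial's opening step, compiling a lone integer literal to assembly, is tiny. A sketch in that spirit, with the entry-point name and register conventions assumed for illustration:

```python
# Step 1 of an incremental compiler: a "program" is a single integer,
# compiled to x86-64 assembly that returns it in %eax.
def compile_program(value):
    return "\n".join([
        "    .globl scheme_entry",
        "scheme_entry:",
        f"    movl ${value}, %eax",
        "    ret",
    ])

print(compile_program(42))
```

Each later step of the tutorial grows this emitter slightly (immediates, unary primitives, conditionals, ...) while keeping the whole compiler working end to end.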

Introduction to Compilers and Language Design

A compiler translates high-level code to lower-level code, and building one is a common project in computer science education. This book provides a beginner-friendly guide to building a compiler for a C-like language, suitable for undergraduates with programming experience. The author offers free online access to the textbook and related code resources, with options to purchase a physical copy.

Compiler Optimizations Are Hard Because They Forget

Compiler optimizations involve breaking down complex changes into smaller, more manageable steps to improve code efficiency. However, as more optimizations are added, the potential for errors and missed opportunities increases, making it challenging to maintain optimal performance. Compilers struggle with balancing aggressive optimizations while preserving correct program behavior, highlighting the complexity and difficulties inherent in optimizing compilers.

Writing a C Compiler, Part 1

This text is about creating a C compiler in multiple stages, starting with lexing, parsing, and code generation. The process involves breaking down the source code, building an abstract syntax tree, and generating x86 assembly code. The compiler will handle simple programs with a single main function and a return statement.
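The lexing stage can be sketched with a regular expression (a minimal illustration with an abridged token set, not the book's code):

```python
import re

# Split a minimal C program into tokens: integers, identifiers/keywords,
# and single-character punctuation. Whitespace is skipped by the \s* prefix.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\w+)|([{}();]))")

def lex(src):
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected input: {src[pos:]!r}")
        tokens.append(m.group(m.lastindex))
        pos = m.end()
    return tokens

print(lex("int main() { return 2; }"))
```

The parser then consumes this token list to build the abstract syntax tree, and code generation walks that tree to emit assembly.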

GitHub - DoctorWkt/acwj: A Compiler Writing Journey

This GitHub repository documents the author's journey to create a self-compiling compiler for a subset of the C language. The author shares steps taken and explanations to help others follow along practically. The author credits Nils M Holm's SubC compiler for inspiration and differentiates their code with separate licensing.

A new JIT engine for PHP-8.4/9

A new JIT engine for PHP is being developed, improving performance and simplifying development. The engine will be included in the next major PHP version, potentially PHP 9.0. The new JIT engine generates a single Intermediate Representation (IR), eliminating the need to support assembler code for different CPUs.

Compiling tree transforms to operate on packed representations

The article explains how tree traversals in programming can be optimized by compiling them to work on serialized tree structures without using pointers. This approach can make programs run significantly faster on current x86 architectures. The authors developed a prototype compiler for a functional language that generates efficient code for traversing trees using packed data representations.
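The core trick, walking a preorder-serialized tree with a cursor instead of chasing pointers, can be sketched under invented encoding assumptions (tag 0 = leaf with a value, tag 1 = interior node):

```python
# A binary tree packed preorder into a flat buffer; traversal advances a
# cursor sequentially, which is cache-friendly compared to pointer chasing.
LEAF, NODE = 0, 1

def sum_packed(buf, i=0):
    """Return (sum of leaves, index just past this subtree)."""
    if buf[i] == LEAF:
        return buf[i + 1], i + 2
    left, j = sum_packed(buf, i + 1)   # left subtree starts right after tag
    right, k = sum_packed(buf, j)      # right subtree starts where left ended
    return left + right, k

packed = [NODE, LEAF, 1, NODE, LEAF, 2, LEAF, 3]   # tree (1, (2, 3))
print(sum_packed(packed)[0])  # prints 6
```

Returning the "next index" from each recursive call is what replaces child pointers: the layout itself encodes the tree structure.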

KHM+15

The text discusses a formal C memory model that supports integer-pointer casts, essential for low-level C programming. It proposes a quasi-concrete memory model that allows standard compiler optimizations while fully supporting integer-pointer casts. This model helps verify programs and optimizations that are challenging to validate with integer-pointer casts.

Learning LLVM (Part-1) - Writing a simple LLVM pass

This text introduces learning about LLVM and writing LLVM passes, which are used for transforming or analyzing a program's intermediate representation. LLVM offers a versatile compiler infrastructure with modules like the frontend, middle-end, and backend for optimizing and generating machine-specific code. By understanding LLVM concepts and pass managers, developers can create efficient passes for tasks like performance optimization and code analysis.

Zig-style generics are not well-suited for most languages

Zig-style generics, like those in C++, may not work well for all languages due to limitations in compiler support and type inference. Armchair suggestions about adopting Zig-style generics in other languages may overlook these challenges. The flexibility and metaprogramming capabilities in Zig may not easily translate to other statically-typed languages.

Manually linking Rust binaries to support out-of-tree LLVM passes

LLVM is a compiler infrastructure used by frontends like rustc to generate machine code. To add custom LLVM passes to a Rust binary, extra flags can be used during compilation to produce LLVM-IR and then link the binary properly using LLVM tools. By understanding how Rust's static libraries work and leveraging cargo for dependency management, custom LLVM passes can be integrated into Rust binaries efficiently.

The Rust Reference

The Rust compiler can generate different types of output artifacts, such as runnable executables, Rust libraries, dynamic libraries, and static system libraries. Dependencies between crates can be linked in various formats, such as rlib and dynamic library formats, following specific rules set by the compiler. Understanding how to specify output formats like --crate-type=bin or --crate-type=lib can help control the compilation process for Rust crates, while also considering options for linking C runtimes dynamically or statically based on target features.

Rust Compiler Development Guide

The Rust compiler processes and transforms your code for compilation. It uses different stages like lexing, parsing, and abstract syntax tree lowering. The compiler aims for correctness, performance, and supporting incremental compilation.

How to speed up the Rust compiler one last time

The author at Mozilla is concluding their work on speeding up the Rust compiler after several years of dedicated effort. They wrote multiple blog posts detailing their performance optimizations and shared valuable lessons learned from the process. The author expressed gratitude to those who supported their work and highlighted the importance of ongoing contributions to Rust's development.

How to speed up the Rust compiler in March 2024

In March 2024, updates on the Rust compiler's performance highlighted several key improvements. Changes like using a single codegen unit, marking Debug::fmt methods with #[inline], introducing a cache, and upgrading LLVM versions led to notable reductions in wall-time, binary size, and hash table lookups. Additionally, the availability of the Cranelift codegen backend for x86-64/Linux and ARM/Linux offers an alternative for faster compile times. While the author didn't contribute to speed improvements this time, overall performance from August 2023 to March 2024 showed reductions in wall-time, peak memory usage, and binary size, indicating steady progress in enhancing the Rust compiler's efficiency.

Do We Really Need A Link Step?

The author questions the need for a link step in native-code compilation for faster performance. They propose a "zero-link" approach where compilers directly write object code into the final executable file. This method could improve efficiency by avoiding unnecessary object files and incorporating symbol resolution within the executable itself.

jamiebuilds/the-super-tiny-compiler: :snowman: Possibly the smallest compiler ever

The Super Tiny Compiler is a simplified example of a modern compiler using easy-to-read JavaScript. It helps you understand how compilers work from start to finish. Compilers play a big role in the tools we use daily.

resume.txt

Andrew Kelley is a programmer with 16 years of experience in software development and a passion for open-source projects. He has worked on various music-related software like the Genesis DAW and libgroove, contributing patches to libav and ffmpeg. Additionally, he has experience in low-level systems, custom algorithm creation, and designing user interfaces.

Indices and tables

CompilerGym is a library for reinforcement learning in compiler tasks. It helps ML researchers work on optimization problems and allows system developers to create new tasks for ML research. The goal is to use ML to make compilers faster.

LLM Compiler

The LLM Compiler is a suite of pre-trained models designed for code optimization tasks, based on Code Llama. It has been trained on a large corpus of LLVM-IR and assembly code to enhance compiler behavior understanding. The release of LLM Compiler aims to support further research in compiler optimization for both academia and industry.

Dioxus Labs + “High-level Rust”

An article criticized Rust's gamedev hype, but its popularity stems from meeting modern programming needs like speed and safety. Efforts are underway to enhance Rust's capabilities for various industries and improve compile times significantly. Proposed enhancements include incremental linking, parallel frontend, and macro expansion caching to make Rust more efficient for developers.

Zig Parser

The Zig Parser is a crucial part of the Zig compiler internals, responsible for constructing an abstract syntax tree from a stream of tokens. The parser uses a struct called Parser to manage the internal state of the parse operation, accumulating errors and building up AST nodes. Understanding the structure of an AST node and the data pattern is essential for comprehending how the parser works and the subsequent stages of the compiler. The AST node data is stored in various locations such as the token stream, the node list, and the extra data list, with specific structures and indexes used to access information about AST nodes like function declarations and prototypes.

What Is The Minimal Set Of Optimizations Needed For Zero-Cost Abstraction?

Rust and C++ offer "zero-cost abstractions" where high-level code compiles to low-level code without added runtime overhead, but enabling necessary compiler optimizations can slow down compilation and impact debugging. The challenge is to find the minimal set of optimizations that maintain zero-cost abstractions while improving build speed and debug information quality. Balancing fast debuggable builds with zero-cost abstractions is crucial for performance and developer experience in languages like Rust and C++.

Why is Python slow

Python's performance issues stem from spending most time in the C runtime, rather than the Python code itself. Pyston focuses on speeding up the C code to improve performance. Suggestions to improve Python's speed by using other JIT techniques overlook the fundamental issue of optimizing C code.

What Every C Programmer Should Know About Undefined Behavior #1/3

This blog post explains that many seemingly reasonable things in C actually have undefined behavior, leading to common bugs in programs. Undefined behavior in C allows for optimizations that improve performance but can result in unexpected outcomes like formatting your hard drive. Understanding undefined behavior is crucial for C programmers to prevent potential issues and improve code efficiency.

zackoverflow

Zack, the author, enjoys building things and delving into the inner workings of systems and computers for dopamine. He works on the Bun JavaScript runtime and creates music when not coding. Zack invites anyone to chat through his open calendar link.

Principles of compiler design

This text is about a book on compiler design principles, authored by Alfred V. Aho and Jeffrey D. Ullman and running 604 pages. It includes bibliographical references, but the EPUB and PDF versions are not available.

How should I read type system notation?

A type system in programming languages follows rules for expressions and types. Typing rules are written as relationships between expressions and their types for checking and inferring types. Contexts are used to keep track of variable types in type judgments.
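It can help to read a typing rule as code: the judgment Γ ⊢ e : T becomes a function from a context and an expression to a type. A minimal sketch with an invented expression encoding (tuples for compound terms):

```python
# Each branch implements one typing rule; the dict `ctx` plays Γ.
def type_of(ctx, expr):
    if isinstance(expr, bool):              # ⊢ true/false : Bool
        return "Bool"
    if isinstance(expr, int):               # ⊢ n : Int
        return "Int"
    if isinstance(expr, str):               # x : T ∈ Γ  ⇒  Γ ⊢ x : T
        return ctx[expr]
    if expr[0] == "if":                     # condition is Bool, arms agree
        _, cond, then, els = expr
        assert type_of(ctx, cond) == "Bool"
        t1, t2 = type_of(ctx, then), type_of(ctx, els)
        assert t1 == t2
        return t1
    raise TypeError(f"unknown form: {expr!r}")

print(type_of({"x": "Int"}, ("if", True, "x", 1)))  # prints Int
```

The premises above a rule's inference line become recursive calls and assertions; the conclusion below the line becomes the return value.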

A decade of developing a programming language

The author spent a decade developing the programming language Inko, transitioning from gradual to static typing and using Rust for the compiler. Recommendations include avoiding gradual typing, self-hosting compilers, and focusing on functionality over performance when building a new language. Building a language for long-term use is a time-consuming process that requires prioritizing user needs over technical complexities.

essentials-of-compilation

The text discusses the implementation of compilers for different programming languages, covering topics such as syntax definitions, interpreter extensions, and x86 assembly translation. It emphasizes simplifying the compiler process for readers by using a straightforward language and providing step-by-step guidance on compiler development. Additionally, it introduces new language features like Booleans, conditionals, and tuples, expanding the capabilities of the compilers being built.

PRACTICAL COMPILER CONSTRUCTION

"Practical Compiler Construction" is a textbook on writing compilers with annotated source code. The second edition is now available in print with improvements and bug fixes. The book covers compiler construction concepts and advanced techniques for optimizing code.

MLIR: A Compiler Infrastructure for the End of Moore's Law

MLIR is a versatile compiler infrastructure designed to address software fragmentation and improve compilation for different hardware. It aims to reduce the cost of building domain-specific compilers and facilitate the connection of existing compilers. MLIR offers a standardized approach to code generation and optimization across various application domains and hardware targets.

MLIR — Getting Started

A getting-started guide to MLIR by Jeremy Kun on Math ∩ Programming (www.jeremykun.com).

A Gentle Introduction to LLVM IR

Learning LLVM IR can be beneficial for generalist working programmers to understand what their compiler is doing to create highly optimized code. LLVM IR is well-documented and can be treated as a slightly weird programming language. It is strongly typed and requires explicit type annotations. LLVM IR is a static single assignment form (SSA) IR and has properties that make optimizations simpler to write. It supports control flow operations, arithmetic instructions for different types, and memory operations. There are also LLVM intrinsics available for specific functions. However, some parts of LLVM's semantics, such as undefined behavior and pointer provenance, can be challenging to navigate.
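SSA's defining property, each value assigned exactly once, can be illustrated by flattening an expression into numbered temporaries, loosely echoing LLVM IR's %0, %1, ... virtual registers (the output is LLVM-flavoured pseudo-IR for illustration, not guaranteed-valid LLVM IR):

```python
# Flatten a nested expression into single-assignment instructions:
# every temporary gets exactly one definition, as in SSA form.
def flatten(expr, instrs):
    if isinstance(expr, int):
        return str(expr)                     # constants appear inline
    op, a, b = expr
    lhs = flatten(a, instrs)
    rhs = flatten(b, instrs)
    name = f"%{len(instrs)}"                 # fresh temporary, defined once
    instrs.append(f"{name} = {op} i32 {lhs}, {rhs}")
    return name

instrs = []
flatten(("add", ("mul", 2, 3), 4), instrs)
print("\n".join(instrs))
# %0 = mul i32 2, 3
# %1 = add i32 %0, 4
```

Because each name has a single definition, optimizations like constant propagation can reason about a value by looking at one instruction, which is a large part of why SSA makes transforms simpler to write.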

Ever wanted to make your own programming language or wondered how they are designed and built?

Crafting Interpreters is a book that provides everything you need to create your own programming language. It covers both high-level concepts like parsing and semantics, as well as technical details such as bytecode representation and garbage collection. The book guides you through building a language from scratch, including features like dynamic typing, lexical scope, functions, classes, and inheritance. It is available in multiple formats, including print, ebook, and online for free. The author, Robert Nystrom, is an experienced language developer who currently works at Google on the Dart language.
