Recent Bookmarks

Matrices and graphs

The single most undervalued fact of linear algebra: matrices are graphs, and graphs are matrices
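
A minimal illustration of the claim (my own toy example, not taken from the post): an adjacency matrix encodes a directed graph, and matrix powers count walks in it.

```python
import numpy as np

# Adjacency matrix of a small directed graph on nodes {0, 1, 2}:
# edges 0->1, 1->2, 2->0 (a directed 3-cycle).
A = np.array([
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
])

# Reading the matrix *as* a graph: A[i, j] == 1 iff there is an edge i -> j.
edges = [(i, j) for i in range(3) for j in range(3) if A[i, j]]
print(edges)  # [(0, 1), (1, 2), (2, 0)]

# Matrix multiplication is walk-counting: (A @ A)[i, j] is the number of
# length-2 walks from i to j.
print(A @ A)
print(np.linalg.matrix_power(A, 3))  # identity: every node returns home in 3 steps
```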

Domain specific architectures for AI inference

fleetwood.dev

DeepSeek-V3 Explained 1: Multi-head Latent Attention

The key architectural innovation behind DeepSeek-V2 and DeepSeek-V3 for faster inference
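
A rough sketch of the idea as I understand it (simplified; the real design also uses a compressed query path and decoupled rotary keys): keys and values are reconstructed from a small shared latent, so only the latent has to be cached.

```python
import numpy as np

d_model, n_heads, d_head, d_latent, seq = 512, 8, 64, 64, 16
rng = np.random.default_rng(0)

# Down-projection to a small KV latent, and per-head up-projections.
W_down = rng.standard_normal((d_model, d_latent)) * 0.02    # hidden -> latent
W_up_k = rng.standard_normal((d_latent, n_heads * d_head)) * 0.02
W_up_v = rng.standard_normal((d_latent, n_heads * d_head)) * 0.02

h = rng.standard_normal((seq, d_model))  # token hidden states

# Only this (seq, d_latent) tensor is cached instead of full K and V:
c_kv = h @ W_down

# K and V are reconstructed from the latent on the fly at attention time.
K = (c_kv @ W_up_k).reshape(seq, n_heads, d_head)
V = (c_kv @ W_up_v).reshape(seq, n_heads, d_head)

full_cache = 2 * seq * n_heads * d_head  # standard per-layer KV cache entries
mla_cache = seq * d_latent               # latent cache entries
print(full_cache, mla_cache)             # 16384 vs 1024 in this toy setting
```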

Optimizing Transformer-Based Diffusion Models for Video Generation with NVIDIA TensorRT

State-of-the-art image diffusion models take tens of seconds to process a single image. Video diffusion is even more challenging, demanding significant computational resources and incurring high costs.

You could have designed state of the art positional encoding

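If I remember the post's arc correctly (treat this as an assumption), it builds up step by step to rotary position embeddings; a minimal numpy sketch of that rotation trick and the relative-position property it buys:

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Rotate pairs of channels of x by position-dependent angles (minimal RoPE sketch)."""
    d = x.shape[-1]
    half = d // 2
    # One frequency per channel pair, geometrically spaced as in the original formulation.
    freqs = base ** (-np.arange(half) / half)
    angles = positions[:, None] * freqs[None, :]  # (seq, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

seq, d = 8, 16
q = np.random.default_rng(0).standard_normal((seq, d))
pos = np.arange(seq, dtype=np.float64)

# The property that matters: dot products of rotated vectors depend only on
# relative position, so shifting all positions by the same offset changes nothing.
q_rot = rope(q, pos)
q_shifted = rope(q, pos + 5)
print(np.allclose(q_rot[2] @ q_rot[6], q_shifted[2] @ q_shifted[6]))  # True
```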

attention is logarithmic, actually

time complexity is a very bad model when working with parallelism. in which i make the case for work-depth analysis instead of time complexity.
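
My compressed reading of the argument (a paraphrase under the usual idealized-parallelism assumptions, not text from the post): measure work $W$ (total operations) and depth $D$ (longest dependency chain) separately, and attention's depth is logarithmic even though its work is quadratic.

```latex
% Work-depth accounting for one attention layer over n tokens of width d,
% assuming unbounded parallelism and tree reductions (idealized model).
\[
\begin{aligned}
W_{\text{attn}}(n, d) &= \Theta(n^2 d) && \text{(all pairwise scores and weighted sums)} \\
D_{QK^\top}(n, d) &= \Theta(\log d) && \text{(each score is a length-$d$ tree reduction)} \\
D_{\text{softmax}}(n) &= \Theta(\log n) && \text{(row max and row sum are length-$n$ reductions)} \\
D_{\text{attn}}(n, d) &= \Theta(\log n + \log d) && \text{(depths of sequential stages add)}
\end{aligned}
\]
```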

AI Arrives In The Middle East: US Strikes A Deal with UAE and KSA – SemiAnalysis

The US has signed two landmark agreements with the United Arab Emirates and the Kingdom of Saudi Arabia (KSA) that will noticeably shift the balance of power. The deals have economic, geopolitical…

Transformers Represent Belief State Geometry in their Residual Stream

Produced while an affiliate at PIBBSS[1]. The work was done initially with funding from a Lightspeed Grant, and then continued while at PIBBSS…

Llama from scratch (or how to implement a paper without crying)

I want to share some tips from my experience implementing a paper, covering what I've learned so far from implementing a dramatically scaled-down version…

The Curse of Knowing How, or; Fixing Everything

A reflection on control, burnout, and the strange weight of technical fluency.

The MAP-Elites Algorithm: Finding Optimality Through Diversity

MAP-Elites is a quality-diversity search method, often applied to reinforcement learning problems, that avoids local optima of a search space by storing multiple candidate solutions…
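
A toy sketch of the core loop (my own minimal version with a made-up 1-D behavior descriptor, not code from the linked article): keep one elite per behavior niche, and generate new candidates by mutating randomly chosen elites.

```python
import random

# Toy MAP-Elites: maximize fitness while covering a 1-D behavior space.
# Genome: a list of floats in [0, 1]. Fitness: negative distance to a target.
# Behavior descriptor: mean of the genome, binned into a fixed grid of niches.
N_BINS, DIM, ITERS = 20, 5, 5000
archive = {}  # bin index -> (fitness, genome): one elite per niche

def fitness(g):
    return -sum((x - 0.75) ** 2 for x in g)

def descriptor_bin(g):
    mean = sum(g) / len(g)                 # behavior descriptor in [0, 1]
    return min(N_BINS - 1, int(mean * N_BINS))

def mutate(g):
    return [min(1.0, max(0.0, x + random.gauss(0, 0.1))) for x in g]

for _ in range(ITERS):
    if archive and random.random() < 0.9:
        parent = random.choice(list(archive.values()))[1]  # mutate a stored elite
        child = mutate(parent)
    else:
        child = [random.random() for _ in range(DIM)]      # occasional random restart
    b, f = descriptor_bin(child), fitness(child)
    if b not in archive or f > archive[b][0]:              # keep only the best per niche
        archive[b] = (f, child)

print(f"{len(archive)}/{N_BINS} niches filled; "
      f"best fitness {max(v[0] for v in archive.values()):.4f}")
```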

How To Scale

While there are already excellent posts on scaling, I wanted to share my own understanding and the things I've learned over the past few months, and hopefully spark some discussion. I hope this post can shed some light for anyone navigating the challenges of scaling up neural networks. There may be mistakes or inaccuracies, so if you want to correct me or would like to discuss further, please feel free to DM me on X or leave a comment.

Deep Dive into Yann LeCun’s JEPA

ML blog.

Are Transformers universal approximators of sequence-to-sequence functions?

Despite the widespread adoption of Transformer models for NLP tasks, the expressive power of these models is not well-understood. In this paper, we establish that Transformer models are universal approximators of continuous permutation equivariant sequence-to-sequence functions with compact support, which is quite surprising given the amount of shared parameters in these models. Furthermore, using positional encodings, we circumvent the restriction of permutation equivariance, and show that Transformer models can universally approximate arbitrary continuous sequence-to-sequence functions on a compact domain. Interestingly, our proof techniques clearly highlight the different roles of the self-attention and the feed-forward layers in Transformers. In particular, we prove that fixed width self-attention layers can compute contextual mappings of the input sequences, playing a key role in the universal approximation property of Transformers. Based on this insight from our analysis, we consider other simpler alternatives to self-attention layers and empirically evaluate them.

A Hugging Face Space by nanotron

The ultimate guide to training LLMs on large GPU clusters
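
As a bare-bones illustration of the simplest technique such a guide starts from (data parallelism with gradient averaging; a toy single-process simulation I wrote, not code from the Space):

```python
import numpy as np

# Single-process simulation of data parallelism: each "worker" holds a data
# shard, computes a local gradient, and gradients are averaged -- the step an
# all-reduce performs across GPUs on a real cluster.
rng = np.random.default_rng(0)
n_workers, n_per_worker, dim = 4, 64, 8
w_true = rng.standard_normal(dim)
shards = []
for _ in range(n_workers):
    X = rng.standard_normal((n_per_worker, dim))
    y = X @ w_true + 0.01 * rng.standard_normal(n_per_worker)
    shards.append((X, y))

w = np.zeros(dim)
for step in range(200):
    local_grads = [2 * X.T @ (X @ w - y) / len(y) for X, y in shards]  # per-worker grads
    grad = np.mean(local_grads, axis=0)                                # "all-reduce" average
    w -= 0.05 * grad                                                   # identical update on every worker
print(np.linalg.norm(w - w_true))  # small: all workers stayed in sync
```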