Bookmarks

tt-metal/METALIUM_GUIDE.md at main · tenstorrent/tt-metal · GitHub

TT-NN operator library and TT-Metalium low-level kernel programming model.

Putting the “You” in CPU

Curious exactly what happens when you run a program on your computer? Learn how multiprocessing works, what system calls really are, how computers manage memory with hardware interrupts, and how Linux loads executables.

Efficient n-states on x86 systems

The text discusses how to handle control flow efficiently on x86 systems when a flag can take more states than just true and false. It explains how to use condition codes, such as the zero and parity flags, to minimize the number of instructions needed to test such a flag. It also touches on the challenges and limitations of using inline assembly for this kind of optimization in C.

Optimizing subroutines in assembly language

Optimizing subroutines in assembly language involves techniques such as using inline assembly in a C++ compiler, separating code that uses MMX registers from code that uses ST registers, and understanding the different register sizes and memory operands. The text also covers instruction prefixes, intrinsic functions for vector operations, and efficient access to class and structure members. Preventing false dependences, aligning loop and subroutine entries, and optimizing instruction sizes can further improve performance. These optimizations are processor-specific, so their benefit varies with the target platform.

Brian Robert Callahan

This blog post starts a series on creating programs that demystify how programs work. The first program is a disassembler that reads bytecode and converts it into assembly language, while a future post will cover creating an assembler. The disassembler uses a table of mnemonics and instruction sizes to print out the corresponding assembly instructions from bytecode.
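The table-driven approach described above can be sketched in a few lines of Python. This is only an illustration: the opcode table below is a small set of entries chosen for this example, not the table from the blog post, and the function name `disassemble` and the output format are assumptions as well.

```python
# Toy opcode table: opcode byte -> (mnemonic, total instruction size in bytes).
# These entries are assumptions chosen for illustration, not the post's table.
OPCODES = {
    0x00: ("nop", 1),
    0x3E: ("mvi a,", 2),   # one-byte immediate operand
    0xC3: ("jmp", 3),      # two-byte little-endian address operand
    0x76: ("hlt", 1),
}


def disassemble(code: bytes) -> list[str]:
    """Return one text line per decoded instruction, prefixed by its offset."""
    lines = []
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        entry = OPCODES.get(opcode)
        if entry is None or pc + entry[1] > len(code):
            # Unknown opcode or truncated instruction: emit it as a raw data byte.
            text, size = f".db {opcode:02x}h", 1
        else:
            mnemonic, size = entry
            if size == 1:
                text = mnemonic
            else:
                # Operand bytes follow the opcode and are read as little-endian.
                value = int.from_bytes(code[pc + 1 : pc + size], "little")
                text = f"{mnemonic} {value:0{2 * (size - 1)}x}h"
        lines.append(f"{pc:04x}  {text}")
        pc += size  # the table's size field tells us where the next opcode starts
    return lines


if __name__ == "__main__":
    for line in disassemble(bytes([0x00, 0x3E, 0x2A, 0xC3, 0x34, 0x12, 0x76])):
        print(line)
```

The size field is what keeps the decoder in step: after printing an instruction, the loop skips over its operand bytes so the next byte it reads is an opcode again.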

Infographics: Operation Costs in CPU Clock Cycles

The text discusses operation costs in CPU clock cycles for simple operations, floating-point operations, and vector operations. Some operations take as little as 1 CPU cycle, while operations that involve memory can cost significantly more. Costs vary across CPU architectures and operation types, and some operations need specialized CPU support to run efficiently.

Should you learn C to "learn how the computer works"?

The author discusses whether learning C is necessary to understand how computers work, ultimately concluding that C is not a direct representation of computer operations. Learning C can still be beneficial for understanding computing concepts and history, but it operates within a virtual machine and abstracts certain hardware details. By learning C, you can gain insight into the relationship between programming languages, hardware, and the historical development of computing.

An Introduction to Assembly Programming with RISC-V

This entry points to a resource on RISC-V assembly programming. Its ISBN is 978-65-00-15811-3, and it is published at riscv-programming.org.

Subcategories