Bookmarks

A WebAssembly compiler that fits in a tweet

Starting with a 192-byte one-liner that implements a Reverse Polish Notation arithmetic compiler, we'll work backward to transform it into readable JavaScript by removing one code golf trick at a time.
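As a rough illustration of why Reverse Polish Notation suits a stack machine such as WebAssembly, the sketch below maps each RPN token to one text-format instruction. It is not the article's one-liner: the function name `compile_rpn`, the choice of `f32` as the numeric type, and the error handling are all assumptions made for this example.

```python
# Hypothetical sketch: translate RPN arithmetic into WebAssembly
# text-format instructions. Using f32 is an assumption of this example.
OPS = {"+": "f32.add", "-": "f32.sub", "*": "f32.mul", "/": "f32.div"}


def compile_rpn(source: str) -> list[str]:
    """Return one stack instruction per whitespace-separated RPN token."""
    instructions = []
    depth = 0  # number of values the emitted code would leave on the stack
    for token in source.split():
        if token in OPS:
            if depth < 2:
                raise ValueError(f"operator {token!r} needs two operands")
            instructions.append(OPS[token])
            depth -= 1  # two operands popped, one result pushed
        else:
            float(token)  # raises ValueError if the token is not a number
            instructions.append(f"f32.const {token}")
            depth += 1
    if depth != 1:
        raise ValueError("expression must leave exactly one value on the stack")
    return instructions


if __name__ == "__main__":
    for line in compile_rpn("3 4 + 2 *"):
        print(line)
```

Because RPN already lists operands before their operator, no parser or precedence table is needed; the token order is the instruction order.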

Unveiling_DeepSeek.pdf

successful modifications since its inception, let alone large-scale validation.

Gemini: A Family of Highly Capable Multimodal Models

This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.

An Invitation to Applied Category Theory

Abstract page for arXiv paper 1803.05316: Seven Sketches in Compositionality: An Invitation to Applied Category Theory

The Basics

Here’s what I consider to be the basics.

Google’s Fully Homomorphic Encryption Compiler — A Primer

Back in May of 2022 I transferred teams at Google to work on Fully Homomorphic Encryption (newsletter announcement). Since then I’ve been working on a variety of projects in the space, includ…

Baby Steps to a C Compiler

Writing a simple compiler can help you understand how computers work. Start with a minimal project that compiles a small subset of a language, and then gradually add more features. This approach makes learning about compilers and programming enjoyable and rewarding.

Nanosystems

"Nanosystems" by K. Eric Drexler is considered a groundbreaking book in the field of molecular nanotechnology. It explains how to create manufacturing systems at the molecular level and discusses the significant impact nanotechnology will have on various industries. Experts praise the book for providing a foundation for future research in molecular systems engineering and molecular manufacturing.

GitHub - sirupsen/napkin-math: Techniques and numbers for estimating system's performance from first-principles

The project "Napkin Math" aims to provide resources and techniques to estimate system performance quickly and accurately. It includes examples like estimating memory reading speed and storage costs for applications. The best way to learn this skill is through practical application, with the option to subscribe for regular practice problems. Detailed numbers and cost estimates are provided, along with compression ratios and techniques to simplify calculations. The project encourages user participation to enhance and refine the provided data and tools for napkin math calculations.
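A napkin-math estimate of the kind described above can be as small as one division. The sketch below is illustrative only: the throughput and price figures are placeholder assumptions chosen for round numbers, not values taken from the repository, and the helper names are hypothetical.

```python
# Hypothetical napkin-math helpers. All constants below are placeholder
# assumptions for illustration, not measured or published figures.
GIB = 2**30

ASSUMED_MEMORY_THROUGHPUT = 10 * GIB   # bytes per second, sequential read
ASSUMED_PRICE_PER_GB_MONTH = 0.02      # currency units per GB per month


def seconds_to_read(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Time to stream size_bytes at a constant throughput."""
    return size_bytes / throughput_bytes_per_s


def monthly_storage_cost(size_gb: float, price_per_gb_month: float) -> float:
    """Cost of keeping size_gb stored for one month."""
    return size_gb * price_per_gb_month


if __name__ == "__main__":
    t = seconds_to_read(1 * GIB, ASSUMED_MEMORY_THROUGHPUT)
    c = monthly_storage_cost(500, ASSUMED_PRICE_PER_GB_MONTH)
    print(f"read 1 GiB: ~{t:.1f} s")
    print(f"store 500 GB: ~{c:.0f} per month")
```

The point of the exercise is the order of magnitude; swapping in real measurements changes the constants, not the method.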

Speech-to-text models

Speech-to-text AI enhances communication and accessibility by transcribing spoken words into text accurately and efficiently. Machine learning and AI advancements have significantly improved the accuracy and adaptability of speech-to-text systems. These technologies open up new possibilities for inclusive and effective communication across various industries.

Latent Interfaces

In a career shift, the author is launching Latent Interfaces to apply expertise in design, prototyping, and development to complex data challenges. They share insights into a genomic data project, emphasizing the importance of Python skills alongside JavaScript. The document showcases the creation of intuitive data interfaces and the design process involving both digital and physical tools. Additionally, the author discusses the significance of well-designed APIs like StabilityAI and the potential for future collaborations in data visualization projects.

gemini_v1_5_report

Gemini 1.5 Pro is a highly compute-efficient multimodal model that can recall and reason over millions of tokens of context, including long documents, videos, and audio. It achieves near-perfect recall on long-context retrieval tasks and outperforms the state of the art in long-document QA, long-video QA, and long-context ASR. Gemini 1.5 Pro also showcases surprising new capabilities, such as learning to translate a new language from a grammar manual. The model surpasses the previous Gemini 1.0 Pro and performs at a similar level to 1.0 Ultra on a wide range of benchmarks while requiring less compute to train.

GitHub - sst/demo-ai-app: Sample AI movies app built with ❍ Ion

This document provides an overview of the sst/demo-ai-app, a sample movies app built with Ion that demonstrates how to use AI in your apps using your own data. The app includes features such as tagging, related movies, and deep search using natural language. It utilizes the Vector component, which is based on Amazon Bedrock and allows for easy AI integration with your data. The document also highlights the advantages of Ion, including faster deployment and no stack limits. The app works by ingesting movie data from IMDB, generating embeddings, and storing them in a Vector database, which the Next.js app then retrieves.
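The ingest, embed, store, and retrieve flow described above can be sketched with a toy in-memory store. This is not the Vector component or Amazon Bedrock: `ToyVectorStore` is a hypothetical class, the three-dimensional embeddings are hand-written placeholders rather than model output, and cosine similarity is an assumed ranking function.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical stand-in for a vector database; keeps everything in a list."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def ingest(self, title: str, embedding: list[float]) -> None:
        self._items.append((title, embedding))

    def search(self, query_embedding: list[float], k: int = 2) -> list[str]:
        # Rank every stored item by similarity to the query, best first.
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query_embedding, item[1]),
            reverse=True,
        )
        return [title for title, _ in ranked[:k]]


if __name__ == "__main__":
    store = ToyVectorStore()
    # Placeholder titles and made-up embeddings, for illustration only.
    store.ingest("space thriller", [0.9, 0.1, 0.0])
    store.ingest("romantic comedy", [0.0, 0.2, 0.9])
    store.ingest("sci-fi noir", [0.8, 0.3, 0.1])
    print(store.search([1.0, 0.0, 0.0], k=2))
```

In a real deployment the embeddings would come from a model and the search from an indexed database; the linear scan here only shows the shape of the data flow.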

From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models

LLMs (Large Language Models) have been enhanced with innovative prompting strategies and external tools, expanding their capabilities. However, integrating LLMs into conversational agents presents a challenge. This paper introduces RAISE, an enhanced version of the ReAct framework, which utilizes scratchpad and retrieved examples to augment the agent's capabilities. RAISE demonstrates superiority as a conversational agent in experiments conducted on a real estate dataset. The working memory of RAISE consists of conversation history, scratchpad, examples, and task trajectory. The paper also discusses the evaluation of agent performance and the core aspects of planning and Chain-of-Thought reasoning.
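The four working-memory components named above can be pictured as a simple record. The sketch below is an assumption about how such a structure might look, not the paper's implementation: the class name `WorkingMemory`, the `render` method, and the prompt layout are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingMemory:
    """Hypothetical container for the four components listed in the summary."""

    conversation_history: list[str] = field(default_factory=list)
    scratchpad: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)
    task_trajectory: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the non-empty components into one prompt string (assumed layout)."""
        sections = [
            ("Conversation history", self.conversation_history),
            ("Scratchpad", self.scratchpad),
            ("Examples", self.examples),
            ("Task trajectory", self.task_trajectory),
        ]
        return "\n\n".join(
            f"{name}:\n" + "\n".join(lines) for name, lines in sections if lines
        )


if __name__ == "__main__":
    memory = WorkingMemory()
    memory.conversation_history.append("user: I am looking for a two-bedroom flat.")
    memory.scratchpad.append("bedrooms: 2")
    print(memory.render())
```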

fastai/fastbook: The fastai book, published as Jupyter Notebooks

The fastai book, published as Jupyter Notebooks, provides an introduction to deep learning, fastai, and PyTorch. It is copyright Jeremy Howard and Sylvain Gugger, and a selection of chapters is available to read online. The notebooks in the repository are used for a MOOC and form the basis of the book, which is available for purchase. The code in the notebooks is covered by the GPL v3 license, while the other content is not licensed for redistribution or change. It is recommended to use Google Colab to access and work with the notebooks. If there are any contributions or citations, copyright is assigned to Jeremy Howard and Sylvain Gugger.

Subcategories