
How Do Transformers Count in Context?

May 30, 2024

We introduce the Contextual Counting task, a new toy problem for exploring the interpretability of Transformer models in quantitative domains. We compare the performance of causal and non-causal models with different positional encodings and find that causal models with RoPE and NoPE significantly outperform the other configurations. We provide a detailed explanation of how the learned circuits function and what makes them succeed or fail when generalizing to out-of-distribution samples.
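To make the task concrete, here is a minimal sketch of what a contextual-counting-style sample could look like. The generator below, including the name make_contextual_counting_sample, the bracket delimiter tokens, and the region lengths, is a hypothetical illustration rather than the exact setup used in the post.

```python
import random

def make_contextual_counting_sample(n_regions=4, max_region_len=10, seed=None):
    """Generate one hypothetical contextual-counting-style example:
    a sequence of 0/1 tokens split into delimited regions, where the
    target for each region is the number of 1s it contains."""
    rng = random.Random(seed)
    seq, targets = [], []
    for _ in range(n_regions):
        region = [rng.randint(0, 1) for _ in range(rng.randint(1, max_region_len))]
        seq += ["["] + [str(t) for t in region] + ["]"]  # delimiters mark region boundaries
        targets.append(sum(region))                      # count of 1s inside the region
    return seq, targets

# Example: four delimited regions and their per-region counts of 1s.
sequence, counts = make_contextual_counting_sample(seed=0)
print(sequence)
print(counts)
```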

xVal: A continuous number encoding for LLMs

Oct 09, 2023

We introduce xVal, a new number-encoding scheme for LLMs. Combined with a modified number-inference method, xVal turns LLMs into continuous function approximators, giving them a better inductive bias for data analysis in scientific domains.
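As an illustration of the core idea of a continuous number encoding, the sketch below shows one way a shared number token whose embedding is scaled by the numeric value could be implemented. The function xval_embed, its signature, and the toy vocabulary are assumptions for illustration, not the actual xVal implementation.

```python
import torch
import torch.nn as nn

def xval_embed(token_ids, values, embedding):
    """Scale each token embedding by an associated numeric value.

    token_ids: (seq,) LongTensor; numbers are replaced by a single [NUM] token id.
    values:    (seq,) FloatTensor; the (normalized) numeric value at [NUM]
               positions and 1.0 everywhere else.
    """
    emb = embedding(token_ids)           # (seq, dim)
    return emb * values.unsqueeze(-1)    # numbers modulate the shared [NUM] embedding

# Toy usage: small vocabulary with a dedicated [NUM] token at id 3.
embedding = nn.Embedding(num_embeddings=10, embedding_dim=8)
token_ids = torch.tensor([1, 3, 2, 3])           # e.g. "x [NUM] y [NUM]"
values = torch.tensor([1.0, 0.25, 1.0, -1.7])    # numeric payloads for the [NUM] slots
print(xval_embed(token_ids, values, embedding).shape)  # torch.Size([4, 8])
```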