Overtone introduces compute-flexible tokenization for transformer-based PDE surrogates, enabling a single model to trade speed for accuracy at inference time while also reducing long-rollout patch artifacts through cyclic patch modulation.
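Below is a minimal sketch of what compute-flexible tokenization can look like: one set of patch-embedding weights resized to the patch size requested at inference, so fewer, larger patches mean fewer tokens (faster) and smaller patches mean more tokens (more accurate). The class name, the interpolation-based weight resizing, and all dimensions are illustrative assumptions, not the Overtone implementation, and cyclic patch modulation is not shown.

```python
# Illustrative sketch only: flexible patch-size tokenization of a 2D field.
import torch
import torch.nn.functional as F

class FlexiblePatchEmbed(torch.nn.Module):
    def __init__(self, in_channels=1, embed_dim=256, base_patch=16):
        super().__init__()
        # One set of projection weights defined at a base patch size.
        self.weight = torch.nn.Parameter(
            torch.randn(embed_dim, in_channels, base_patch, base_patch) * 0.02)
        self.bias = torch.nn.Parameter(torch.zeros(embed_dim))

    def forward(self, x, patch_size):
        # Resize the projection kernel to the requested patch size, then
        # patchify with a strided convolution: smaller patches -> more tokens.
        w = F.interpolate(self.weight, size=(patch_size, patch_size),
                          mode="bilinear", align_corners=False)
        tokens = F.conv2d(x, w, self.bias, stride=patch_size)  # (B, D, H/p, W/p)
        return tokens.flatten(2).transpose(1, 2)                # (B, N, D)

field = torch.randn(1, 1, 128, 128)     # one PDE state snapshot
embed = FlexiblePatchEmbed()
fast = embed(field, patch_size=32)       # 16 tokens
accurate = embed(field, patch_size=8)    # 256 tokens
```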
A universal tokenizer for spectra that directly ingests native wavelength grids without resampling, enabling seamless integration across astronomical surveys.
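One way to ingest native wavelength grids without resampling is to emit one token per native pixel, combining a flux embedding with a continuous encoding of wavelength; the sketch below illustrates that idea. The Fourier encoding over log-wavelength and all layer sizes are assumptions for illustration, not the tokenizer's actual design.

```python
# Illustrative sketch: per-pixel spectrum tokens on the native wavelength grid.
import torch

def fourier_features(wavelength, num_freqs=32):
    # Continuous positional encoding over log-wavelength, so instruments
    # with different gridding share one coordinate system.
    logw = torch.log(wavelength).unsqueeze(-1)                    # (N, 1)
    freqs = 2.0 ** torch.arange(num_freqs, dtype=torch.float32)   # (F,)
    angles = logw * freqs                                          # (N, F)
    return torch.cat([angles.sin(), angles.cos()], dim=-1)        # (N, 2F)

class SpectrumTokenizer(torch.nn.Module):
    def __init__(self, embed_dim=256, num_freqs=32):
        super().__init__()
        self.flux_proj = torch.nn.Linear(1, embed_dim)
        self.wave_proj = torch.nn.Linear(2 * num_freqs, embed_dim)

    def forward(self, wavelength, flux):
        # One token per native pixel: flux embedding + wavelength embedding.
        return (self.flux_proj(flux.unsqueeze(-1))
                + self.wave_proj(fourier_features(wavelength)))

# Two surveys with different native grids, no resampling required.
sdss_wave = torch.linspace(3800.0, 9200.0, 3000)
desi_wave = torch.linspace(3600.0, 9800.0, 7500)
tok = SpectrumTokenizer()
sdss_tokens = tok(sdss_wave, torch.randn(3000))   # (3000, 256)
desi_tokens = tok(desi_wave, torch.randn(7500))   # (7500, 256)
```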
A billion-parameter, multimodal foundation model for astronomy that unifies heterogeneous observations, telescopes, and physical processes into a single framework.
We show that latent diffusion models are robust to compression in the context of physics emulation, reducing computational cost while consistently outperforming non-generative alternatives.
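The latent-diffusion recipe behind this result can be summarized in a few lines: an autoencoder compresses the physical state, and a denoiser is trained in the compressed latent space conditioned on the previous state. The architectures, noise schedule, and conditioning below are placeholders chosen for brevity, not the paper's implementation.

```python
# Illustrative sketch: compress with an autoencoder, denoise in latent space.
import torch
import torch.nn.functional as F

class AutoEncoder(torch.nn.Module):
    def __init__(self, channels=4, latent=8):
        super().__init__()
        self.enc = torch.nn.Conv2d(channels, latent, 4, stride=4)     # 4x spatial compression
        self.dec = torch.nn.ConvTranspose2d(latent, channels, 4, stride=4)

    def encode(self, x): return self.enc(x)
    def decode(self, z): return self.dec(z)

class Denoiser(torch.nn.Module):
    def __init__(self, latent=8):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(2 * latent + 1, 64, 3, padding=1), torch.nn.GELU(),
            torch.nn.Conv2d(64, latent, 3, padding=1))

    def forward(self, z_noisy, z_cond, t):
        # Condition on the previous state's latent and the diffusion time.
        t_map = t.view(-1, 1, 1, 1).expand(-1, 1, *z_noisy.shape[2:])
        return self.net(torch.cat([z_noisy, z_cond, t_map], dim=1))

def training_step(ae, denoiser, state_t, state_t1):
    # Noise-prediction objective on the latent of the next state.
    with torch.no_grad():
        z0, z_cond = ae.encode(state_t1), ae.encode(state_t)
    t = torch.rand(z0.shape[0])
    alpha = torch.cos(0.5 * torch.pi * t).view(-1, 1, 1, 1)
    noise = torch.randn_like(z0)
    z_noisy = alpha * z0 + (1 - alpha**2).sqrt() * noise
    return F.mse_loss(denoiser(z_noisy, z_cond, t), noise)
```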
100TB of cross-matched, standardized astronomy data that brings together images, spectra, and time-series data from leading surveys to accelerate machine learning breakthroughs.
We release a significant update to the AstroCLIP model, which improves performance on all previously tested downstream tasks and extends the model to a range of new problems.
We introduce the Contextual Counting task, a new toy problem aimed at exploring the interpretability of Transformer models in quantitative domains. We compare the performance of causal and non-causal models with different positional encodings and find that causal models with RoPE and NoPE significantly outperform other configurations. We provide a detailed explanation of how the circuits function and what makes them succeed or fail at generalizing to out-of-distribution samples.
We introduce xVal, a new number-encoding scheme for LLMs. Combined with a modified number-inference method, xVal makes LLMs continuous function approximators, giving them a better inductive bias for data analysis in scientific domains.
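The core of the encoding is simple enough to sketch: every number in the input maps to the same [NUM] token, its embedding is scaled multiplicatively by the numeric value, and a scalar head decodes numbers at the output instead of digit-by-digit sampling. Dimensions, the toy vocabulary, and normalization choices below are illustrative, not the released implementation.

```python
# Illustrative sketch of the xVal encoding idea.
import torch

class XValEmbedding(torch.nn.Module):
    def __init__(self, vocab_size, embed_dim):
        super().__init__()
        self.token_emb = torch.nn.Embedding(vocab_size, embed_dim)
        self.num_head = torch.nn.Linear(embed_dim, 1)   # continuous scalar readout

    def embed(self, token_ids, values):
        # values is 1.0 for ordinary tokens and the (pre-normalized)
        # numeric value at positions holding the [NUM] token.
        return self.token_emb(token_ids) * values.unsqueeze(-1)

    def decode_number(self, hidden):
        # Continuous prediction instead of digit-by-digit sampling.
        return self.num_head(hidden).squeeze(-1)

vocab = {"<pad>": 0, "[NUM]": 1, "T": 2, "=": 3}        # toy vocabulary
emb = XValEmbedding(len(vocab), 64)
ids = torch.tensor([[2, 3, 1]])                          # "T = [NUM]"
vals = torch.tensor([[1.0, 1.0, 0.73]])                  # the value rides on [NUM]
tokens = emb.embed(ids, vals)                            # (1, 3, 64)
```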
We introduce Multiple Physics Pretraining, a new approach for developing large tuneable physical surrogate models. Our approach uses a built-in normalization and embedding scheme to enable learning multiple physical dynamics with a single model.
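The normalization and embedding idea can be sketched as follows: each dataset's variables are normalized per sample and projected through field-specific weights into a shared token space, so systems with different sets of variables can be mixed in one training run. Field names, layer shapes, and the simple instance normalization below are illustrative assumptions, not the MPP architecture.

```python
# Illustrative sketch: per-sample normalization + field-specific embeddings.
import torch

class FieldEmbedder(torch.nn.Module):
    def __init__(self, field_names, embed_dim=192, eps=1e-6):
        super().__init__()
        self.eps = eps
        # One learned projection per physical variable (density, pressure, ...).
        self.proj = torch.nn.ParameterDict(
            {name: torch.nn.Parameter(torch.randn(embed_dim) * 0.02)
             for name in field_names})

    def forward(self, fields):
        # fields: dict of name -> (B, H, W); a sample may provide any subset.
        tokens = 0.0
        for name, x in fields.items():
            mu = x.mean(dim=(-2, -1), keepdim=True)
            sigma = x.std(dim=(-2, -1), keepdim=True)
            x = (x - mu) / (sigma + self.eps)            # per-sample normalization
            tokens = tokens + x.unsqueeze(-1) * self.proj[name]
        return tokens                                     # (B, H, W, D)

embedder = FieldEmbedder(["density", "pressure", "vx", "vy"])
shallow_water = {"vx": torch.randn(2, 64, 64), "vy": torch.randn(2, 64, 64)}
tokens = embedder(shallow_water)   # (2, 64, 64, 192)
```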
We present a self-supervised learning strategy that bridges diverse observational modalities in astrophysics. By aligning cross-modal representations of galaxies in a shared space, we enable cross-modal lookup and competitive zero-shot prediction on downstream tasks.
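The alignment objective follows the standard CLIP-style recipe: two encoders map each modality into a shared space, and a symmetric contrastive loss pulls matched image-spectrum pairs together. The sketch below shows only that loss; the encoders and embedding size are stand-ins for the pretrained backbones used in practice.

```python
# Illustrative sketch: symmetric contrastive alignment of two modalities.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb, spectrum_emb, temperature=0.07):
    img = F.normalize(image_emb, dim=-1)
    spec = F.normalize(spectrum_emb, dim=-1)
    logits = img @ spec.T / temperature           # (B, B) similarity matrix
    targets = torch.arange(len(img))              # matched pairs on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))

# Once trained, cross-modal lookup is nearest-neighbour search in the shared
# space, and zero-shot prediction attaches a light probe on top of it.
image_emb = torch.randn(32, 512)
spectrum_emb = torch.randn(32, 512)
loss = contrastive_loss(image_emb, spectrum_emb)
```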