We introduce xVal, a new number-encoding scheme for LLMs. Used with a modified number-inference method, xVal turns LLMs into continuous function approximators, giving them a better inductive bias for data analysis in scientific domains.
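A minimal sketch of the idea, assuming a PyTorch-style decoder: each number in the input is collapsed to a single [NUM] token whose embedding is scaled by the numeric value, and a regression head reads the value back out continuously when the model emits [NUM]. The class names, the `num_token_id` argument, and the value-scaling convention below are illustrative, not the reference implementation.

```python
import torch
import torch.nn as nn

class XValEmbedding(nn.Module):
    """Sketch of xVal-style number encoding: numbers share one [NUM] token,
    and the token embedding is scaled multiplicatively by the value."""
    def __init__(self, vocab_size: int, d_model: int, num_token_id: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.num_token_id = num_token_id

    def forward(self, token_ids: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq). values: (batch, seq), set to 1.0 at
        # non-numeric positions and to the (pre-normalized) number at [NUM].
        x = self.embed(token_ids)
        return x * values.unsqueeze(-1)

class NumberHead(nn.Module):
    """Regression head for number inference: the value is predicted as a
    continuous scalar rather than decoded digit by digit."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, 1)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return self.proj(hidden).squeeze(-1)
```

Because the value enters multiplicatively along a shared embedding direction, nearby numbers get nearby representations, which is what gives the model its continuous behavior over numeric inputs.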
We introduce Multiple Physics Pretraining, a new approach for developing large, tunable physical surrogate models. Our approach uses a built-in normalization and embedding scheme that enables a single model to learn multiple physical dynamics.
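As a rough illustration of what such a shared normalization-and-embedding front end can look like, assuming PyTorch and fields on a 1D grid for brevity: each field is normalized with per-sample statistics, then projected into a common latent space by a field-specific embedding. The names here (`FieldEmbedding`, `field_ids`) are hypothetical, and the actual architecture is more involved than this sketch.

```python
import torch
import torch.nn as nn

class FieldEmbedding(nn.Module):
    """Sketch of a shared front end for multiple physical systems: normalize
    each field per sample, then map it into a common latent space with a
    learned field-specific embedding."""
    def __init__(self, num_fields: int, d_model: int, eps: float = 1e-6):
        super().__init__()
        self.field_proj = nn.Parameter(torch.randn(num_fields, d_model) * 0.02)
        self.eps = eps

    def forward(self, u: torch.Tensor, field_ids: torch.Tensor) -> torch.Tensor:
        # u: (batch, fields, nx) raw physical fields on a 1D grid.
        # field_ids: (fields,) indices saying which physical field each channel is.
        mean = u.mean(dim=-1, keepdim=True)
        std = u.std(dim=-1, keepdim=True)
        u_norm = (u - mean) / (std + self.eps)   # per-sample, per-field statistics
        proj = self.field_proj[field_ids]        # (fields, d_model)
        # Mix the normalized fields into one shared latent sequence.
        return torch.einsum("bfx,fd->bxd", u_norm, proj)
```

Normalizing per sample keeps fields with wildly different physical scales (pressure vs. velocity, say) comparable inside one model, while the field embedding tells the network which dynamics it is looking at.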
We present a self-supervised learning strategy that bridges diverse observational modalities in astrophysics. By aligning cross-modal representations of galaxies in a shared embedding space, we enable cross-modal look-up and competitive zero-shot prediction on downstream tasks.
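One standard way to align two modalities in a shared space is a symmetric CLIP-style contrastive loss; the sketch below, assuming PyTorch and precomputed per-galaxy image and spectrum embeddings, is illustrative rather than the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def clip_loss(image_emb: torch.Tensor, spectrum_emb: torch.Tensor,
              temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss pulling together the two views of each galaxy
    and pushing apart views of different galaxies in the batch."""
    img = F.normalize(image_emb, dim=-1)
    spec = F.normalize(spectrum_emb, dim=-1)
    logits = img @ spec.t() / temperature          # (batch, batch) cosine similarities
    targets = torch.arange(img.shape[0], device=img.device)
    # Matched image/spectrum pairs sit on the diagonal of the logit matrix.
    loss_i = F.cross_entropy(logits, targets)
    loss_s = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_i + loss_s)
```

Once trained, cross-modal look-up reduces to nearest-neighbor search: embed a galaxy image and retrieve the spectra with the highest cosine similarity in the shared space, and zero-shot prediction follows by regressing labels directly from the frozen embeddings.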