🔗 A Systematic Evaluation of Sample-Level Tokenization Strategies for MEG Foundation Models (Preprint DOI: 10.1101/2025.09.25.678554)
💡 Please email SungJun Cho at sungjun.cho@ndcn.ox.ac.uk or simply open a GitHub issue if you have any questions or concerns.
This repository contains all the scripts necessary to reproduce the analyses and figures presented in the manuscript. It is divided into three main directories.
| Directory | Description |
|---|---|
| `scripts` | Scripts for training the tokenizer and MEG-GPT models and conducting subsequent analyses. |
| `supplementary` | Scripts for additional data inspection, post-hoc analysis, and visualization. |
| `testing` | Scripts for debugging and testing. |
For detailed descriptions of the scripts in each directory, please consult the README file located within each respective folder.
In addition, the models directory contains configuration files specifying the hyperparameters used for all models trained in this work. Corresponding tables summarizing these hyperparameters are provided within the same directory.
This repository builds on the osl-foundation software package, which provides the MEG-GPT model and its associated tokenizers. To start, please install osl-foundation and set up its environment by following the installation guide here.
The scripts used in this paper rely on the following dependencies:

```
python==3.10.4
tensorflow==2.11.0
tensorflow-probability==0.19.0
osl-dynamics==2.1.8
osl-foundation==0.0.1
```
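To confirm that your environment matches these pins, you can run a small version check. The sketch below uses only the Python standard library; the package names are taken from the list above, and the `check_versions` helper is purely illustrative, not part of the repository.

```python
# Sketch: compare installed package versions against the pinned dependencies.
# Missing packages are reported as None rather than raising an error.
from importlib import metadata

PINNED = {
    "tensorflow": "2.11.0",
    "tensorflow-probability": "0.19.0",
    "osl-dynamics": "2.1.8",
    "osl-foundation": "0.0.1",
}

def check_versions(pinned):
    """Return {package: (expected_version, installed_version_or_None)}."""
    report = {}
    for pkg, expected in pinned.items():
        try:
            found = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            found = None  # package not installed in this environment
        report[pkg] = (expected, found)
    return report

if __name__ == "__main__":
    for pkg, (expected, found) in check_versions(PINNED).items():
        status = "OK" if found == expected else "MISMATCH"
        print(f"{pkg}: expected {expected}, found {found} [{status}]")
```

Any `MISMATCH` or `None` entry indicates a package you still need to install (or re-pin) before running the scripts.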
Once these steps are complete, you can clone or download this repository to your preferred directory, and you're ready to begin!
PyTorch version
For a PyTorch implementation of the tokenizers, please refer to the EphysTokenizer repository.
All scripts in this repository were executed on the Oxford Biomedical Research Computing (BMRC) servers. Our experiments were run using two NVIDIA GPUs (V100 or A100) with CUDA 11.7.0.
Copyright (c) 2026 SungJun Cho and OHBA Analysis Group. Cho2026_Tokenizer is a free and open-source software licensed under the MIT License.