fmri-nsd-fwrf

Model Summary

Modality	fMRI
Training Dataset	Natural Scenes Dataset (NSD) (subject-native volume space)
Species	Human
Stimuli	Images
Model Type	Feature-weighted receptive field (fwRF)
Creator	Alessandro Gifford

Description

These encoding models are convolutional neural networks trained end-to-end to predict fMRI responses from input images, using the feature-weighted receptive field (fwRF) model (St-Yves & Naselaris, 2018).

The encoding models were trained on the Natural Scenes Dataset (NSD) (Allen et al., 2022), which contains 7T fMRI responses of 8 subjects to 73k natural scenes from the COCO dataset (Lin et al., 2014). One encoding model was trained for each NSD subject and for each of 23 ROIs overlaying visual cortex. For detailed information on these ROIs, and on how they were selected, refer to the NSD paper and data manual.

Preprocessing. The encoding models were trained on NSD’s subject-native volume data in “func1pt8mm” space, from the “betas_fithrf_GLMdenoise_RR” preprocessing version. Note that the NSD data were z-scored within each scan session, so the in silico fMRI responses generated by the encoding models are also in z-scored units.
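The session-wise z-scoring described above can be sketched as follows. This is a minimal illustration on synthetic data, not the BERG or NSD preprocessing code: the function name `zscore_per_session`, the array layout (trials by voxels), and the `session_ids` vector are assumptions made for this example.

```python
import numpy as np

def zscore_per_session(betas, session_ids):
    """Z-score each voxel's responses separately within each scan session.

    betas: (n_trials, n_voxels) single-trial responses (hypothetical layout).
    session_ids: (n_trials,) session label of each trial (hypothetical).
    """
    betas = np.asarray(betas, dtype=np.float64)
    session_ids = np.asarray(session_ids)
    out = np.empty_like(betas)
    for session in np.unique(session_ids):
        idx = session_ids == session
        mean = betas[idx].mean(axis=0)
        std = betas[idx].std(axis=0)
        std[std == 0] = 1.0  # avoid division by zero for constant voxels
        out[idx] = (betas[idx] - mean) / std
    return out

# Synthetic example: 3 sessions of 20 trials each, 4 voxels
rng = np.random.default_rng(0)
betas = rng.normal(loc=5.0, scale=2.0, size=(60, 4))
session_ids = np.repeat([1, 2, 3], 20)
z = zscore_per_session(betas, session_ids)
```

After this transform every voxel has zero mean and unit standard deviation within each session, which is why model outputs should be read as z-scored units rather than raw beta or percent-signal-change values.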

Model training partition. fMRI responses for up to 9,000 non-shared images (i.e., the images uniquely seen by each subject during the NSD experiment).

Model validation partition. fMRI responses for up to 485 of the 1,000 shared images (i.e., the 485 shared images that not every subject saw three times during the NSD experiment).

Model testing partition. fMRI responses for 515 of the 1,000 shared images (i.e., the 515 images that each subject saw exactly three times during the NSD experiment).
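The logic behind the validation/testing split of the shared images can be sketched as below. This is an illustrative reconstruction on a toy table, not the code used to build the models: the `repeat_counts` matrix (subjects by shared images, holding the number of presentations each subject completed) and both function names are hypothetical.

```python
import numpy as np

def split_shared_images(repeat_counts):
    """Split shared images into validation and test indices.

    repeat_counts: (n_subjects, n_shared_images) number of times each
    subject saw each shared image (hypothetical input for this sketch).
    Test images are those every subject saw exactly three times;
    all remaining shared images go to validation.
    """
    counts = np.asarray(repeat_counts)
    seen_three_by_all = np.all(counts == 3, axis=0)
    test_idx = np.flatnonzero(seen_three_by_all)
    val_idx = np.flatnonzero(~seen_three_by_all)
    return val_idx, test_idx

def validation_images_for_subject(repeat_counts, val_idx, subject_row):
    """Keep only validation images this subject saw at least once
    (an assumption about why the partition is 'up to' 485 images)."""
    counts = np.asarray(repeat_counts)
    return val_idx[counts[subject_row, val_idx] > 0]

# Toy example: 3 subjects, 6 shared images
repeat_counts = np.array([
    [3, 3, 2, 3, 0, 3],
    [3, 3, 3, 3, 1, 3],
    [3, 2, 3, 3, 3, 3],
])
val_idx, test_idx = split_shared_images(repeat_counts)
val_subject0 = validation_images_for_subject(repeat_counts, val_idx, 0)
```

With the real NSD presentation counts, the same rule would be expected to yield the 515 test images and 485 validation images described above.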

Metadata

Note

Metadata files are generated separately for each ROI, containing only voxels within that region.

fmri

ncsnr : (n_voxels,) - Noise-ceiling signal-to-noise ratio per voxel (ROI-specific)

roi_mask_volume : (81, 104, 83) - Binary mask defining voxel locations in volume space for this ROI

fmri_affine : (4, 4) - Affine transformation matrix for volume-to-world coordinate mapping
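As a sketch of how `roi_mask_volume` and `fmri_affine` might be used together, the example below places per-voxel values back into the 3D volume and converts voxel indices to world coordinates. It uses synthetic data and plain NumPy. One assumption is made loudly here: that the order of voxels in the per-voxel arrays matches the C-order traversal of the mask, which should be checked against the actual metadata. The function names and the toy affine are hypothetical.

```python
import numpy as np

def voxels_to_volume(values, roi_mask_volume, fill_value=np.nan):
    """Scatter a (n_voxels,) vector into a 3D volume using a binary mask.

    ASSUMPTION: voxel order equals the C-order traversal of the mask.
    """
    mask = np.asarray(roi_mask_volume).astype(bool)
    values = np.asarray(values, dtype=np.float64)
    if values.shape != (int(mask.sum()),):
        raise ValueError("values must have one entry per voxel in the mask")
    volume = np.full(mask.shape, fill_value, dtype=np.float64)
    volume[mask] = values
    return volume

def voxel_to_world(ijk, affine):
    """Map voxel indices (n, 3) to world coordinates with a 4x4 affine."""
    ijk = np.atleast_2d(np.asarray(ijk, dtype=np.float64))
    homogeneous = np.column_stack([ijk, np.ones(len(ijk))])
    return (homogeneous @ np.asarray(affine, dtype=np.float64).T)[:, :3]

# Synthetic mask with three voxels, in the volume shape listed above
mask = np.zeros((81, 104, 83), dtype=np.uint8)
mask[10, 20, 30] = mask[10, 20, 31] = mask[40, 50, 60] = 1
values = np.array([0.1, 0.2, 0.3])
volume = voxels_to_volume(values, mask)

# Toy affine: 1.8 mm isotropic voxels plus a translation (made-up numbers)
affine = np.array([
    [1.8, 0.0, 0.0, -10.0],
    [0.0, 1.8, 0.0, -20.0],
    [0.0, 0.0, 1.8, -30.0],
    [0.0, 0.0, 0.0, 1.0],
])
world = voxel_to_world([10, 20, 30], affine)
```

The resulting volume can then be saved or plotted with a neuroimaging library of choice, passing the affine along so that the data stay aligned with the subject's anatomy.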

encoding_models

r2 : (n_voxels,) - R² scores per voxel

noise_ceiling : (n_voxels,) - Noise ceiling per voxel (max explainable variance)

explained_variance : (n_voxels,) - Percentage of noise ceiling explained by model

train_img_num : (9000,) - Image indices used for training

val_img_num : (485,) - Image indices used for validation

test_img_num : (515,) - Image indices used for testing
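The relationship between these per-voxel quantities can be sketched as follows. The noise-ceiling formula is the one given in the NSD data manual for responses averaged over `n_trials` repetitions. Whether BERG stores `noise_ceiling` as a fraction or a percentage, and exactly how `explained_variance` is computed, are assumptions of this sketch. Both function names are hypothetical.

```python
import numpy as np

def noise_ceiling_from_ncsnr(ncsnr, n_trials=3):
    """Noise ceiling as a fraction of variance, from the ncsnr.

    NSD formula for responses averaged over n_trials repetitions:
    NC = ncsnr^2 / (ncsnr^2 + 1 / n_trials)
    """
    ncsnr = np.asarray(ncsnr, dtype=np.float64)
    return ncsnr**2 / (ncsnr**2 + 1.0 / n_trials)

def percent_of_noise_ceiling(r2, noise_ceiling):
    """Express r2 as a percentage of the noise ceiling.

    ASSUMPTION: noise_ceiling is a fraction in [0, 1]. Voxels with a
    noise ceiling of zero are returned as NaN.
    """
    r2 = np.asarray(r2, dtype=np.float64)
    nc = np.asarray(noise_ceiling, dtype=np.float64)
    out = np.full(r2.shape, np.nan)
    valid = nc > 0
    out[valid] = 100.0 * r2[valid] / nc[valid]
    return out

# Synthetic values for three voxels
ncsnr = np.array([0.0, 0.5, 1.0])
r2 = np.array([0.0, 0.2, 0.6])
nc = noise_ceiling_from_ncsnr(ncsnr)
pct = percent_of_noise_ceiling(r2, nc)
```

Normalizing by the noise ceiling makes accuracy comparable across voxels with different signal quality, since a low r2 in a noisy voxel may still reflect a model that captures most of the explainable signal.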

Input

Type	`numpy.ndarray`
Shape	`['batch_size', 3, 'height', 'width']`
Description	The input should be a batch of RGB images.
Constraints	Image values should be integers in range [0, 255]. Image dimensions (height, width) should be equal (square). Minimum recommended image size: 224×224 pixels.
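A minimal sketch of preparing images to satisfy these constraints is shown below. It assumes the images start as a channels-last array of shape (batch, height, width, 3), which is how most image loaders return them. The helper names are hypothetical, and casting to `uint8` is an assumption (the constraints only require integers in [0, 255]). Resizing to at least 224×224 is not covered and would need an image library.

```python
import numpy as np

def center_crop_square(images):
    """Center-crop a (batch, H, W, 3) array to (batch, S, S, 3), S = min(H, W)."""
    images = np.asarray(images)
    if images.ndim != 4 or images.shape[-1] != 3:
        raise ValueError("expected an array of shape (batch, H, W, 3)")
    _, height, width, _ = images.shape
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return images[:, top:top + side, left:left + side, :]

def to_channels_first_uint8(images):
    """Convert (batch, H, W, 3) integer images to (batch, 3, H, W) uint8."""
    images = np.asarray(images)
    if not np.issubdtype(images.dtype, np.integer):
        raise TypeError("image values must be integers")
    if images.min() < 0 or images.max() > 255:
        raise ValueError("image values must lie in [0, 255]")
    return np.ascontiguousarray(images.transpose(0, 3, 1, 2).astype(np.uint8))

# Synthetic batch of four 240x320 RGB images
rng = np.random.default_rng(0)
photos = rng.integers(0, 256, size=(4, 240, 320, 3), dtype=np.uint8)
stimulus = to_channels_first_uint8(center_crop_square(photos))
```

The resulting `stimulus` array has the channels-first layout listed in the Shape row above and can be checked against the constraints before encoding.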

Output

Type	`numpy.ndarray`
Shape	`['batch_size', 'n_voxels']`
Description	The output is a 2D array containing in silico fMRI responses. The second dimension (n_voxels) corresponds to the number of voxels in the selected ROI, which varies by ROI and subject. For subject 1, the number of voxels per ROI is as follows:
- V1: 1,350
- V2: 1,433
- V3: 1,187
- hV4: 687
- EBA: 2,971
- FBA-2: 430
- OFA: 355
- FFA-1: 484
- FFA-2: 310
- PPA: 1,033
- RSC: 566
- OPA: 1,611
- OWFA: 464
- VWFA-1: 773
- VWFA-2: 505
- mfs-words: 165
- early: 5,917
- midventral: 986
- midlateral: 834
- midparietal: 950
- parietal: 3,548
- lateral: 7,799
- ventral: 7,604
Dimensions	batch_size: Number of stimuli in the batch. n_voxels: Number of voxels in the selected ROI (varies by ROI and subject).

Parameters

Parameters used in `get_encoding_model`

This function loads the encoding model.

model_id	Type: str Required: Yes Description: Unique identifier of the model to load. Valid Values: fmri-nsd-fwrf Example: “fmri-nsd-fwrf”
subject	Type: int Required: Yes Description: Subject ID from the NSD dataset (1-8). Valid Values: 1, 2, 3, 4, 5, 6, 7, 8 Example: 1
selection	Type: dict Required: Yes Description: Specifies which outputs to include in the model responses. Properties: roi Type: str Description: Region of Interest (ROI) for voxel prediction. Early visual areas (V1-V3), category-selective regions (EBA, FFA, etc.), or composite regions (lateral, ventral). Valid values: “V1”, “V2”, “V3”, “hV4”, “EBA”, “FBA-2”, “OFA”, “FFA-1”, “FFA-2”, “PPA”, “RSC”, “OPA”, “OWFA”, “VWFA-1”, “VWFA-2”, “mfs-words”, “early”, “midventral”, “midlateral”, “midparietal”, “parietal”, “lateral”, “ventral” Example: V1
device	Type: str Required: No Description: Device to run the model on. ‘auto’ will use CUDA if available, otherwise CPU. Valid Values: “cpu”, “cuda”, “auto” Example: “auto”

Parameters used in `encode`

This function generates in silico neural responses using the encoding model previously loaded.

model	Type: BaseModelInterface Required: Yes Description: An instantiated and loaded encoding model.
stimulus	Type: numpy.ndarray Required: Yes Description: A batch of RGB images to be encoded. Images should be in integer format with values in the range [0, 255], and square dimensions (e.g. 224×224). Example: “An array of shape [100, 3, 224, 224] representing 100 RGB images.”
return_metadata	Type: bool Required: No Description: Whether to return the encoding model’s metadata together with the in silico neural responses. Example: True
show_progress	Type: bool Required: No Description: Whether to show a progress bar during encoding (for large batches). Example: True

Parameters used in `get_model_metadata`

This function loads the encoding model’s metadata without having to load the model itself.

model_id	Type: str Required: Yes Description: Unique identifier of the model to load. Valid Values: fmri-nsd-fwrf Example: “fmri-nsd-fwrf”
subject	Type: int Required: Yes Description: Subject ID from the NSD dataset (1-8). Valid Values: 1, 2, 3, 4, 5, 6, 7, 8 Example: 1

Performance

Accuracy Plots (AWS directory):

brain-encoding-response-generator/encoding_models/modality-fmri/train_dataset-nsd/model-fwrf/encoding_models_accuracy

Example Usage

import numpy as np
from berg import BERG

# Initialize BERG
berg = BERG(berg_dir="path/to/brain-encoding-response-generator")

# Load the model
model = berg.get_encoding_model(
    "fmri-nsd-fwrf",
    subject=1,
    selection={
        "roi": "V1"
    }
)

# Prepare the stimulus images
# Image shape should be [batch_size, 3 RGB channels, height, width]
# The upper bound of np.random.randint is exclusive, so 256 covers the full [0, 255] range
stimulus = np.random.randint(0, 256, (100, 3, 256, 256))

# Generate the in silico neural responses using the encoding model loaded above
responses = berg.encode(
    model,
    stimulus,
    show_progress=True
)

# The in silico fMRI responses will be a numpy.ndarray of shape:
# ['batch_size', 'n_voxels']
# where:
# - n_voxels: Number of voxels in the selected ROI, varies by ROI and subject.

# Generate in silico neural responses with metadata
responses, metadata = berg.encode(
    model,
    stimulus,
    return_metadata=True
)

# Load the encoding model's metadata without having to load the model itself
metadata = berg.get_model_metadata(
    "fmri-nsd-fwrf",
    subject=1
)

References

Model building code: https://github.com/gifale95/BERG/tree/main/berg_creation_code
NSD paper (Allen et al., 2022): https://doi.org/10.1038/s41593-021-00962-x
fwRF model (St-Yves & Naselaris, 2018): https://doi.org/10.1016/j.neuroimage.2017.06.035
COCO dataset (Lin et al., 2014): https://cocodataset.org/#home