Deep learning for NeuroImaging in Python.

Note

This page is reference documentation. It only explains the class signature, not how to use it. Please refer to the gallery for the big picture.

class nidl.losses.yaware_infonce.YAwareInfoNCE(kernel: str = 'gaussian', bandwidth: float | list[float] | ndarray | KernelMetric = 1.0, temperature: float = 0.1)[source]

Bases: Module

Implementation of y-Aware InfoNCE loss [1].

Compute the y-Aware InfoNCE loss, which integrates auxiliary information into contrastive learning by weighting sample pairs.

Given a mini-batch of size N, two embeddings z1 and z2 representing two views of the same samples and a weighting matrix W computed using auxiliary variables y, the loss is:

\ell = -\frac{1}{N} \sum_{i,j} \frac{W_{i,j}}{\sum_{k=1}^{N} W_{i,k}} \log \frac{\exp(\text{sim}(z_1^{i}, z_2^{j}) / \tau)}{\sum_{k=1}^{N} \exp(\text{sim}(z_1^{i}, z_2^{k}) / \tau)}

where sim is the cosine similarity and τ is the temperature.

To compute W, a kernel K (e.g. Gaussian) and a bandwidth H are specified, and the weighting matrix is:

W_{i,j} = K\left( H^{-\frac{1}{2}} (y_i-y_j) \right)
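
For illustration, here is a minimal PyTorch sketch of the two formulas above for a Gaussian kernel and a scalar bandwidth, reading H as bandwidth times the identity. It is not nidl's implementation; the function name and the exact kernel parametrisation are assumptions.

    import torch
    import torch.nn.functional as F


    def yaware_infonce_sketch(z1, z2, y, bandwidth=1.0, temperature=0.1):
        """Illustrative re-implementation of the loss above (not nidl's code)."""
        # Cosine similarity sim(z1_i, z2_j), scaled by the temperature tau.
        z1 = F.normalize(z1, dim=1)
        z2 = F.normalize(z2, dim=1)
        sim = (z1 @ z2.T) / temperature                    # (N, N)

        # Gaussian kernel weights W_ij = K(H^{-1/2} (y_i - y_j)) with H = bandwidth * I,
        # i.e. W_ij = exp(-||y_i - y_j||^2 / (2 * bandwidth)).
        diff = y.unsqueeze(1) - y.unsqueeze(0)             # (N, N, n_labels)
        w = torch.exp(-(diff ** 2).sum(dim=-1) / (2.0 * bandwidth))
        w = w / w.sum(dim=1, keepdim=True)                 # normalize rows by sum_k W_ik

        # Weighted InfoNCE: -(1/N) * sum_ij normalized(W)_ij * log softmax_j(sim_ij).
        log_softmax = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        return -(w * log_softmax).sum(dim=1).mean()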

Parameters:

kernel : str in {‘gaussian’, ‘epanechnikov’, ‘exponential’, ‘linear’, ‘cosine’}, default=’gaussian’

Kernel used to compute the weighting matrix between auxiliary variables. See Dufumier 2022, PhD thesis, pages 94-95.

bandwidth : Union[float, int, List[float], array, KernelMetric], default=1.0

How the bandwidth (“sigma” in [1]) between auxiliary variables is defined (see the examples after the parameter list):

  • If bandwidth is a scalar (int or float), it sets the bandwidth to a diagonal matrix with equal values.

  • If bandwidth is a 1d array, it sets the diagonal of the bandwidth matrix; its size must equal the number of features in y.

  • If bandwidth is a 2d array, it must be of shape (n_features, n_features) where n_features is the number of features in y.

  • If bandwidth is a KernelMetric, its pairwise method is used to compute the similarity matrix between auxiliary variables.

temperature : float, default=0.1

Temperature used to scale the dot product between embedded vectors.
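
The examples below illustrate the bandwidth options listed above using the documented constructor signature; the kernel choice and numeric values are arbitrary.

    import numpy as np

    from nidl.losses.yaware_infonce import YAwareInfoNCE

    # Scalar bandwidth: diagonal bandwidth matrix with equal values.
    criterion = YAwareInfoNCE(kernel="gaussian", bandwidth=1.0, temperature=0.1)

    # 1d bandwidth: one value per auxiliary feature (assuming y has 2 features).
    criterion = YAwareInfoNCE(kernel="gaussian", bandwidth=np.array([2.0, 0.5]))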

References

[1] Dufumier et al., “Contrastive Learning with Continuous Proxy Meta-Data for 3D MRI Classification”, MICCAI 2021.


forward(z1: Tensor, z2: Tensor, labels: Tensor | None = None)[source]
Parameters:

z1 : torch.Tensor of shape (batch_size, n_features)

First embedded view.

z2 : torch.Tensor of shape (batch_size, n_features)

Second embedded view.

labels : Optional[torch.Tensor] of shape (batch_size, n_labels)

Auxiliary variables associated with the input data. If None, the standard InfoNCE loss is returned.

Returns:

loss : torch.Tensor

The y-Aware InfoNCE loss computed between z1 and z2.
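
A minimal call sketch, assuming random embeddings and a single continuous auxiliary variable (e.g. age); the shapes follow the signature documented above.

    import torch

    from nidl.losses.yaware_infonce import YAwareInfoNCE

    criterion = YAwareInfoNCE(bandwidth=1.0, temperature=0.1)

    z1 = torch.randn(32, 128)        # first embedded view, (batch_size, n_features)
    z2 = torch.randn(32, 128)        # second embedded view
    labels = torch.randn(32, 1)      # auxiliary variable, (batch_size, n_labels)

    loss = criterion(z1, z2, labels)   # y-Aware InfoNCE
    plain = criterion(z1, z2)          # labels=None: standard InfoNCE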
