API Reference¶
The complete API reference for ml-networks.
Modules¶
Common Modules¶
PyTorch (ml_networks.torch)¶
- Layers - basic layers (MLP, Conv, Attention, Transformer, etc.)
- Vision - vision-related modules (Encoder, Decoder, ConvNet, ResNet, ViT, etc.)
- Distributions - distribution-related classes and functions
- Loss functions - loss functions
- Activations - custom activation functions
- UNet - conditional UNet class
- Others - HyperNet, ContrastiveLearning, BaseModule, ProgressBarCallback
JAX (ml_networks.jax)¶
- JAX API - API reference for the JAX (Flax NNX) implementation
Key Classes and Functions¶
Layers¶
MLPLayer ¶
Bases: LightningModule
Multi-layer perceptron layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Input dimension. | required |
| output_dim | int | Output dimension. | required |
| cfg | MLPConfig | | required |
Examples:
>>> cfg = MLPConfig(
... hidden_dim=16,
... n_layers=3,
... output_activation="ReLU",
... linear_cfg=LinearConfig(
... activation="ReLU",
... norm="layer",
... norm_cfg={"eps": 1e-05, "elementwise_affine": True, "bias": True},
... dropout=0.1,
... norm_first=False,
... bias=True
... )
... )
>>> mlp = MLPLayer(32, 16, cfg)
>>> x = torch.randn(1, 32)
>>> output = mlp(x)
>>> output.shape
torch.Size([1, 16])
Source code in src/ml_networks/torch/layers.py
LinearNormActivation ¶
Bases: Module
Linear layer with normalization and activation, and dropouts.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input_dim | int | Input dimension. | required |
| output_dim | int | Output dimension. | required |
| cfg | LinearConfig | Linear layer configuration. | required |
References
- LayerNorm: https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html
- RMSNorm: https://pytorch.org/docs/stable/generated/torch.nn.RMSNorm.html
- Linear: https://pytorch.org/docs/stable/generated/torch.nn.Linear.html
- Dropout: https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html
Examples:
>>> cfg = LinearConfig(
... activation="ReLU",
... norm="layer",
... norm_cfg={"eps": 1e-05, "elementwise_affine": True, "bias": True},
... dropout=0.1,
... norm_first=False,
... bias=True
... )
>>> linear = LinearNormActivation(32, 16, cfg)
>>> linear
LinearNormActivation(
(linear): Linear(in_features=32, out_features=16, bias=True)
(norm): LayerNorm((16,), eps=1e-05, elementwise_affine=True)
(activation): Activation(
(activation): ReLU()
)
(dropout): Dropout(p=0.1, inplace=False)
)
>>> x = torch.randn(1, 32)
>>> output = linear(x)
>>> output.shape
torch.Size([1, 16])
>>> cfg = LinearConfig(
... activation="SiGLU",
... norm="none",
... norm_cfg={},
... dropout=0.0,
... norm_first=True,
... bias=True
... )
>>> linear = LinearNormActivation(32, 16, cfg)
>>> # If the activation name contains "glu", the internal linear layer's output_dim is doubled so the gated output matches the requested output_dim.
>>> linear
LinearNormActivation(
(linear): Linear(in_features=32, out_features=32, bias=True)
(norm): Identity()
(activation): Activation(
(activation): SiGLU()
)
(dropout): Identity()
)
>>> x = torch.randn(1, 32)
>>> output = linear(x)
>>> output.shape
torch.Size([1, 16])
Source code in src/ml_networks/torch/layers.py
Attributes¶
linear instance-attribute ¶
Functions¶
forward ¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (*, input_dim) | required |

Returns:

| Type | Description |
|---|---|
| Tensor | Output tensor of shape (*, output_dim) |
Source code in src/ml_networks/torch/layers.py
ConvNormActivation ¶
Bases: Module
Convolutional layer with normalization and activation, and dropouts.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| in_channels | int | Input channels. | required |
| out_channels | int | Output channels. | required |
| cfg | ConvConfig | Convolutional layer configuration. | required |
References
- PixelShuffle: https://pytorch.org/docs/stable/generated/torch.nn.PixelShuffle.html
- PixelUnshuffle: https://pytorch.org/docs/stable/generated/torch.nn.PixelUnshuffle.html
- BatchNorm2d: https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html
- GroupNorm: https://pytorch.org/docs/stable/generated/torch.nn.GroupNorm.html
- LayerNorm: https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html
- InstanceNorm2d: https://pytorch.org/docs/stable/generated/torch.nn.InstanceNorm2d.html
- Conv2d: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
- Dropout: https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html
Examples:
>>> cfg = ConvConfig(
... activation="ReLU",
... kernel_size=3,
... stride=1,
... padding=1,
... dilation=1,
... groups=1,
... bias=True,
... dropout=0.1,
... norm="batch",
... norm_cfg={"affine": True, "track_running_stats": True},
... scale_factor=0
... )
>>> conv = ConvNormActivation(3, 16, cfg)
>>> x = torch.randn(1, 3, 32, 32)
>>> output = conv(x)
>>> output.shape
torch.Size([1, 16, 32, 32])
>>> cfg = ConvConfig(
... activation="SiGLU",
... kernel_size=3,
... stride=1,
... padding=1,
... dilation=1,
... groups=1,
... bias=True,
... dropout=0.0,
... norm="none",
... norm_cfg={},
... scale_factor=2
... )
>>> conv = ConvNormActivation(3, 16, cfg)
>>> x = torch.randn(1, 3, 32, 32)
>>> output = conv(x)
>>> output.shape
torch.Size([1, 16, 64, 64])
Source code in src/ml_networks/torch/layers.py
Attributes¶
conv instance-attribute ¶
conv = Conv2d(in_channels=in_channels, out_channels=out_channels_, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=groups, bias=bias, padding_mode=padding_mode)
Functions¶
forward ¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (B, in_channels, H, W) or (in_channels, H, W) | required |

Returns:

| Type | Description |
|---|---|
| Tensor | Output tensor of shape (B, out_channels, H', W') or (out_channels, H', W') |

H' and W' are calculated as follows:

H' = (H + 2*padding - dilation * (kernel_size - 1) - 1) // stride + 1
H' = H' * scale_factor if scale_factor > 0 else H' // abs(scale_factor) if scale_factor < 0 else H'
W' = (W + 2*padding - dilation * (kernel_size - 1) - 1) // stride + 1
W' = W' * scale_factor if scale_factor > 0 else W' // abs(scale_factor) if scale_factor < 0 else W'
Source code in src/ml_networks/torch/layers.py
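As a sanity check, the output-size rule above can be written as a small pure-Python helper. This is a sketch only; `conv_out_size` is a hypothetical name and not part of ml-networks.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0, dilation=1, scale_factor=0):
    """Output spatial size of ConvNormActivation per the documented formula (hypothetical helper)."""
    out = (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
    if scale_factor > 0:        # PixelShuffle upsampling
        out *= scale_factor
    elif scale_factor < 0:      # PixelUnshuffle downsampling
        out //= abs(scale_factor)
    return out

# First example above: 32x32 input, kernel 3, stride 1, padding 1 -> 32
print(conv_out_size(32, kernel_size=3, stride=1, padding=1))                   # 32
# Second example: same conv followed by scale_factor=2 -> 64
print(conv_out_size(32, kernel_size=3, stride=1, padding=1, scale_factor=2))   # 64
```

This mirrors the two doctest examples: the stride-1, padding-1, kernel-3 convolution preserves spatial size, and a positive `scale_factor` multiplies it afterwards.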
ConvTransposeNormActivation ¶
Bases: Module
Transposed convolutional layer with normalization and activation, and dropouts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
in_channels
|
int
|
Input channels. |
required |
out_channels
|
int
|
Output channels. |
required |
cfg
|
ConvConfig
|
Convolutional layer configuration. |
required |
Examples:
>>> cfg = ConvConfig(
... activation="ReLU",
... kernel_size=3,
... stride=1,
... padding=1,
... output_padding=0,
... dilation=1,
... groups=1,
... bias=True,
... dropout=0.1,
... norm="batch",
... norm_cfg={"affine": True, "track_running_stats": True}
... )
>>> conv = ConvTransposeNormActivation(3, 16, cfg)
>>> x = torch.randn(1, 3, 32, 32)
>>> output = conv(x)
>>> output.shape
torch.Size([1, 16, 32, 32])
>>> cfg = ConvConfig(
... activation="SiGLU",
... kernel_size=3,
... stride=1,
... padding=1,
... output_padding=0,
... dilation=1,
... groups=1,
... bias=True,
... dropout=0.0,
... norm="none",
... norm_cfg={}
... )
>>> conv = ConvTransposeNormActivation(3, 16, cfg)
>>> x = torch.randn(1, 3, 32, 32)
>>> output = conv(x)
>>> output.shape
torch.Size([1, 16, 32, 32])
Source code in src/ml_networks/torch/layers.py
Attributes¶
conv instance-attribute ¶
conv = ConvTranspose2d(in_channels, out_channels * 2 if 'glu' in cfg.activation.lower() else out_channels, kernel_size, stride, padding, output_padding, groups, bias=bias, dilation=dilation)
Functions¶
forward ¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (B, in_channels, H, W) or (in_channels, H, W) | required |

Returns:

| Type | Description |
|---|---|
| Tensor | Output tensor of shape (B, out_channels, H', W') or (out_channels, H', W') |

H' and W' are calculated as follows:

H' = (H - 1) * stride - 2 * padding + kernel_size + output_padding
W' = (W - 1) * stride - 2 * padding + kernel_size + output_padding
Source code in src/ml_networks/torch/layers.py
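The transposed-convolution size rule can likewise be checked with a tiny helper (a sketch; `conv_transpose_out_size` is a hypothetical name, not a library function):

```python
def conv_transpose_out_size(size, kernel_size, stride=1, padding=0, output_padding=0):
    """Output spatial size of ConvTransposeNormActivation per the documented formula (hypothetical helper)."""
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding

# Example above: 32x32 input, kernel 3, stride 1, padding 1 -> 32 (size preserved)
print(conv_transpose_out_size(32, kernel_size=3, stride=1, padding=1))  # 32
# A typical 2x upsampling block: kernel 4, stride 2, padding 1 -> 64
print(conv_transpose_out_size(32, kernel_size=4, stride=2, padding=1))  # 64
```

The second call matches the kernel-4, stride-2, padding-1 blocks used by the Decoder example below, which double the spatial size at each stage.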
Vision¶
Encoder ¶
Bases: BaseModule
Encoder with various architectures.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| feature_dim | int \| tuple[int, int, int] | Dimension of the feature tensor. If int, the Encoder includes a fully connected layer to downsample the feature tensor. Otherwise, the Encoder has no fully connected layer and processes directly with the backbone network. | required |
| obs_shape | tuple[int, int, int] | Shape of the input tensor. | required |
| backbone_cfg | ViTConfig \| ConvNetConfig \| ResNetConfig | Configuration of the backbone network. | required |
| fc_cfg | MLPConfig \| LinearConfig \| SpatialSoftmaxConfig \| None | Configuration of the fully connected layer. If feature_dim is a tuple, fc_cfg is ignored. If feature_dim is an int, fc_cfg must be provided. Default is None. | None |
Examples:
>>> feature_dim = 128
>>> obs_shape = (3, 64, 64)
>>> cfg = ConvNetConfig(
... channels=[16, 32, 64],
... conv_cfgs=[
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ]
... )
>>> fc_cfg = LinearConfig(
... activation="ReLU",
... bias=True
... )
>>> encoder = Encoder(feature_dim, obs_shape, cfg, fc_cfg)
>>> x = torch.randn(2, *obs_shape)
>>> y = encoder(x)
>>> y.shape
torch.Size([2, 128])
>>> encoder
Encoder(
(encoder): ConvNet(
(conv): Sequential(
(0): ConvNormActivation(
(conv): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pixel_shuffle): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
(1): ConvNormActivation(
(conv): Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pixel_shuffle): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
(2): ConvNormActivation(
(conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pixel_shuffle): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
)
)
(fc): Sequential(
(0): Flatten(start_dim=1, end_dim=-1)
(1): LinearNormActivation(
(linear): Linear(in_features=4096, out_features=128, bias=True)
(norm): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
)
)
Source code in src/ml_networks/torch/vision.py
Functions¶
forward ¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (batch_size, *obs_shape) | required |

Returns:

| Type | Description |
|---|---|
| Tensor | Output tensor of shape (batch_size, *feature_dim) |
Source code in src/ml_networks/torch/vision.py
Decoder ¶
Bases: BaseModule
Decoder with various architectures.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| feature_dim | int \| tuple[int, int, int] | Dimension of the feature tensor. If int, the Decoder includes a fully connected layer to upsample the feature tensor. Otherwise, the Decoder has no fully connected layer and processes directly with the backbone network. | required |
| obs_shape | tuple[int, int, int] | Shape of the output tensor. | required |
| backbone_cfg | ConvNetConfig \| ViTConfig \| ResNetConfig | Configuration of the backbone network. | required |
| fc_cfg | MLPConfig \| LinearConfig \| None | Configuration of the fully connected layer. If feature_dim is a tuple, fc_cfg is ignored. If feature_dim is an int, fc_cfg must be provided. Default is None. | None |
Examples:
>>> feature_dim = 128
>>> obs_shape = (3, 64, 64)
>>> cfg = ConvNetConfig(
... channels=[64, 32, 16],
... conv_cfgs=[
... ConvConfig(kernel_size=4, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ConvConfig(kernel_size=4, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ConvConfig(kernel_size=4, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ]
... )
>>> fc_cfg = MLPConfig(
... hidden_dim=256,
... n_layers=2,
... output_activation="ReLU",
... linear_cfg=LinearConfig(
... activation="ReLU",
... bias=True
... )
... )
>>> decoder = Decoder(feature_dim, obs_shape, cfg, fc_cfg)
>>> x = torch.randn(2, feature_dim)
>>> y = decoder(x)
>>> y.shape
torch.Size([2, 3, 64, 64])
>>> decoder
Decoder(
(fc): MLPLayer(
(dense): Sequential(
(0): LinearNormActivation(
(linear): Linear(in_features=128, out_features=256, bias=True)
(norm): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
(1): LinearNormActivation(
(linear): Linear(in_features=256, out_features=256, bias=True)
(norm): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
(2): LinearNormActivation(
(linear): Linear(in_features=256, out_features=1024, bias=True)
(norm): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
)
)
(decoder): ConvTranspose(
(first_conv): Conv2d(16, 64, kernel_size=(1, 1), stride=(1, 1))
(conv): Sequential(
(0): ConvTransposeNormActivation(
(conv): ConvTranspose2d(64, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
(1): ConvTransposeNormActivation(
(conv): ConvTranspose2d(32, 16, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
(2): ConvTransposeNormActivation(
(conv): ConvTranspose2d(16, 3, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
)
)
)
Source code in src/ml_networks/torch/vision.py
Attributes¶
decoder instance-attribute ¶
Functions¶
forward ¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (batch_size, *feature_dim) | required |

Returns:

| Type | Description |
|---|---|
| Tensor | Output tensor of shape (batch_size, *obs_shape) |
Source code in src/ml_networks/torch/vision.py
ConvNet ¶
Bases: Module
Convolutional Neural Network for Encoder.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| obs_shape | tuple[int, int, int] | Shape of the input tensor. | required |
| cfg | ConvNetConfig | Configuration of the network. | required |
Examples:
>>> obs_shape = (3, 64, 64)
>>> cfg = ConvNetConfig(
... channels=[16, 32, 64],
... conv_cfgs=[
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ]
... )
>>> encoder = ConvNet(obs_shape, cfg)
>>> encoder
ConvNet(
(conv): Sequential(
(0): ConvNormActivation(
(conv): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pixel_shuffle): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
(1): ConvNormActivation(
(conv): Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pixel_shuffle): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
(2): ConvNormActivation(
(conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(norm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pixel_shuffle): Identity()
(activation): Activation(
(activation): ReLU()
)
(dropout): Identity()
)
)
)
>>> x = torch.randn(2, *obs_shape)
>>> y = encoder(x)
>>> y.shape
torch.Size([2, 64, 8, 8])
Source code in src/ml_networks/torch/vision.py
Attributes¶
conved_shape property ¶
Get the shape of the output tensor after the convolutional layers.
Returns:

| Type | Description |
|---|---|
| tuple[int, int] | Shape of the output tensor. |
Examples:
>>> obs_shape = (3, 64, 64)
>>> cfg = ConvNetConfig(
... channels=[64, 32, 16],
... conv_cfgs=[
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ConvConfig(kernel_size=3, stride=2, padding=1, activation="ReLU", norm="batch", dropout=0.0),
... ]
... )
>>> encoder = ConvNet(obs_shape, cfg)
>>> encoder.conved_shape
(8, 8)
conved_size property ¶
Get the size of the output tensor after the convolutional layers.
Returns:

| Type | Description |
|---|---|
| int | Size of the output tensor. |
Functions¶
forward ¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (batch_size, *obs_shape) | required |

Returns:

| Type | Description |
|---|---|
| Tensor | Output tensor of shape (batch_size, self.last_channel, *self.conved_shape) |
Source code in src/ml_networks/torch/vision.py
ResNetPixUnshuffle ¶
Bases: Module
ResNet with PixelUnshuffle for Encoder.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| obs_shape | tuple[int, int, int] | Shape of the input tensor. | required |
| cfg | ResNetConfig | Configuration of the network. | required |
Examples:
>>> obs_shape = (3, 64, 64)
>>> cfg = ResNetConfig(
... conv_channel=64,
... conv_kernel=3,
... f_kernel=3,
... conv_activation="ReLU",
... out_activation="ReLU",
... n_res_blocks=2,
... scale_factor=2,
... n_scaling=3,
... norm="batch",
... norm_cfg={},
... dropout=0.0
... )
>>> encoder = ResNetPixUnshuffle(obs_shape, cfg)
>>> x = torch.randn(2, *obs_shape)
>>> y = encoder(x)
>>> y.shape
torch.Size([2, 64, 8, 8])
Source code in src/ml_networks/torch/vision.py
Attributes¶
conved_shape property ¶
Get the shape of the output tensor after the convolutional layers.
Returns:

| Type | Description |
|---|---|
| tuple[int, int] | Shape of the output tensor. |

conved_size property ¶
Get the size of the output tensor after the convolutional layers.
Returns:

| Type | Description |
|---|---|
| int | Size of the output tensor. |
Functions¶
forward ¶
Forward pass.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor of shape (batch_size, *obs_shape) | required |

Returns:

| Type | Description |
|---|---|
| Tensor | Output tensor of shape (batch_size, self.last_channel, *self.conved_shape) |
Source code in src/ml_networks/torch/vision.py
Distributions¶
Distribution ¶
Bases: Module
A distribution function.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| in_dim | int | Input dimension. | required |
| dist | Literal['normal', 'categorical', 'bernoulli'] | Distribution type. | required |
| n_groups | int | Number of groups. Default is 1. Used for the categorical and Bernoulli distributions. | 1 |
| spherical | bool | Whether to project samples to the unit sphere. Default is False. Used for the categorical and Bernoulli distributions. If True and dist == "categorical", samples are projected from {0, 1} to {-1, 1}. If True and dist == "bernoulli", samples are projected from {0, 1} to the unit sphere. Refer to https://arxiv.org/abs/2406.07548. | False |
Examples:
>>> dist = Distribution(10, "normal")
>>> data = torch.randn(2, 20)
>>> posterior = dist(data)
>>> posterior.__class__.__name__
'NormalStoch'
>>> posterior.shape
NormalShape(mean=torch.Size([2, 10]), std=torch.Size([2, 10]), stoch=torch.Size([2, 10]))
>>> dist = Distribution(10, "categorical", n_groups=2)
>>> data = torch.randn(2, 10)
>>> posterior = dist(data)
>>> posterior.__class__.__name__
'CategoricalStoch'
>>> posterior.shape
CategoricalShape(logits=torch.Size([2, 2, 5]), probs=torch.Size([2, 2, 5]), stoch=torch.Size([2, 10]))
>>> dist = Distribution(10, "bernoulli", n_groups=2)
>>> data = torch.randn(2, 10)
>>> posterior = dist(data)
>>> posterior.__class__.__name__
'BernoulliStoch'
>>> posterior.shape
BernoulliShape(logits=torch.Size([2, 2, 5]), probs=torch.Size([2, 2, 5]), stoch=torch.Size([2, 10]))
Source code in src/ml_networks/torch/distributions.py
Functions¶
bernoulli ¶
Source code in src/ml_networks/torch/distributions.py
categorical ¶
Source code in src/ml_networks/torch/distributions.py
deterministic_onehot ¶
Compute the one-hot vector by argmax.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| input | Tensor | Input tensor. | required |

Returns:

| Type | Description |
|---|---|
| Tensor | One-hot vector. |
Examples:
>>> input = torch.arange(6).reshape(2, 3) / 5.0
>>> dist = Distribution(3, "categorical")
>>> onehot = dist.deterministic_onehot(input)
>>> onehot
tensor([[0., 0., 1.],
[0., 0., 1.]])
Source code in src/ml_networks/torch/distributions.py
forward ¶
Compute the posterior distribution.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Tensor | Input tensor. | required |
| deterministic | bool | Whether to use deterministic mode. Default is False. If True and dist == "normal", the mean is returned. If True and dist == "categorical", the one-hot vector computed by argmax is returned. If True and dist == "bernoulli", 1 is returned where x > 0.5 and 0 where x <= 0.5. | False |
| inv_tmp | float | Inverse temperature. Default is 1.0. Used for the categorical and Bernoulli distributions. | 1.0 |

Returns:

| Type | Description |
|---|---|
| StochState | Posterior distribution. |
Source code in src/ml_networks/torch/distributions.py
normal ¶
Source code in src/ml_networks/torch/distributions.py
NormalStoch dataclass ¶
Parameters of a normal distribution and its stochastic sample.
Attributes:

| Name | Type | Description |
|---|---|---|
| mean | Tensor | Mean of the normal distribution. |
| std | Tensor | Standard deviation of the normal distribution. |
| stoch | Tensor | Sample from the normal distribution via the reparametrization trick. |
Functions¶
__getattr__ ¶
Apply any torch.Tensor method called on this object to each member.
Example: normal.flatten() → NormalStoch(mean.flatten(), std.flatten(), stoch.flatten()).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Method name. | required |

Returns:

| Type | Description |
|---|---|
| callable | A function that applies the torch.Tensor method to each member. |

Raises:

| Type | Description |
|---|---|
| AttributeError | If the given name is not a torch.Tensor method. |
Source code in src/ml_networks/torch/distributions.py
__getitem__ ¶
Index access.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| idx | int or slice or tuple | Index specification. | required |

Returns:

| Type | Description |
|---|---|
| NormalStoch | The NormalStoch corresponding to the given index. |
Source code in src/ml_networks/torch/distributions.py
__len__ ¶
__post_init__ ¶
Post-initialization validation.
Raises:

| Type | Description |
|---|---|
| ValueError | |
Source code in src/ml_networks/torch/distributions.py
get_distribution ¶
save ¶
Save the parameters of the normal distribution to the specified path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
path
|
str
|
Path to save the parameters. |
required |
Source code in src/ml_networks/torch/distributions.py
squeeze ¶
Squeeze the parameters of the normal distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dim
|
int
|
Dimension to squeeze. |
required |
Returns:
| Type | Description |
|---|---|
NormalStoch
|
Squeezed normal distribution. |
Source code in src/ml_networks/torch/distributions.py
unsqueeze ¶
Unsqueeze the parameters of the normal distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dim
|
int
|
Dimension to unsqueeze. |
required |
Returns:
| Type | Description |
|---|---|
NormalStoch
|
Unsqueezed normal distribution. |
Source code in src/ml_networks/torch/distributions.py
CategoricalStoch dataclass ¶
Parameters of a categorical distribution and its stochastic sample.
Attributes:

| Name | Type | Description |
|---|---|---|
| logits | Tensor | Logits of the categorical distribution. |
| probs | Tensor | Probabilities of the categorical distribution. |
| stoch | Tensor | Sample from the categorical distribution via the Straight-Through Estimator. |
Functions¶
__getattr__ ¶
Apply any torch.Tensor method called on this object to each member.
Example: normal.flatten() → NormalStoch(mean.flatten(), std.flatten(), stoch.flatten()).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Method name. | required |

Returns:

| Type | Description |
|---|---|
| callable | A function that applies the torch.Tensor method to each member. |

Raises:

| Type | Description |
|---|---|
| AttributeError | If the given name is not a torch.Tensor method. |
Source code in src/ml_networks/torch/distributions.py
__getitem__ ¶
Index access.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| idx | int or slice or tuple | Index specification. | required |

Returns:

| Type | Description |
|---|---|
| CategoricalStoch | The CategoricalStoch corresponding to the given index. |
Source code in src/ml_networks/torch/distributions.py
__len__ ¶
__post_init__ ¶
Post-initialization validation.
Raises:

| Type | Description |
|---|---|
| ValueError | |
Source code in src/ml_networks/torch/distributions.py
get_distribution ¶
save ¶
Save the parameters of the categorical distribution to the specified path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
path
|
str
|
Path to save the parameters. |
required |
Source code in src/ml_networks/torch/distributions.py
squeeze ¶
Squeeze the parameters of the categorical distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dim
|
int
|
Dimension to squeeze. |
required |
Returns:
| Type | Description |
|---|---|
CategoricalStoch
|
Squeezed categorical distribution. |
Source code in src/ml_networks/torch/distributions.py
unsqueeze ¶
Unsqueeze the parameters of the categorical distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dim
|
int
|
Dimension to unsqueeze. |
required |
Returns:
| Type | Description |
|---|---|
CategoricalStoch
|
Unsqueezed categorical distribution. |
Source code in src/ml_networks/torch/distributions.py
Loss Functions¶
focal_loss ¶
Focal loss function. Mainly for multi-class classification.
Reference
Focal Loss for Dense Object Detection https://arxiv.org/abs/1708.02002
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prediction | Tensor | The predicted tensor. This should be before softmax. | required |
| target | Tensor | The target tensor. | required |
| gamma | float | The gamma parameter. Default is 2.0. | 2.0 |
| sum_dim | int | The dimension over which to sum the loss. Default is -1. | -1 |

Returns:

| Type | Description |
|---|---|
| Tensor | The focal loss. |
Source code in src/ml_networks/torch/loss.py
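The documented behavior can be sketched in plain PyTorch. This is a minimal reimplementation sketch of the documented signature, not the library's source; `focal_loss_sketch` is a hypothetical name, and the one-hot target format is an assumption.

```python
import torch
import torch.nn.functional as F

def focal_loss_sketch(prediction: torch.Tensor, target: torch.Tensor,
                      gamma: float = 2.0, sum_dim: int = -1) -> torch.Tensor:
    """Focal loss for multi-class classification (sketch of the documented behavior).

    prediction: raw logits (before softmax); target: one-hot tensor of the same shape.
    """
    log_p = F.log_softmax(prediction, dim=sum_dim)
    p = log_p.exp()
    # Down-weight well-classified examples by (1 - p)^gamma (Lin et al., 2017).
    loss = -((1.0 - p) ** gamma) * log_p * target
    return loss.sum(dim=sum_dim)

logits = torch.tensor([[2.0, 0.5, -1.0]])
target = torch.tensor([[1.0, 0.0, 0.0]])
print(focal_loss_sketch(logits, target))  # small: the correct class already dominates
```

With gamma = 0 this reduces to the standard cross-entropy loss; increasing gamma shrinks the contribution of already-confident predictions.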
charbonnier ¶
Charbonnier loss function.
Reference
A General and Adaptive Robust Loss Function http://arxiv.org/abs/1701.03077
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prediction | Tensor | The predicted tensor. | required |
| target | Tensor | The target tensor. | required |
| epsilon | float | A small value to avoid division by zero. Default is 1e-3. | 0.001 |
| alpha | float | The alpha parameter. Default is 1. | 1 |
| sum_dim | int \| list[int] \| tuple[int, ...] \| None | The dimension(s) over which to sum the loss. Default is None (sums over [-1, -2, -3]). | None |

Returns:

| Type | Description |
|---|---|
| Tensor | The Charbonnier loss. |
Source code in src/ml_networks/torch/loss.py
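The Charbonnier loss is a smooth, robust variant of the L1 loss. A sketch of the documented formula (not the library's source; `charbonnier_sketch` is a hypothetical name):

```python
import torch

def charbonnier_sketch(prediction: torch.Tensor, target: torch.Tensor,
                       epsilon: float = 1e-3, alpha: float = 1.0,
                       sum_dim=None) -> torch.Tensor:
    """Charbonnier loss (sketch): sqrt((x - y)^2 + epsilon^2)^alpha, summed over sum_dim."""
    if sum_dim is None:
        sum_dim = [-1, -2, -3]  # documented default: sum over (C, H, W)
    diff2 = (prediction - target) ** 2
    loss = (diff2 + epsilon ** 2) ** (alpha / 2.0)
    return loss.sum(dim=sum_dim)

pred = torch.zeros(1, 3, 4, 4)
target = torch.zeros(1, 3, 4, 4)
print(charbonnier_sketch(pred, target))  # tensor([0.0480]) — epsilon * 48 elements
```

For residuals much larger than epsilon the loss behaves like L1; near zero it is quadratic and differentiable, which is why it is popular for image restoration.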
FocalFrequencyLoss ¶
FocalFrequencyLoss(loss_weight=1.0, alpha=1.0, patch_factor=1, ave_spectrum=False, log_matrix=False, batch_matrix=False)
The torch.nn.Module class that implements focal frequency loss.
A frequency domain loss function for optimizing generative models.
Reference
Focal Frequency Loss for Image Reconstruction and Synthesis. In ICCV 2021. https://arxiv.org/pdf/2012.12821.pdf
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| loss_weight | float | Weight for the focal frequency loss. Default: 1.0 | 1.0 |
| alpha | float | The scaling factor alpha of the spectrum weight matrix, for flexibility. Default: 1.0 | 1.0 |
| patch_factor | int | The factor used to crop image patches for patch-based focal frequency loss. Default: 1 | 1 |
| ave_spectrum | bool | Whether to use the minibatch average spectrum. Default: False | False |
| log_matrix | bool | Whether to adjust the spectrum weight matrix by logarithm. Default: False | False |
| batch_matrix | bool | Whether to calculate the spectrum weight matrix using batch-based statistics. Default: False | False |
Source code in src/ml_networks/torch/loss.py
Functions¶
__call__ ¶
Forward function to calculate focal frequency loss.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| pred | Tensor | Predicted tensor of shape (N, C, H, W). | required |
| target | Tensor | Target tensor of shape (N, C, H, W). | required |
| matrix | Tensor \| None | Predefined spectrum weight matrix. Default: None (calculated online, dynamic). | None |
| mean_batch | bool | Whether to average over the batch dimension. | True |

Returns:

| Type | Description |
|---|---|
| Tensor | The focal frequency loss. |
Source code in src/ml_networks/torch/loss.py
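The core idea (weight each spectrum error by its own magnitude) can be sketched in a few lines of PyTorch. This is a simplified sketch of the loss described above, omitting patch_factor, ave_spectrum, log_matrix, and batch_matrix; `ffl_sketch` is a hypothetical name, not the library's implementation.

```python
import torch

def ffl_sketch(pred: torch.Tensor, target: torch.Tensor,
               alpha: float = 1.0, loss_weight: float = 1.0) -> torch.Tensor:
    """Focal frequency loss (simplified sketch): focal weighting in the 2D frequency domain."""
    f_pred = torch.fft.fft2(pred, norm="ortho")
    f_target = torch.fft.fft2(target, norm="ortho")
    diff = f_pred - f_target
    dist = diff.real ** 2 + diff.imag ** 2            # squared frequency distance
    weight = dist.sqrt() ** alpha                     # spectrum weight matrix
    weight = weight / weight.amax(dim=(-2, -1), keepdim=True).clamp(min=1e-12)
    # Detach the weight so it focuses the gradient without being optimized itself.
    return loss_weight * (weight.detach() * dist).mean()

x = torch.randn(2, 3, 8, 8)
print(ffl_sketch(x, x).item())  # 0.0 — identical images have no spectrum error
```

Normalizing the weight matrix to [0, 1] per image and detaching it reproduces the "focal" behavior: hard-to-synthesize frequencies dominate the gradient.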
loss_formulation ¶
Source code in src/ml_networks/torch/loss.py
tensor2freq ¶
Source code in src/ml_networks/torch/loss.py
Utilities¶
get_optimizer ¶
Get optimizer from torch.optim or pytorch_optimizer.
Args:
- param (Iterator[nn.Parameter]): Parameters of the models to optimize.
- name (str): Optimizer name.
- kwargs (dict): Optimizer arguments (settings).
Returns:

| Type | Description |
|---|---|
| Optimizer | The constructed optimizer. |
Examples:
>>> get_optimizer([nn.Parameter(torch.randn(1, 3))], "Adam", lr=0.01)
Adam (
Parameter Group 0
amsgrad: False
betas: (0.9, 0.999)
capturable: False
differentiable: False
eps: 1e-08
foreach: None
fused: None
lr: 0.01
maximize: False
weight_decay: 0
)
Source code in src/ml_networks/torch/torch_utils.py
torch_fix_seed ¶
Fix random seeds for reproducibility.
References
- https://qiita.com/north_redwing/items/1e153139125d37829d2d
Source code in src/ml_networks/torch/torch_utils.py
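A typical seed-fixing recipe, following the reference linked above, looks like the sketch below. This is an illustration of the usual pattern, not the actual torch_fix_seed source; the exact set of flags it sets may differ, and `fix_seed_sketch` is a hypothetical name.

```python
import os
import random

def fix_seed_sketch(seed: int = 42) -> None:
    """Fix random seeds across Python, NumPy, and PyTorch (sketch of the common recipe)."""
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Trade speed for determinism in cuDNN kernels.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass

fix_seed_sketch(0)
a = random.random()
fix_seed_sketch(0)
assert random.random() == a  # the same seed reproduces the same draw
```

Note that full determinism on GPU also depends on the operations used; some CUDA kernels remain nondeterministic regardless of seeding.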
save_blosc2 ¶
Save numpy array with blosc2 compression.
Args:
- path (str): Path to save to.
- x (np.ndarray): Numpy array to save.
Examples:
Source code in src/ml_networks/utils.py
load_blosc2 ¶
Load numpy array with blosc2 compression.
Args:
- path (str): Path to load from.
Returns:

| Type | Description |
|---|---|
| ndarray | Numpy array. |
Examples: