Augmentations
Subpackages
Transforms
- class Downscale(scale_min: float = 0.25, scale_max: float = 0.25, interpolation: int | Interpolation | Dict[str, int] | None = None, always_apply: bool = False, p: float = 0.5)[source]
Bases:
ImageOnlyTransform
Decreases image quality by downscaling and upscaling back.
- Parameters:
scale_min (float) – lower bound on the image scale. Should be < 1.
scale_max (float) – upper bound on the image scale. Should be < 1.
interpolation (int, dict, Interpolation) –
scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Could be:
Single Scipy interpolation flag: The selected method will be used for both downscale and upscale.
dict of flags: Dictionary with keys ‘downscale’ and ‘upscale’ specifying the interpolation flags for each operation.
Interpolation object: Downscale.Interpolation object with flags for both downscale and upscale.
Default: Interpolation(downscale=dicaugment.INTER_NEAREST, upscale=dicaugment.INTER_NEAREST)
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
- Image types:
uint8, uint16, int16, int32, float32
- class Equalize(range: int | Tuple[int, int] | None = None, mask: ndarray | callable | None = None, mask_params: Sequence[str] = (), always_apply: bool = False, p: float = 0.5)[source]
Bases:
ImageOnlyTransform
Equalize the image histogram. For multi-channel images, each channel is processed individually.
- Parameters:
range (int, list of int) – Histogram range. If int, then range is defined as [0, range]. If None, the range is calculated as [0, max(img)]. Default: None
mask (np.ndarray, callable) – If given, only the pixels selected by the mask are included in the analysis. Function signature must include image argument.
mask_params (list of str) – Params for mask function.
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
- Image types:
uint8, uint16, int16
- apply(image: ndarray, mask: None | ndarray = None, **params) ndarray [source]
Applies the transformation to the image
- get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any] [source]
Returns additional parameters needed for the apply methods that depend on a target (e.g. apply_to_bboxes method expects image size)
- get_transform_init_args_names() Tuple[str, ...] [source]
Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))
- property targets_as_params: List[str]
Returns a list of target names (e.g. ‘image’) that are needed as a parameter input to other apply methods (e.g. apply_to_bboxes(…, image = image))
- class FromFloat(dtype: str = 'int16', min_value: float | None = None, max_value: float | None = None, always_apply=False, p=1.0)[source]
Bases:
ImageOnlyTransform
Take an input array where all values should lie in the range [0, 1.0], multiply them by max_value and then cast the resulted value to a type specified by dtype. If max_value is None the transform will try to infer the maximum value for the data type from the dtype argument.
This is the inverse transform for ToFloat.
- Parameters:
min_value (float) – minimum possible input value. Default: None.
max_value (float) – maximum possible input value. Default: None.
dtype (string or numpy data type) – data type of the output. See the ‘Data types’ page from the NumPy docs. Default: ‘int16’.
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 1.0.
- Targets:
image
- Image types:
float32
- class GaussNoise(var_limit: float | Tuple[float, float] = (10.0, 50.0), mean: float = 0, apply_to_channel_idx: int | None = None, per_channel: bool = True, always_apply: bool = False, p: float = 0.5)[source]
Bases:
ImageOnlyTransform
Apply Gaussian noise to the input image.
- Parameters:
var_limit ((float, float) or float) – variance range for noise. If var_limit is a single float, the range will be (0, var_limit). Default: (10.0, 50.0).
mean (float) – mean of the noise. Default: 0
apply_to_channel_idx (int, None) – If not None, noise is applied only to the specified channel index. Default: None
per_channel (bool) – if set to True, noise will be sampled for each channel independently. Otherwise, the noise will be sampled once for all channels. Ignored if apply_to_channel_idx is not None. Default: True
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
- Image types:
uint8, uint16, int16, float32
- apply(img: ndarray, gauss: None | ndarray = None, **params) ndarray [source]
Applies the transformation to the image
- get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any] [source]
Returns additional parameters needed for the apply methods that depend on a target (e.g. apply_to_bboxes method expects image size)
- get_transform_init_args_names() Tuple[str, ...] [source]
Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))
- property targets_as_params: List[str]
Returns a list of target names (e.g. ‘image’) that are needed as a parameter input to other apply methods (e.g. apply_to_bboxes(…, image = image))
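The parameter rules for GaussNoise can be sketched in plain Python. This is a minimal illustration under stated assumptions, not dicaugment's implementation: the helper names sample_gauss_noise and add_noise are hypothetical, the standard deviation is assumed to be the square root of the sampled variance, and clipping to the image dtype is left out.

```python
import math
import random


def sample_gauss_noise(n_pixels, var_limit=(10.0, 50.0), mean=0.0, rng=None):
    """Draw one variance from var_limit, then n_pixels Gaussian samples."""
    rng = rng or random.Random()
    # A single float is treated as the range (0, var_limit), as documented.
    if isinstance(var_limit, (int, float)):
        var_limit = (0.0, float(var_limit))
    var = rng.uniform(*var_limit)
    sigma = math.sqrt(var)  # assumption: sigma is the square root of the variance
    noise = [rng.gauss(mean, sigma) for _ in range(n_pixels)]
    return noise, var


def add_noise(pixels, noise):
    """Add the noise element-wise (no dtype clipping in this sketch)."""
    return [p + g for p, g in zip(pixels, noise)]


noise, var = sample_gauss_noise(4, rng=random.Random(0))
noisy = add_noise([100, 100, 100, 100], noise)
```

Passing a seeded random.Random instance keeps the sketch reproducible.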
- class InvertImg(always_apply: bool = False, p: float = 0.5)[source]
Bases:
ImageOnlyTransform
Invert the input image by subtracting pixel values from the maximum value for the input image dtype.
- Parameters:
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
- Image types:
uint8, uint16, int16, float32
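The inversion rule can be sketched in plain Python for unsigned integer data. This is an illustration only, not the library's code; invert_pixels and DTYPE_MAX are hypothetical names, and signed or floating point dtypes are not covered.

```python
# Maximum representable value for two unsigned dtypes (illustrative table).
DTYPE_MAX = {"uint8": 255, "uint16": 65535}


def invert_pixels(pixels, dtype="uint8"):
    """Subtract every pixel value from the maximum value of the dtype."""
    max_value = DTYPE_MAX[dtype]
    return [max_value - v for v in pixels]


print(invert_pixels([0, 100, 255]))  # [255, 155, 0]
```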
- class Normalize(mean: None | float | Tuple[float] = None, std: None | float | Tuple[float] = None, always_apply: bool = False, p: float = 1.0)[source]
Bases:
ImageOnlyTransform
Normalization is applied by the formula: img = (img - mean) / (std)
- Parameters:
mean (None, float, list of float) – mean values along channel dimension. If None, mean is calculated per image at runtime.
std (None, float, list of float) – std values along channel dimension. If None, std is calculated per image at runtime.
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 1.0.
- Targets:
image
- Image types:
uint8, float32
- class PixelDropout(dropout_prob: float = 0.01, per_channel: bool = False, drop_value: float | Sequence[float] | None = 0, mask_drop_value: float | Sequence[float] | None = None, always_apply: bool = False, p: float = 0.5)[source]
Bases:
DualTransform
Set pixels to 0 with some probability.
- Parameters:
dropout_prob (float) – pixel drop probability. Default: 0.01
per_channel (bool) – if set to True, the drop mask will be sampled for each channel; otherwise the same mask will be used for all channels. Default: False
drop_value (number or sequence of numbers or None) – Value that will be set in dropped place. If set to None, the value will be sampled randomly from the following default ranges:
uint8: [0, 255]
uint16: [0, 65535]
uint32: [0, 4294967295]
int16: [-32768, 32767]
int32: [-2147483648, 2147483647]
float, double: [0, 1]
Default: 0
mask_drop_value (number or sequence of numbers or None) – Value that will be set in dropped place in masks. If set to None, masks will be unchanged. Default: None
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image, mask
- Image types:
any
- apply(img: ndarray, drop_mask: ndarray = array(None, dtype=object), drop_value: float | Sequence[float] = (), **params) ndarray [source]
Applies the transformation to the image
- apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float] [source]
Applies the augmentation to a bbox
- apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float] [source]
Applies the augmentation to a keypoint
- apply_to_mask(img: ndarray, drop_mask: ndarray = array(None, dtype=object), **params) ndarray [source]
Applies the augmentation to a mask and forces INTER_NEAREST interpolation
- get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any] [source]
Returns additional parameters needed for the apply methods that depend on a target (e.g. apply_to_bboxes method expects image size)
- get_transform_init_args_names() Tuple[str, ...] [source]
Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))
- property targets_as_params: List[str]
Returns a list of target names (e.g. ‘image’) that are needed as a parameter input to other apply methods (e.g. apply_to_bboxes(…, image = image))
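A minimal sketch of the dropout idea, assuming each pixel is dropped independently with probability dropout_prob. The function name pixel_dropout and the flat-list representation are hypothetical and only serve the illustration; the library operates on arrays.

```python
import random


def pixel_dropout(pixels, dropout_prob=0.01, drop_value=0, rng=None):
    """Replace each pixel with drop_value with probability dropout_prob."""
    rng = rng or random.Random()
    # True marks a pixel that gets dropped.
    drop_mask = [rng.random() < dropout_prob for _ in pixels]
    dropped = [drop_value if m else p for p, m in zip(pixels, drop_mask)]
    return dropped, drop_mask


dropped, drop_mask = pixel_dropout([5, 6, 7, 8], dropout_prob=0.5,
                                   rng=random.Random(0))
```

Returning the mask alongside the result mirrors the idea that the same drop_mask can be reused for a segmentation mask.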
- class Posterize(num_bits=8, always_apply=False, p=0.5)[source]
Bases:
ImageOnlyTransform
Reduce the number of bits for each color channel.
- Parameters:
num_bits ((int, int) or int, or list of ints [r, g, b], or list of ints [[r1, r2], [g1, g2], [b1, b2]]) – number of high bits. If num_bits is a single value, the range will be [num_bits, num_bits]. Must be in range [0, n] where n is the number of bits in the image dtype. Default: 8.
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
- Image types:
uint8, uint16, int16, int32
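Keeping only the high bits can be illustrated with a bit mask. The sketch below assumes 8-bit unsigned pixels; posterize_uint8 is a hypothetical name, and other dtypes would need a different bit width.

```python
def posterize_uint8(pixels, num_bits):
    """Keep the num_bits highest bits of each 8-bit pixel, zeroing the rest."""
    if not 0 <= num_bits <= 8:
        raise ValueError("num_bits must be in [0, 8] for uint8 data")
    # For example, num_bits=3 gives the mask 0b11100000.
    mask = (0xFF << (8 - num_bits)) & 0xFF
    return [v & mask for v in pixels]


print(posterize_uint8([200, 255, 31], 3))  # [192, 224, 0]
```

With num_bits=8 the mask is 0xFF and the pixels are unchanged; with num_bits=0 every pixel becomes 0.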
- class RandomBrightnessContrast(max_brightness: int | float | None = None, brightness_limit: float | Tuple[float, float] = 0.2, contrast_limit: float | Tuple[float, float] = 0.2, always_apply: bool = False, p: bool = 0.5)[source]
Bases:
ImageOnlyTransform
Randomly change brightness and contrast of the input image.
- Parameters:
max_brightness (int,float,None) – If not None, adjust contrast by specified maximum and clip to maximum, else adjust contrast by image mean. Default: None
brightness_limit ((float, float) or float) – factor range for changing brightness. If limit is a single float, the range will be (-limit, limit). Default: (-0.2, 0.2).
contrast_limit ((float, float) or float) – factor range for changing contrast. If limit is a single float, the range will be (-limit, limit). Default: (-0.2, 0.2).
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
- Image types:
uint8, uint16, int16, float32
- class RandomGamma(gamma_limit=(80, 120), always_apply=False, p=0.5)[source]
Bases:
ImageOnlyTransform
- Parameters:
gamma_limit (float or (float, float)) – If gamma_limit is a single float value, the range will be (-gamma_limit, gamma_limit). Default: (80, 120).
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
- Image types:
uint8, float32
- class Sharpen(alpha: Tuple[float, float] | float = (0.2, 0.5), lightness: Tuple[float, float] | float = (0.5, 1.0), mode: str = 'constant', cval: float | int = 0, always_apply=False, p=0.5)[source]
Bases:
ImageOnlyTransform
Sharpens the input image and overlays the result with the original image.
- Parameters:
alpha ((float, float)) – range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5).
lightness ((float, float)) – range to choose the lightness of the sharpened image. Default: (0.5, 1.0).
mode (str) –
scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:
reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.
constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.
nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.
mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.
wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.
Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html
Default: constant
cval (int,float) – The fill value when mode = constant. Default: 0
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
- class ToFloat(min_value: float | None = None, max_value: float | None = None, always_apply=False, p=1.0)[source]
Bases:
ImageOnlyTransform
Divide pixel values by max_value to get a float32 output array where all values lie in the range [0, 1.0]. If max_value is None the transform will try to infer the maximum value by inspecting the data type of the input image.
See also
FromFloat
- Parameters:
min_value (float) – minimum possible input value. Default: None.
max_value (float) – maximum possible input value. Default: None.
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 1.0.
- Targets:
image
- Image types:
any type
- class UnsharpMask(blur_limit: int | Sequence[int] = (3, 7), sigma_limit: float | Sequence[float] = 0.0, alpha: float | Sequence[float] = (0.2, 0.5), threshold: float = 0.05, mode: str = 'constant', cval: int | float = 0, always_apply: bool = False, p: float = 0.5)[source]
Bases:
ImageOnlyTransform
Sharpens the input image using unsharp masking and overlays the result with the original image.
- Parameters:
blur_limit (int, (int, int)) – maximum Gaussian kernel size for blurring the input image. Must be zero or odd and in range [0, inf). If set to 0, it will be computed from sigma as round(sigma * 4 * 2) + 1. If set to a single value, blur_limit will be in range (0, blur_limit). Default: (3, 7).
sigma_limit (float, (float, float)) – Gaussian kernel standard deviation. Must be in range [0, inf). If set to a single value, sigma_limit will be in range (0, sigma_limit). If set to 0, sigma will be computed as sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8. Default: 0.
alpha (float, (float, float)) – range to choose the visibility of the sharpened image. At 0, only the original image is visible, at 1.0 only its sharpened version is visible. Default: (0.2, 0.5).
threshold (float) – Value to limit sharpening only for areas with high pixel difference between original image and its smoothed version. Higher threshold means less sharpening on flat areas. Must be in range [0, 1]. Default: 0.05.
mode (str) –
scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:
reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.
constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.
nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.
mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.
wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.
Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html
Default: constant
cval (int,float) – The fill value when mode = constant. Default: 0
always_apply (bool) – whether to always apply the transformation. Default: False
p (float) – probability of applying the transform. Default: 0.5.
- Targets:
image
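The two fallback formulas quoted for blur_limit and sigma_limit can be written out directly. The helper names are hypothetical; only the arithmetic comes from the parameter descriptions.

```python
def kernel_size_from_sigma(sigma):
    """Kernel size used when blur_limit is 0: round(sigma * 4 * 2) + 1."""
    return round(sigma * 4 * 2) + 1


def sigma_from_kernel_size(ksize):
    """Sigma used when sigma_limit is 0: 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8


print(kernel_size_from_sigma(1.0))  # 9
print(sigma_from_kernel_size(3))    # 0.8
```

Note that Python's built-in round rounds ties to the nearest even integer, which may differ from other rounding conventions for values ending in .5.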
Functional
- brightness_contrast_adjust(img: ndarray, alpha: float | int = 1, beta: float | int = 0, max_brightness: int | float | None = None) ndarray [source]
Adjusts the brightness and/or contrast of an image
- Parameters:
img (np.ndarray) – an image
alpha (int,float) – The contrast parameter
beta (int,float) – The brightness parameter
max_brightness (int,float,None) – If not None, adjust contrast by specified maximum and clip to maximum, else adjust contrast by image mean. Default: None
- convolve(img: ndarray, kernel: ndarray, mode: str = 'constant', cval: int | float = 0) ndarray [source]
Applies a convolutional kernel to an image
- Parameters:
img (np.ndarray) – an image
kernel (np.ndarray) – a kernel to convolve over image
mode (str) –
scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:
reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.
constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.
nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.
mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.
wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.
Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html
Default: constant
cval (int,float) – The fill value when mode = constant. Default: 0
- Returns:
the convolved image
- Return type:
np.ndarray
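The five border modes are easiest to compare on a single row. The sketch below reproduces the diagrams from the mode description in plain Python; extend_row is a hypothetical helper written for this illustration and is not how scipy or dicaugment extend arrays internally.

```python
def extend_row(row, pad, mode="constant", cval=0):
    """Extend a 1-D row by pad elements on each side using the given mode."""
    n = len(row)
    out = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            out.append(row[i])
        elif mode == "constant":
            out.append(cval)
        elif mode == "nearest":
            out.append(row[min(max(i, 0), n - 1)])
        elif mode == "wrap":
            out.append(row[i % n])
        elif mode == "reflect":  # half-sample symmetric, edge value repeated
            j = i % (2 * n)
            out.append(row[j if j < n else 2 * n - 1 - j])
        elif mode == "mirror":  # whole-sample symmetric, edge value not repeated
            j = i % max(2 * n - 2, 1)
            out.append(row[j if j < n else 2 * n - 2 - j])
        else:
            raise ValueError(f"unknown mode: {mode!r}")
    return out


row = list("abcd")
print("".join(extend_row(row, 4, "reflect")))  # dcbaabcddcba
print("".join(extend_row(row, 3, "mirror")))   # dcbabcdcba
```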
- downscale(img: ndarray, scale: float, down_interpolation: int = 1, up_interpolation: int = 1) ndarray [source]
Decreases image quality by downscaling and upscaling back.
- Parameters:
img (np.ndarray) – an image
scale (float) – the scale to downsize to
down_interpolation (int, Interpolation) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST)
up_interpolation (int, Interpolation) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST)
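To see why downscaling and upscaling back loses detail, here is a one-dimensional sketch using nearest-neighbour resampling. It is an assumption-laden illustration: resize_nearest and downscale_row are hypothetical names, and the index convention used here may differ from the scipy interpolation the library relies on.

```python
def resize_nearest(row, new_len):
    """Resample a 1-D row to new_len elements by nearest-neighbour lookup."""
    n = len(row)
    return [row[min(int(i * n / new_len), n - 1)] for i in range(new_len)]


def downscale_row(row, scale):
    """Shrink the row by scale, then stretch it back to the original length."""
    small = resize_nearest(row, max(1, round(len(row) * scale)))
    return resize_nearest(small, len(row))


print(downscale_row([0, 1, 2, 3, 4, 5, 6, 7], 0.25))  # [0, 0, 0, 0, 4, 4, 4, 4]
```

The output keeps the original length, but only two of the eight distinct values survive.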
- equalize(img, hist_range=None, mask=None)[source]
Equalize the image histogram.
- Parameters:
img (numpy.ndarray) – image.
hist_range (tuple) – The histogram range
mask (numpy.ndarray) – An optional mask. If given, only the pixels selected by the mask are included in the analysis. May be a 1-channel or 3-channel array.
- Returns:
Equalized image.
- Return type:
numpy.ndarray
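For readers unfamiliar with histogram equalization, the following plain-Python sketch shows the textbook cumulative-histogram approach on a flat list of non-negative integers. It is a generic illustration, not the library's implementation; equalize_pixels is a hypothetical name, and mask handling is omitted.

```python
from itertools import accumulate


def equalize_pixels(pixels, hist_range=None):
    """Spread integer pixel values over hist_range using the cumulative histogram."""
    # With no range given, use [0, max(img)], as described for Equalize.
    lo, hi = (0, max(pixels)) if hist_range is None else hist_range
    hist = [0] * (hi - lo + 1)
    for v in pixels:
        hist[v - lo] += 1
    cdf = list(accumulate(hist))
    cdf_min = next(c for c in cdf if c > 0)
    total = cdf[-1]
    if total == cdf_min:  # only one distinct value, nothing to equalize
        return list(pixels)
    scale = (hi - lo) / (total - cdf_min)
    # Lookup table mapping each input value to its equalized value.
    lut = [lo + round(max(c - cdf_min, 0) * scale) for c in cdf]
    return [lut[v - lo] for v in pixels]


stretched = equalize_pixels([10, 10, 10, 11, 12], hist_range=(0, 255))
```

In this example the three narrow input levels are stretched so that the smallest maps to 0 and the largest to 255.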
- from_float(img: ndarray, dtype: str, min_value: int | float | None = None, max_value: int | float | None = None) ndarray [source]
Convert an image from a floating point image, to the specified dtype
- Parameters:
img (np.ndarray) – an image
dtype (str) – a dtype to cast to. Must be one of {uint8, uint16, uint32, float32, int16, int32, float64}
min_value (int,float,None) – Optional custom minimum value of dtype. Maps lower bound of float32 (0.0) to this value.
max_value (int,float,None) – Optional custom maximum value of dtype. Maps upper bound of float32 (1.0) to this value.
- Returns:
image cast to dtype
- Return type:
np.ndarray
- Raises:
RuntimeError – if dtype is not one of {uint8, uint16, uint32, float32, int16, int32, float64}
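The min_value and max_value mapping can be sketched as a linear rescale. The linear form and the rounding step are assumptions made for this illustration, and from_float_pixels is a hypothetical name.

```python
def from_float_pixels(pixels, min_value, max_value):
    """Map floats in [0.0, 1.0] linearly onto [min_value, max_value]."""
    span = max_value - min_value
    # 0.0 maps to min_value and 1.0 maps to max_value.
    return [round(min_value + p * span) for p in pixels]


print(from_float_pixels([0.0, 0.25, 1.0], -1000, 3000))  # [-1000, 0, 3000]
```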
- gamma_transform(img: ndarray, gamma: float) ndarray [source]
Performs a Gamma correction on an image.
- Parameters:
img (np.ndarray) – an image
gamma (float) – gamma parameter
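The documentation does not spell out the formula, so the sketch below assumes the usual power-law form of gamma correction on values already scaled to [0, 1]. The name gamma_correct is hypothetical.

```python
def gamma_correct(pixels, gamma):
    """Apply out = in ** gamma to pixel values in [0.0, 1.0]."""
    return [p ** gamma for p in pixels]


print(gamma_correct([0.0, 0.25, 1.0], 0.5))  # [0.0, 0.5, 1.0]
```

Under this assumption, a gamma below 1 brightens mid-tones and a gamma above 1 darkens them, while 0 and 1 stay fixed.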
- gauss_noise(image, gauss)[source]
Adds noise to an image.
- Parameters:
image (np.ndarray) – an image
gauss (np.ndarray) – Gaussian noise array to add to the image
- invert(img: ndarray) ndarray [source]
Inverts the pixel values of an image.
- Parameters:
img (np.ndarray) – an image
- multiply(img, multiplier)[source]
- Parameters:
img (numpy.ndarray) – Image.
multiplier (numpy.ndarray) – Multiplier coefficient.
- Returns:
Image multiplied by multiplier coefficient.
- Return type:
numpy.ndarray
- normalize(img: ndarray, mean: float | ndarray | None, std: float | ndarray | None) ndarray [source]
Normalizes an image by the formula img = (img - mean) / (std).
- Parameters:
img (np.ndarray) – an image
mean (float, np.ndarray, None) – The offset for the image. If None, mean is calculated as the mean of the image. If np.ndarray, operation can be broadcast across dimensions.
std (float, np.ndarray, None) – The standard deviation to divide the image by. If None, std is calculated as the std of the image. If np.ndarray, operation can be broadcast across dimensions.
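The formula img = (img - mean) / (std) can be tried on a flat list of values. This sketch assumes the population standard deviation when std is None; normalize_pixels is a hypothetical name, and broadcasting over channels is not shown.

```python
import statistics


def normalize_pixels(pixels, mean=None, std=None):
    """Apply (p - mean) / std, computing either statistic from the data if None."""
    if mean is None:
        mean = statistics.fmean(pixels)
    if std is None:
        std = statistics.pstdev(pixels)  # assumption: population standard deviation
    return [(p - mean) / std for p in pixels]


normalized = normalize_pixels([2, 4, 4, 4, 5, 5, 7, 9])
```

For this input the mean is 5 and the population standard deviation is 2, so the first value maps to about -1.5.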
- to_float(img, min_value=None, max_value=None)[source]
Convert an image to a floating point image based on current dtype
- Parameters:
img (np.ndarray) – an image
min_value (int,float,None) – Optional custom minimum value of dtype. Maps this value to the lower bound of float32 (0.0).
max_value (int,float,None) – Optional custom maximum value of dtype. Maps this value to the upper bound of float32 (1.0).
- Returns:
image cast to float32
- Return type:
np.ndarray
- Raises:
RuntimeError – if image dtype is not one of {uint8, uint16, uint32, float32, int16, int32, float64}
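A linear rescale illustrates how min_value and max_value map onto [0.0, 1.0]. The linear form is an assumption for this sketch, and to_float_pixels is a hypothetical name.

```python
def to_float_pixels(pixels, min_value, max_value):
    """Map values in [min_value, max_value] linearly onto [0.0, 1.0]."""
    span = max_value - min_value
    # min_value maps to 0.0 and max_value maps to 1.0.
    return [(p - min_value) / span for p in pixels]


print(to_float_pixels([-1000, 0, 3000], -1000, 3000))  # [0.0, 0.25, 1.0]
```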
- unsharp_mask(image: ndarray, ksize: int, sigma: float = 0.0, alpha: float = 0.2, threshold: float = 0.05, mode: str = 'constant', cval: float | int = 0)[source]
Sharpens the input image using unsharp masking and overlays the result with the original image.
- Parameters:
image (np.ndarray) – an image
ksize (int) – The size of the Gaussian kernel. If 0, then ksize is estimated as round(sigma * 8) + 1
sigma (float) – Gaussian kernel standard deviation. If 0, then sigma is estimated as 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
alpha (float) – visibility of sharpened image
threshold (float) – Value to limit sharpening only for areas with high pixel difference between original image and its smoothed version
mode (str) –
scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:
reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.
constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the cval parameter.
nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.
mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.
wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.
Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html
Default: constant
cval (int,float) – The fill value when mode = constant. Default: 0
Utils
- angle_2pi_range(func: Callable[[Concatenate[Tuple[float, float, float, float], P]], Tuple[float, float, float, float]]) Callable[[Concatenate[Tuple[float, float, float, float], P]], Tuple[float, float, float, float]] [source]
Decorator method that keeps keypoint angles in the range of [0, 2pi]
- clip(img: ndarray, dtype: dtype, minval: float, maxval: float) ndarray [source]
Clips an image by a minimum and maximum value, then casts to dtype
- clipped(func: Callable[[Concatenate[ndarray, P]], ndarray]) Callable[[Concatenate[ndarray, P]], ndarray] [source]
Decorator method that clips an image to its specified dtype minimums and maximums
- ensure_contiguous(func: Callable[[Concatenate[ndarray, P]], ndarray]) Callable[[Concatenate[ndarray, P]], ndarray] [source]
Decorator that ensures input img is contiguous.
- is_grayscale_image(image: ndarray) bool [source]
Returns whether image fits the criteria for a volumetric grayscale image
- is_multispectral_image(image: ndarray) bool [source]
Returns whether image fits the criteria for a volumetric multispectral image
- is_rgb_image(image: ndarray) bool [source]
Returns whether image fits the criteria for a volumetric rgb image
- preserve_channel_dim(func: Callable[[Concatenate[ndarray, P]], ndarray]) Callable[[Concatenate[ndarray, P]], ndarray] [source]
Decorator that preserves a dummy channel dim.
- preserve_shape(func: Callable[[Concatenate[ndarray, P]], ndarray]) Callable[[Concatenate[ndarray, P]], ndarray] [source]
Decorator that preserves the shape of the image
- read_dcm_image(path: str, include_header: bool = True, ends_with: str = '')[source]
Reads in an alphabetically sorted series of dcm file types stored in a directory as a np.ndarray and optionally a dicom header in a dict format.
- Parameters:
path (str) – The filepath to the directory that stores the dcm files.
include_header (bool) – Whether to return the dicom header metadata associated with the scan. Default: True
ends_with (str) – If empty string, then all files in directory will be processed. If multiple file types are within the directory, you may filter the results by setting ends_with=".dcm". Default: ""
Note
- DICOM object types are dictionaries with the following keys:
- PixelSpacing (tuple)
The space in mm between pixels for both height and width of a slice, respectively
- RescaleIntercept (float)
The value to add to each pixel of the scan after scaling with RescaleSlope to turn the pixel values of the scan into Hounsfield Units (HU)
- RescaleSlope (float)
The value to multiply each pixel of the scan by before adding RescaleIntercept to turn the pixel values of the scan into Hounsfield Units (HU)
- ConvolutionKernel (str)
A label describing the convolution kernel or algorithm used to reconstruct the data
- XRayTubeCurrent (int)
X-Ray Tube Current in mA.
See example below:
dicom = {
    "PixelSpacing": (0.5, 0.5),
    "RescaleIntercept": -1024.0,
    "RescaleSlope": 1.0,
    "ConvolutionKernel": "STANDARD",
    "XRayTubeCurrent": 160,
}
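The RescaleSlope and RescaleIntercept entries describe a linear conversion of stored pixel values to Hounsfield Units. A minimal sketch of that conversion follows; to_hounsfield is a hypothetical helper written for this illustration and is not part of the documented API.

```python
def to_hounsfield(pixels, header):
    """Convert stored pixel values to HU: pixel * RescaleSlope + RescaleIntercept."""
    slope = header["RescaleSlope"]
    intercept = header["RescaleIntercept"]
    return [p * slope + intercept for p in pixels]


dicom = {
    "PixelSpacing": (0.5, 0.5),
    "RescaleIntercept": -1024.0,
    "RescaleSlope": 1.0,
    "ConvolutionKernel": "STANDARD",
    "XRayTubeCurrent": 160,
}

print(to_hounsfield([0, 1024, 2048], dicom))  # [-1024.0, 0.0, 1024.0]
```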