Geometric

Transforms

class Flip(always_apply: bool = False, p: float = 0.5)[source]

Bases: DualTransform

Flip the input horizontally, vertically, along the z-axis, or along all three axes.

Parameters:

p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, float32

apply(img: ndarray, d: int = 0, **params) ndarray[source]

Applies the transformation to the image

Parameters:
  • img (np.ndarray) – an image

  • d (int) – code that specifies how to flip the input. 0 for vertical flipping, 1 for horizontal flipping, 2 for z-axis flip, or -1 for vertical, horizontal, and z-axis flipping
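The flip codes above can be illustrated with a minimal NumPy sketch. The function name `flip_sketch` is hypothetical, and the axis layout (axis 0 = rows, axis 1 = columns, axis 2 = slices) is an assumption rather than something this reference states.

```python
import numpy as np


def flip_sketch(img: np.ndarray, d: int = 0) -> np.ndarray:
    """Flip a 3-D volume according to the flip code d (illustrative only)."""
    # Assumed layout: axis 0 = rows (height), axis 1 = cols (width), axis 2 = slices (depth).
    if d == 0:
        return np.flip(img, axis=0)  # vertical flip: reverse the row order
    if d == 1:
        return np.flip(img, axis=1)  # horizontal flip: reverse the column order
    if d == 2:
        return np.flip(img, axis=2)  # z-axis flip: reverse the slice order
    if d == -1:
        return np.flip(img, axis=(0, 1, 2))  # flip along all three axes
    raise ValueError(f"Invalid flip code {d}; expected one of -1, 0, 1, 2.")


volume = np.arange(24).reshape(2, 3, 4)
flipped = flip_sketch(volume, d=1)
```

Flipping twice with the same code returns the original volume, which is a quick sanity check for any flip implementation.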

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_params()[source]

Returns parameters needed for the apply methods

get_transform_init_args_names()[source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

class HorizontalFlip(always_apply: bool = False, p: float = 0.5)[source]

Bases: DualTransform

Flip the input horizontally around the y-axis.

Parameters:

p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, float32

apply(img: ndarray, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_transform_init_args_names()[source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

class PadIfNeeded(min_height: int | None = 1024, min_width: int | None = 1024, min_depth: int | None = 1024, pad_height_divisor: int | None = None, pad_width_divisor: int | None = None, pad_depth_divisor: int | None = None, position: PositionType | str = PositionType.CENTER, border_mode: int = 'constant', value: float | int = 0, mask_value: float | int = 0, always_apply: bool = False, p: float = 1.0)[source]

Bases: DualTransform

Pad each side of the image if that side is smaller than the desired minimum size.

Parameters:
  • min_height (int) – minimal result image height.

  • min_width (int) – minimal result image width.

  • min_depth (int) – minimal result image depth.

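  • pad_height_divisor (int) – if not None, ensures image height is divisible by the value of this argument.

  • pad_width_divisor (int) – if not None, ensures image width is divisible by the value of this argument.

  • pad_depth_divisor (int) – if not None, ensures image depth is divisible by the value of this argument.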

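  • position (Union[str, PositionType]) – Position of the image. Should be one of the PositionType members: PositionType.CENTER, PositionType.FRONT_TOP_LEFT, PositionType.FRONT_TOP_RIGHT, PositionType.FRONT_BOTTOM_LEFT, PositionType.FRONT_BOTTOM_RIGHT, PositionType.BACK_TOP_LEFT, PositionType.BACK_TOP_RIGHT, PositionType.BACK_BOTTOM_LEFT, PositionType.BACK_BOTTOM_RIGHT, or PositionType.RANDOM. Default: PositionType.CENTER.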

  • border_mode (OpenCV flag) – OpenCV border mode.

  • value (int, float, list of int, list of float) – padding value if border_mode is “constant”.

  • mask_value (int, float, list of int, list of float) – padding value for mask if border_mode is “constant”.

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 1.0.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, float32
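How much padding a single dimension needs can be sketched as follows. This is an illustration of the min_* and pad_*_divisor parameters for a centered position only; the helper name `center_pad_amounts` and the choice to put the extra voxel on the trailing side are assumptions, not the library's confirmed behavior.

```python
from typing import Optional, Tuple


def center_pad_amounts(
    size: int, min_size: Optional[int] = None, divisor: Optional[int] = None
) -> Tuple[int, int]:
    """Return (pad_before, pad_after) for one dimension, centered (illustrative)."""
    if min_size is not None:
        # Pad only when the side is smaller than the requested minimum.
        total = max(min_size - size, 0)
    elif divisor is not None:
        # Pad up to the next multiple of the divisor (0 if already a multiple).
        total = (-size) % divisor
    else:
        raise ValueError("Either min_size or divisor must be provided.")
    before = total // 2
    # Assumption: any odd remainder goes on the trailing side.
    return before, total - before


pad_top, pad_bottom = center_pad_amounts(100, min_size=128)
pad_left, pad_right = center_pad_amounts(100, divisor=32)
```

The same helper would be applied once per dimension (height, width, depth) to obtain the six pad_* values that the apply methods receive.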

class PositionType(value)[source]

Bases: Enum

An enumeration.

BACK_BOTTOM_LEFT = 'back_bottom_left'
BACK_BOTTOM_RIGHT = 'back_bottom_right'
BACK_TOP_LEFT = 'back_top_left'
BACK_TOP_RIGHT = 'back_top_right'
CENTER = 'center'
FRONT_BOTTOM_LEFT = 'front_bottom_left'
FRONT_BOTTOM_RIGHT = 'front_bottom_right'
FRONT_TOP_LEFT = 'front_top_left'
FRONT_TOP_RIGHT = 'front_top_right'
RANDOM = 'random'
apply(img: ndarray, pad_top: int = 0, pad_bottom: int = 0, pad_left: int = 0, pad_right: int = 0, pad_front: int = 0, pad_back: int = 0, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], pad_top: int = 0, pad_bottom: int = 0, pad_left: int = 0, pad_right: int = 0, pad_front: int = 0, pad_back: int = 0, rows: int = 0, cols: int = 0, slices: int = 0, **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_keypoint(keypoint: Tuple[float, float, float, float], pad_top: int = 0, pad_bottom: int = 0, pad_left: int = 0, pad_right: int = 0, pad_front: int = 0, pad_back: int = 0, **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

apply_to_mask(img: ndarray, pad_top: int = 0, pad_bottom: int = 0, pad_left: int = 0, pad_right: int = 0, pad_front: int = 0, pad_back: int = 0, **params) ndarray[source]

Applies the transformation to a mask

get_transform_init_args_names()[source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

update_params(params, **kwargs)[source]

Adds additional parameters that are defined at a per instance level

Parameters:
  • params (dict) – key-value pairs of argument names for the augmentation’s apply methods

  • kwargs (dict) – keyword arguments of targets (e.g. ‘image’, ‘bboxes’)

class ShiftScaleRotate(shift_limit: float = 0.0625, scale_limit: float = 0.1, rotate_limit: int | float = 45, axes: str = 'xy', interpolation: int = 1, border_mode: str = 'constant', crop_to_border: bool = False, value: int | float = 0, mask_value: int | float = 0, shift_limit_x: Tuple[float, float] | float | None = None, shift_limit_y: Tuple[float, float] | float | None = None, shift_limit_z: Tuple[float, float] | float | None = None, rotate_method: str = 'largest_box', always_apply=False, p=0.5)[source]

Bases: DualTransform

Randomly apply affine transforms: translate, scale and rotate the input.

Parameters:
  • shift_limit ((float, float) or float) – shift factor range for height, width, and depth. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in range [0, 1]. Default: (-0.0625, 0.0625).

  • scale_limit ((float, float) or float) – scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).

  • rotate_limit ((int, int) or int) – rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: (-45, 45).

  • axes (str, list of str) – Defines the axis of rotation. Must be one of {‘xy’, ‘yz’, ‘xz’} or a list of them. If a single str is passed, then all rotations will occur on that axis. If a list is passed, then one axis of rotation will be chosen at random for each call of the transformation. Default: “xy”

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Default: dicaugment.INTER_LINEAR

  • border_mode (str) –

    Scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:

    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the value parameter (mask_value for masks).

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    See https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html Default: constant.

  • value (int or float) – padding value if border_mode is “constant”.

  • mask_value (int or float) – padding value if border_mode is “constant” applied for masks.

  • crop_to_border (bool) – If True, then the image is padded or cropped to fit the entire rotation. If False, then original image shape is maintained and some portions of the image may be cropped away. Note that any translations are applied after the image is reshaped. Default: False

  • shift_limit_x ((float, float) or float) – shift factor range for width. If it is set then this value instead of shift_limit will be used for shifting width. If shift_limit_x is a single float value, the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in the range [0, 1]. Default: None.

  • shift_limit_y ((float, float) or float) – shift factor range for height. If it is set then this value instead of shift_limit will be used for shifting height. If shift_limit_y is a single float value, the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie in the range [0, 1]. Default: None.

  • shift_limit_z ((float, float) or float) – shift factor range for depth. If it is set then this value instead of shift_limit will be used for shifting depth. If shift_limit_z is a single float value, the range will be (-shift_limit_z, shift_limit_z). Absolute values for lower and upper bounds should lie in the range [0, 1]. Default: None.

  • rotate_method (str) – rotation method used for the bounding boxes. Should be one of “largest_box” or “ellipse”. Default: “largest_box”

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, keypoints

Image types:

uint8, uint16, int16, float32
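The way the *_limit parameters expand into sampling ranges can be sketched as below. The helper names `to_range` and `sample_params` are hypothetical, and uniform sampling is an assumption; only the range rules (a single value becomes a symmetric range, and scale_limit is biased by 1) come from the parameter descriptions above.

```python
import random
from typing import Dict, Optional, Tuple, Union

Limit = Union[float, Tuple[float, float]]


def to_range(limit: Limit, bias: float = 0.0) -> Tuple[float, float]:
    """Turn a scalar limit into (-limit, limit) and add an optional bias."""
    if isinstance(limit, (int, float)):
        low, high = -limit, limit
    else:
        low, high = limit
    return low + bias, high + bias


def sample_params(
    shift_limit: Limit = 0.0625,
    scale_limit: Limit = 0.1,
    rotate_limit: Limit = 45,
    rng: Optional[random.Random] = None,
) -> Dict[str, float]:
    """Draw one set of parameters (uniform sampling is an assumption)."""
    rng = rng or random.Random()
    return {
        "angle": rng.uniform(*to_range(rotate_limit)),
        # scale_limit is biased by 1, so 0.1 samples from (0.9, 1.1).
        "scale": rng.uniform(*to_range(scale_limit, bias=1.0)),
        "dx": rng.uniform(*to_range(shift_limit)),
        "dy": rng.uniform(*to_range(shift_limit)),
        "dz": rng.uniform(*to_range(shift_limit)),
    }


params = sample_params(rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the draw reproducible, which is convenient when comparing augmented outputs across runs.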

apply(img: ndarray, angle: float = 0, axes: str = 'xy', scale: float = 0, dx: float = 0, dy: float = 0, dz: float = 0, interpolation: int = 1, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], angle: float = 0, axes: str = 'xy', scale: float = 0, dx: float = 0, dy: float = 0, dz: float = 0, **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_dicom(dicom: Dict[str, Any], scale: float = 1, **params) Dict[str, Any][source]

Applies the augmentation to a dicom type

apply_to_keypoint(keypoint: Tuple[float, float, float, float], angle: float = 0, axes: str = 'xy', scale: float = 0, dx: float = 0, dy: float = 0, dz: float = 0, rows: int = 0, cols: int = 0, slices: int = 0, **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

apply_to_mask(img: ndarray, angle: float = 0, axes: str = 'xy', scale: float = 0, dx: float = 0, dy: float = 0, dz: float = 0, **params) ndarray[source]

Applies the transformation to a mask and forces INTER_NEAREST interpolation

get_params() Dict[str, Any][source]

Returns parameters needed for the apply methods

get_transform_init_args() Dict[str, Any][source]

Returns initialization arguments (e.g. Transform(arg1 = 1, arg2 = 2) -> {‘arg1’: 1, ‘arg2’: 2})

class SliceFlip(always_apply: bool = False, p: float = 0.5)[source]

Bases: DualTransform

Flip the input along the slice dimension.

Parameters:

p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, float32

apply(img: ndarray, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_transform_init_args_names()[source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

class Transpose(always_apply: bool = False, p: float = 0.5)[source]

Bases: DualTransform

Transpose the input by swapping rows and columns. Slice dimension remains unaffected

Parameters:

p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, float32

apply(img: ndarray, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_dicom(dicom: Dict[str, Any], **params) Dict[str, Any][source]

Applies the augmentation to a dicom type

apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_transform_init_args_names()[source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

class VerticalFlip(always_apply: bool = False, p: float = 0.5)[source]

Bases: DualTransform

Flip the input vertically around the x-axis.

Parameters:

p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, float32

apply(img: ndarray, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_transform_init_args_names()[source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

class RandomRotate90(axes: str = 'xy', always_apply=False, p=0.5)[source]

Bases: DualTransform

Randomly rotate the input by 90 degrees zero or more times.

Parameters:
  • axes (str, list of str) – Defines the axis of rotation. Must be one of {‘xy’, ‘yz’, ‘xz’} or a list of them. If a single str is passed, then all rotations will occur on that axis. If a list is passed, then one axis of rotation will be chosen at random for each call of the transformation.

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, float32
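The per-call choices described for RandomRotate90 (a number of quarter turns and, when a list is given, one plane of rotation) might be drawn as in this sketch. The function `sample_rot90_params` is hypothetical and only mirrors the parameter description; it is not the library's get_params implementation.

```python
import random
from typing import Dict, List, Optional, Union

VALID_AXES = ("xy", "yz", "xz")


def sample_rot90_params(
    axes: Union[str, List[str]] = "xy", rng: Optional[random.Random] = None
) -> Dict[str, Union[int, str]]:
    """Pick a rotation factor and a rotation plane for one call (illustrative)."""
    rng = rng or random.Random()
    # A single string means every call rotates in that plane.
    choices = [axes] if isinstance(axes, str) else list(axes)
    for name in choices:
        if name not in VALID_AXES:
            raise ValueError(f"Invalid axes {name!r}; expected one of {VALID_AXES}.")
    return {
        "factor": rng.randint(0, 3),  # zero to three 90-degree rotations
        "axes": rng.choice(choices),  # one plane chosen at random per call
    }


drawn = sample_rot90_params(["xy", "yz"], rng=random.Random(0))
```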

apply(img: ndarray, factor: int = 0, axes: str = 'xy', **params) ndarray[source]

Applies the transformation to the image

Parameters:
  • img (np.ndarray) – an image

  • factor (int) – number of times the input will be rotated by 90 degrees.

  • axes (str) – the axes to rotate along

apply_to_bbox(bbox: Tuple[float, float, float, float], factor: int = 0, axes: str = 'xy', **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_keypoint(keypoint: Tuple[float, float, float, float], factor: int = 0, axes: str = 'xy', **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_params() Dict[source]

Returns parameters needed for the apply methods

get_transform_init_args_names() Tuple[str, ...][source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

class Rotate(limit: float = 90, axes: str = 'xy', interpolation: int = 1, border_mode: str = 'constant', value: int | float = 0, mask_value: int | float = 0, rotate_method: str = 'largest_box', crop_to_border: bool = False, always_apply=False, p=0.5)[source]

Bases: DualTransform

Rotate the input by an angle selected randomly from the uniform distribution.

Parameters:
  • limit ((int, int) or int) – range from which a random angle is picked. If limit is a single int an angle is picked from (-limit, limit). Default: (-90, 90)

  • axes (str, list of str) – Defines the axis of rotation. Must be one of {‘xy’, ‘yz’, ‘xz’} or a list of them. If a single str is passed, then all rotations will occur on that axis. If a list is passed, then one axis of rotation will be chosen at random for each call of the transformation.

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Default: dicaugment.INTER_LINEAR

  • border_mode (str) –

    scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:

    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the value parameter (mask_value for masks).

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html

    Default: constant

  • value (int or float) – The fill value when border_mode = constant. Default: 0.

  • mask_value (int, float, list of ints, list of float) – The fill value when border_mode = constant applied for masks. Default: 0.

  • rotate_method (str) – rotation method used for the bounding boxes. Should be one of “largest_box” or “ellipse”. Default: “largest_box”

  • crop_to_border (bool) – If True, then the image is cropped to fit the entire rotation. If False, then original image shape is maintained and some portions of the image may be cropped away. Default: False

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, float32
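The five border modes listed above can be reproduced on a one-dimensional sequence with a short pure-Python sketch. The function `extend_1d` is hypothetical and exists only to make the diagrams concrete; the library itself delegates border handling to scipy.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def extend_1d(seq: Sequence[T], pad: int, mode: str, cval: T = None) -> List[T]:
    """Extend seq by pad items on each side following the given border mode."""
    n = len(seq)
    out = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            out.append(seq[i])  # inside the original input
        elif mode == "constant":
            out.append(cval)  # fill with the constant value
        elif mode == "nearest":
            out.append(seq[min(max(i, 0), n - 1)])  # replicate the edge item
        elif mode == "wrap":
            out.append(seq[i % n])  # continue from the opposite edge
        elif mode == "reflect":
            j = i % (2 * n)  # reflect about the edge of the last item
            out.append(seq[j] if j < n else seq[2 * n - 1 - j])
        elif mode == "mirror":
            period = max(2 * (n - 1), 1)  # reflect about the center of the last item
            j = i % period
            out.append(seq[j] if j < n else seq[period - j])
        else:
            raise ValueError(f"Unknown border mode: {mode!r}")
    return out


print("".join(extend_1d("abcd", 4, "reflect")))  # dcbaabcddcba
print("".join(extend_1d("abcd", 3, "mirror")))   # dcbabcdcba
```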

apply(img: ndarray, angle: float = 0, axes: str = 'xy', **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], angle: float = 0, axes: str = 'xy', cols: int = 0, rows: int = 0, slices: int = 0, **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox

apply_to_keypoint(keypoint: Tuple[float, float, float, float], angle: float = 0, axes: str = 'xy', cols: int = 0, rows: int = 0, slices: int = 0, **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

apply_to_mask(img: ndarray, angle: float = 0, axes: str = 'xy', **params) ndarray[source]

Applies the transformation to a mask and forces INTER_NEAREST interpolation

get_params_dependent_on_targets(params: Dict[str, Any]) Dict[str, Any][source]

Returns additional parameters needed for the apply methods that depend on a target (e.g. apply_to_bboxes method expects image size)

get_transform_init_args_names() Tuple[str, ...][source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

property targets_as_params: List[str]

Returns a list of target names (e.g. ‘image’) that are needed as a parameter input to other apply methods (e.g. apply_to_bboxes(…, image = image))

class LongestMaxSize(max_size: int | Sequence[int] = 1024, interpolation: int = 1, always_apply: bool = False, p: float = 1)[source]

Bases: DualTransform

Rescale an image so that maximum side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:
  • max_size (int, list of int) – maximum size of the image after the transformation. When using a list, max size will be randomly selected from the values in the list.

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Default: dicaugment.INTER_LINEAR

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, float32
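The output shape implied by "maximum side is equal to max_size, keeping the aspect ratio" can be worked out as in the sketch below. The helper `longest_max_size_shape` is hypothetical; treating depth as one of the sides and rounding to the nearest integer are both assumptions.

```python
from typing import Tuple


def longest_max_size_shape(
    shape: Tuple[int, int, int], max_size: int = 1024
) -> Tuple[int, ...]:
    """Scale (rows, cols, slices) so the longest side equals max_size."""
    # One scale factor for every dimension keeps the aspect ratio.
    scale = max_size / max(shape)
    # Assumption: fractional sizes are rounded to the nearest integer.
    return tuple(round(dim * scale) for dim in shape)


upscaled = longest_max_size_shape((512, 256, 128), max_size=1024)
downscaled = longest_max_size_shape((2048, 1024, 512), max_size=1024)
```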

apply(img: ndarray, max_size: int = 1024, interpolation: int = 1, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox. Bounding box coordinates are scale invariant

apply_to_dicom(dicom: Dict[str, Any], max_size: int = 1024, **params) Dict[str, Any][source]

Applies the augmentation to a dicom type

apply_to_keypoint(keypoint: Tuple[float, float, float, float], max_size: int = 1024, **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_params() Dict[str, int][source]

Returns parameters needed for the apply methods

get_transform_init_args_names() Tuple[str, ...][source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

class RandomScale(scale_limit: float = 0.1, interpolation: int = 1, always_apply=False, p=0.5)[source]

Bases: DualTransform

Randomly resize the input. Output image size is different from the input image size.

Parameters:
  • scale_limit ((float, float) or float) – scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Default: dicaugment.INTER_LINEAR

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 0.5.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, float32

apply(img, scale=0, interpolation=1, **params)[source]

Applies the transformation to the image

apply_to_bbox(bbox, **params)[source]

Applies the transformation to a bbox. Bounding box coordinates are scale invariant

apply_to_dicom(dicom: Dict[str, Any], scale=1, **params) Dict[str, Any][source]

Applies the augmentation to a dicom type

apply_to_keypoint(keypoint, scale=1, **params)[source]

Applies the transformation to a keypoint

get_params()[source]

Returns parameters needed for the apply methods

get_transform_init_args()[source]

Returns initialization arguments (e.g. Transform(arg1 = 1, arg2 = 2) -> {‘arg1’: 1, ‘arg2’: 2})

class Resize(height: int, width: int, depth: int, interpolation: int = 1, always_apply=False, p=1)[source]

Bases: DualTransform

Resize the input to the given height, width, depth.

Parameters:
  • height (int) – desired height of the output.

  • width (int) – desired width of the output.

  • depth (int) – desired depth of the output.

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Default: dicaugment.INTER_LINEAR

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, float32

apply(img: ndarray, interpolation: int = 1, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox. Bounding box coordinates are scale invariant
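Why a resize leaves bounding boxes untouched can be checked numerically, assuming boxes are stored in normalized coordinates (each coordinate divided by the matching image dimension). That storage convention and the helper names `normalize_bbox` and `resize_pixel_bbox` are assumptions made for this illustration.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float, float, float]
Shape = Tuple[int, int, int]  # (rows, cols, slices)


def normalize_bbox(bbox: BBox, shape: Shape) -> BBox:
    """Divide pixel coordinates by the matching dimension."""
    rows, cols, slices = shape
    x_min, y_min, z_min, x_max, y_max, z_max = bbox
    return (x_min / cols, y_min / rows, z_min / slices,
            x_max / cols, y_max / rows, z_max / slices)


def resize_pixel_bbox(bbox: BBox, old_shape: Shape, new_shape: Shape) -> BBox:
    """Scale pixel coordinates the same way the image is scaled."""
    sy = new_shape[0] / old_shape[0]
    sx = new_shape[1] / old_shape[1]
    sz = new_shape[2] / old_shape[2]
    x_min, y_min, z_min, x_max, y_max, z_max = bbox
    return (x_min * sx, y_min * sy, z_min * sz,
            x_max * sx, y_max * sy, z_max * sz)


old_shape, new_shape = (64, 128, 32), (128, 64, 64)
pixel_box = (16, 8, 4, 64, 32, 16)
before = normalize_bbox(pixel_box, old_shape)
after = normalize_bbox(resize_pixel_bbox(pixel_box, old_shape, new_shape), new_shape)
# The normalized box is identical before and after the resize.
```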

apply_to_dicom(dicom: Dict[str, Any], **params) Dict[str, Any][source]

Applies the augmentation to a dicom type

apply_to_keypoint(keypoint: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_transform_init_args_names() Tuple[str, ...][source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

class SmallestMaxSize(max_size: int | Sequence[int] = 1024, interpolation: int = 1, always_apply: bool = False, p: float = 1)[source]

Bases: DualTransform

Rescale an image so that minimum side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:
  • max_size (int, list of int) – maximum size of smallest side of the image after the transformation. When using a list, max size will be randomly selected from the values in the list.

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Default: dicaugment.INTER_LINEAR

  • always_apply (bool) – whether to always apply the transformation. Default: False

  • p (float) – probability of applying the transform. Default: 1.

Targets:

image, mask, bboxes, keypoints

Image types:

uint8, uint16, int16, float32
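A sketch of the size computation for SmallestMaxSize follows, including the case where max_size is a list. The helper `smallest_max_size_shape` is hypothetical; counting depth among the sides, rounding to the nearest integer, and a uniform choice from the list are assumptions.

```python
import random
from typing import Optional, Sequence, Tuple, Union


def smallest_max_size_shape(
    shape: Tuple[int, int, int],
    max_size: Union[int, Sequence[int]] = 1024,
    rng: Optional[random.Random] = None,
) -> Tuple[int, ...]:
    """Scale (rows, cols, slices) so the smallest side equals the chosen max_size."""
    rng = rng or random.Random()
    # When several sizes are given, one is selected at random for this call.
    target = max_size if isinstance(max_size, int) else rng.choice(list(max_size))
    scale = target / min(shape)
    return tuple(round(dim * scale) for dim in shape)


fixed = smallest_max_size_shape((512, 256, 128), max_size=256)
varied = smallest_max_size_shape((512, 256, 128), max_size=[128, 256], rng=random.Random(0))
```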

apply(img: ndarray, max_size: int = 1024, interpolation: int = 1, **params) ndarray[source]

Applies the transformation to the image

apply_to_bbox(bbox: Tuple[float, float, float, float], **params) Tuple[float, float, float, float][source]

Applies the transformation to a bbox. Bounding box coordinates are scale invariant

apply_to_dicom(dicom: Dict[str, Any], max_size: int = 1024, **params) Dict[str, Any][source]

Applies the augmentation to a dicom type

apply_to_keypoint(keypoint: Tuple[float, float, float, float], max_size: int = 1024, **params) Tuple[float, float, float, float][source]

Applies the transformation to a keypoint

get_params() Dict[str, Any][source]

Returns parameters needed for the apply methods

get_transform_init_args_names() Tuple[str, ...][source]

Returns initialization argument names. (e.g. Transform(arg1 = 1, arg2 = 2) -> (‘arg1’, ‘arg2’))

Functional

bbox_flip(bbox: Tuple[float, float, float, float], d: int, rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Flip a bounding box vertically, horizontally, along the slice axis, or along all three axes, depending on the value of d.

Parameters:
  • bbox – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • d – flip code. 0 for a vertical flip, 1 for a horizontal flip, 2 for a z-axis flip, -1 for vertical, horizontal, and z-axis flips

  • rows – Image rows.

  • cols – Image cols.

  • slices – Image slices

Returns:

A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

Raises:

ValueError – if value of d is not -1, 0, 1, 2.
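The dispatch on d can be sketched for boxes in normalized coordinates, where mirroring an interval [a, b] gives [1 - b, 1 - a]. The `*_sketch` functions are hypothetical stand-ins, normalized coordinates are an assumption, and the rows, cols, and slices arguments of the real functions are omitted because the normalized form does not need them.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float, float, float]


def bbox_hflip_sketch(bbox: BBox) -> BBox:
    x_min, y_min, z_min, x_max, y_max, z_max = bbox
    return (1 - x_max, y_min, z_min, 1 - x_min, y_max, z_max)


def bbox_vflip_sketch(bbox: BBox) -> BBox:
    x_min, y_min, z_min, x_max, y_max, z_max = bbox
    return (x_min, 1 - y_max, z_min, x_max, 1 - y_min, z_max)


def bbox_zflip_sketch(bbox: BBox) -> BBox:
    x_min, y_min, z_min, x_max, y_max, z_max = bbox
    return (x_min, y_min, 1 - z_max, x_max, y_max, 1 - z_min)


def bbox_flip_sketch(bbox: BBox, d: int) -> BBox:
    """Dispatch on the flip code d (illustrative only)."""
    if d == 0:
        return bbox_vflip_sketch(bbox)
    if d == 1:
        return bbox_hflip_sketch(bbox)
    if d == 2:
        return bbox_zflip_sketch(bbox)
    if d == -1:
        # Flip along every axis by chaining the three single-axis flips.
        return bbox_zflip_sketch(bbox_vflip_sketch(bbox_hflip_sketch(bbox)))
    raise ValueError(f"Invalid d value {d}. Valid values are -1, 0, 1 and 2.")


box = (0.125, 0.25, 0.5, 0.375, 0.5, 0.75)
flipped_box = bbox_flip_sketch(box, 1)
```

Note that the minimum and maximum swap roles under mirroring, which is why the flipped x_min is computed from the original x_max.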

bbox_hflip(bbox: Tuple[float, float, float, float], rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Flip a bounding box horizontally around the y-axis.

Parameters:
  • bbox – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • rows – Image rows.

  • cols – Image cols.

  • slices – Image slices

Returns:

A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

bbox_rot90(bbox: Tuple[float, float, float, float], factor: int, axes: str, rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]
Rotates a bounding box by 90 degrees in the direction dictated by a right-handed coordinate system, i.e. from a top-level view of the xy plane:
  • Rotation around the z-axis: counterclockwise rotation

  • Rotation around the y-axis: left to right rotation

  • Rotation around the x-axis: bottom to top rotation

Parameters:
  • bbox – A bounding box tuple (x_min, y_min, z_min, x_max, y_max, z_max).

  • factor – Number of CCW rotations. Must be in set {0, 1, 2, 3}. See np.rot90.

  • axes – The axes that define the axis of rotation. Must be in {‘xy’, ‘yz’, ‘xz’}.

  • rows – Image rows.

  • cols – Image cols.

  • slices – Image depth.

Returns:

A bounding box tuple (x_min, y_min, z_min, x_max, y_max, z_max).

Return type:

tuple
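For the xy plane only, a quarter-turn of a normalized box can be sketched as below. It assumes normalized coordinates with x along columns and y along rows (pointing down), so one counterclockwise turn maps a point (x, y) to (y, 1 - x), matching what np.rot90 does to an array. The function `bbox_rot90_xy_sketch` is hypothetical and does not cover the ‘yz’ or ‘xz’ planes.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float, float, float]


def bbox_rot90_xy_sketch(bbox: BBox, factor: int) -> BBox:
    """Rotate a normalized box counterclockwise in the xy plane, factor times."""
    if factor not in {0, 1, 2, 3}:
        raise ValueError("Parameter factor must be in set {0, 1, 2, 3}")
    x_min, y_min, z_min, x_max, y_max, z_max = bbox
    for _ in range(factor):
        # One CCW quarter turn: (x, y) -> (y, 1 - x), applied to both corners.
        x_min, y_min, x_max, y_max = y_min, 1 - x_max, y_max, 1 - x_min
    # The z extent is unaffected by a rotation in the xy plane.
    return (x_min, y_min, z_min, x_max, y_max, z_max)


box = (0.125, 0.25, 0.0, 0.375, 0.5, 1.0)
quarter_turn = bbox_rot90_xy_sketch(box, 1)
```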

bbox_rotate(bbox: Tuple[float, float, float, float], angle: float, method: str, axes: str, crop_to_border: bool, rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Rotates a bounding box by angle degrees.

Parameters:
  • bbox – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • angle – Angle of rotation in degrees.

  • axes – The axis of rotation. Must be one of {‘xy’, ‘xz’, ‘yz’}.

  • crop_to_border – If True, bbox is normalized to fit new image shape. See rotate(crop_to_border=True)

  • method – Rotation method used. Should be one of: “largest_box”, “ellipse”. Default: “largest_box”.

  • rows – Image rows.

  • cols – Image cols.

  • slices – Image slices

Returns:

A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

References

https://arxiv.org/abs/2109.13488
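One plausible reading of the “largest_box” method is to rotate the four corners of the box and take their axis-aligned envelope. The sketch below does that in pixel coordinates for the xy plane only; the function name, the explicit rotation center, and the sign convention of the angle are assumptions, and the “ellipse” method is not shown.

```python
import math
from typing import Tuple

Box2D = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def rotate_bbox_largest_box_xy(
    bbox: Box2D, angle_deg: float, center: Tuple[float, float]
) -> Box2D:
    """Rotate the box corners about center and return their envelope."""
    x_min, y_min, x_max, y_max = bbox
    cx, cy = center
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    rotated = [
        (
            cx + (x - cx) * cos_t - (y - cy) * sin_t,
            cy + (x - cx) * sin_t + (y - cy) * cos_t,
        )
        for x, y in corners
    ]
    xs = [p[0] for p in rotated]
    ys = [p[1] for p in rotated]
    # The envelope is the smallest axis-aligned box containing all four corners.
    return (min(xs), min(ys), max(xs), max(ys))


rotated_box = rotate_bbox_largest_box_xy((10, 20, 30, 40), 45, center=(50, 50))
```

For angles that are not multiples of 90 degrees the envelope is larger than the original box, which is the trade-off the name “largest_box” suggests.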

bbox_shift_scale_rotate(bbox, angle, scale, dx, dy, dz, axes='xy', crop_to_border=False, rotate_method='largest_box', rows=0, cols=0, slices=0, **kwargs)[source]

Rotates, shifts and scales a bounding box. Rotation is made by angle degrees, scaling is made by the scale factor, and shifting is made by dx, dy, and dz.

Parameters:
  • bbox (tuple) – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • angle (int) – Angle of rotation in degrees.

  • scale (int) – Scale factor.

  • dx (int) – Shift along x-axis.

  • dy (int) – Shift along y-axis.

  • dz (int) – Shift along z-axis.

  • axes – The axis of rotation. Must be one of {‘xy’, ‘xz’, ‘yz’}.

  • crop_to_border – If True, bbox is normalized to fit new image shape. See rotate(crop_to_border=True)

  • rotate_method (str) – Rotation method used. Should be one of: “largest_box”, “ellipse”. Default: “largest_box”.

  • rows (int) – Image rows.

  • cols (int) – Image cols.

  • slices (int) – Image slices

Returns:

A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

bbox_transpose(bbox: Tuple[float, float, float, float], axis: int, rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Transposes a bounding box along given axis.

Parameters:
  • bbox – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • axis – 0 - main axis, 1 - secondary axis.

  • rows – Image rows.

  • cols – Image cols.

Returns:

A bounding box tuple (x_min, y_min, z_min, x_max, y_max, z_max).

Raises:

ValueError – If axis not equal to 0 or 1.

bbox_vflip(bbox: Tuple[float, float, float, float], rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Flip a bounding box vertically around the x-axis.

Parameters:
  • bbox – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • rows – Image rows.

  • cols – Image cols.

  • slices – Image slices

Returns:

A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

Return type:

tuple

bbox_zflip(bbox: Tuple[float, float, float, float], rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Flip a bounding box on the z-axis.

Parameters:
  • bbox – A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

  • rows – Image rows.

  • cols – Image cols.

  • slices – Image slices

Returns:

A bounding box (x_min, y_min, z_min, x_max, y_max, z_max).

hflip(img: ndarray) ndarray[source]

Horizontally flips an array

keypoint_flip(keypoint: Tuple[float, float, float, float], d: int, rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Flip a keypoint vertically, horizontally, along the slice axis, or along all three axes, depending on the value of d.

Parameters:
  • keypoint – A keypoint (x, y, z, angle, scale).

  • d – Flip code. Must be -1, 0, 1 or 2: 0 for a vertical flip, 1 for a horizontal flip, 2 for a z-axis flip, -1 for vertical, horizontal, and z-axis flips.

  • rows – Image height.

  • cols – Image width.

  • slices – Image depth

Returns:

A keypoint (x, y, z, angle, scale).

Raises:

ValueError – if value of d is not -1, 0, 1 or 2.
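How the keypoint coordinates are mirrored can be sketched as follows, assuming keypoints are in pixel coordinates so that a coordinate c along a dimension of size n maps to (n - 1) - c. The function `keypoint_flip_sketch` is hypothetical, and it deliberately leaves the angle unchanged; how the library updates the angle on a flip is not stated in this reference.

```python
from typing import Tuple

Keypoint = Tuple[float, float, float, float, float]  # (x, y, z, angle, scale)


def keypoint_flip_sketch(
    keypoint: Keypoint, d: int, rows: int, cols: int, slices: int
) -> Keypoint:
    """Mirror the x, y, z coordinates of a keypoint according to d."""
    if d not in (-1, 0, 1, 2):
        raise ValueError(f"Invalid d value {d}. Valid values are -1, 0, 1 and 2.")
    x, y, z, angle, scale = keypoint
    if d in (0, -1):
        y = (rows - 1) - y  # vertical flip mirrors the row coordinate
    if d in (1, -1):
        x = (cols - 1) - x  # horizontal flip mirrors the column coordinate
    if d in (2, -1):
        z = (slices - 1) - z  # z-axis flip mirrors the slice coordinate
    # Assumption: angle and scale are passed through unchanged in this sketch.
    return (x, y, z, angle, scale)


kp = (10, 20, 5, 0.0, 1.0)
flipped_kp = keypoint_flip_sketch(kp, -1, rows=100, cols=200, slices=50)
```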

keypoint_hflip(keypoint: Tuple[float, float, float, float], rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Flip a keypoint horizontally around the y-axis.

Parameters:
  • keypoint – A keypoint (x, y, z, angle, scale).

  • rows – Image height.

  • cols – Image width.

  • slices – Image depth

Returns:

A keypoint (x, y, z, angle, scale).

keypoint_rot90(keypoint: Tuple[float, float, float, float], factor: int, axes: str, rows: int, cols: int, slices: int, **params) Tuple[float, float, float, float][source]

Rotates a keypoint by 90 degrees in the direction dictated by a right-handed coordinate system, i.e. from a top-down view of the xy plane:
  • Rotation around the z-axis: counterclockwise rotation

  • Rotation around the y-axis: left-to-right rotation

  • Rotation around the x-axis: bottom-to-top rotation

Parameters:
  • keypoint – A keypoint (x, y, z, angle, scale).

  • factor – Number of CCW rotations. Must be in the range [0, 3]. See np.rot90.

  • axes – The axes that define the axis of rotation. Must be in {‘xy’, ‘yz’, ‘xz’}.

  • rows – Image height.

  • cols – Image width.

  • slices – Image depth.

Returns:

A keypoint (x, y, z, angle, scale).

Return type:

tuple

Raises:

ValueError – if factor not in set {0, 1, 2, 3}
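To make the counterclockwise convention concrete, the sketch below rotates a single 2D slice and a keypoint position together in the xy plane and checks that they stay aligned. It covers only the position of the keypoint for the ‘xy’ case; the angle component and the other planes are omitted. Both helper names are hypothetical.

```python
from typing import List, Tuple


def rot90_ccw(image: List[List[int]]) -> List[List[int]]:
    """Rotate a 2D nested list 90 degrees counterclockwise (like np.rot90)."""
    return [list(row) for row in zip(*image)][::-1]


def rot90_keypoint_xy(
    x: int, y: int, factor: int, rows: int, cols: int
) -> Tuple[int, int]:
    """Apply `factor` counterclockwise quarter turns to a keypoint position."""
    if factor not in (0, 1, 2, 3):
        raise ValueError("factor must be in {0, 1, 2, 3}")
    for _ in range(factor):
        # One CCW quarter turn: the old row becomes the new column.
        x, y = y, (cols - 1) - x
        # The image shape swaps after every quarter turn.
        rows, cols = cols, rows
    return x, y


image = [[0] * 4 for _ in range(3)]  # 3 rows, 4 cols
image[1][3] = 1                      # mark the keypoint at x=3, y=1
rotated = rot90_ccw(image)
new_x, new_y = rot90_keypoint_xy(3, 1, factor=1, rows=3, cols=4)
```

After the call, `rotated[new_y][new_x]` is the marked pixel, which is a quick way to sanity-check any keypoint rotation against the corresponding image rotation.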

keypoint_rotate(keypoint, angle: float, axes: str, crop_to_border: bool, rows: int, cols: int, slices: int, **params)[source]

Rotate a keypoint by angle.

Parameters:
  • keypoint (tuple) – A keypoint (x, y, z, angle, scale).

  • angle (float) – Rotation angle.

  • axes – The axis of rotation. Must be one of {‘xy’, ‘xz’, ‘yz’}.

  • crop_to_border – If True, the keypoint is adjusted to fit the new image shape. See rotate(crop_to_border=True)

  • rows (int) – Image height.

  • cols (int) – Image width.

  • slices – Image slices

Returns:

A keypoint (x, y, z, angle, scale).

Return type:

tuple

keypoint_scale(keypoint: Tuple[float, float, float, float], scale_x: float, scale_y: float, scale_z: float) Tuple[float, float, float, float][source]

Scales a keypoint by scale_x, scale_y, and scale_z.

Parameters:
  • keypoint – A keypoint (x, y, z, angle, scale).

  • scale_x – Scale coefficient x-axis.

  • scale_y – Scale coefficient y-axis.

  • scale_z – Scale coefficient z-axis.

Returns:

A keypoint (x, y, z, angle, scale).

keypoint_shift_scale_rotate(keypoint, angle, scale, dx, dy, dz, axes='xy', crop_to_border=False, rows=0, cols=0, slices=0, **params)[source]

Applies an affine transform to a keypoint

Parameters:
  • keypoint (KeypointInternalType) – A keypoint

  • angle (float) – an angle in degrees

  • scale (float) – the value to scale the keypoint’s size by

  • dx (float) – shift factor for width

  • dy (float) – shift factor for height

  • dz (float) – shift factor for depth

  • axes (str) – the axes to rotate along

  • crop_to_border (bool) – If True, then the image is padded or cropped to fit the entire rotation. If False, then original image shape is maintained and some portions of the image may be cropped away. Note that any translations are applied after the image is reshaped.

  • rows (int) – Image height

  • cols (int) – Image width

  • slices (int) – Image depth

keypoint_transpose(keypoint: Tuple[float, float, float, float]) Tuple[float, float, float, float][source]

Transposes a keypoint.

Parameters:

keypoint – A keypoint (x, y, z, angle, scale).

Returns:

A keypoint (x, y, z, angle, scale).

keypoint_vflip(keypoint: Tuple[float, float, float, float], rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Flip a keypoint vertically around the x-axis.

Parameters:
  • keypoint – A keypoint (x, y, z, angle, scale).

  • rows – Image height.

  • cols – Image width.

  • slices – Image depth

Returns:

A keypoint (x, y, z, angle, scale).

Return type:

tuple

keypoint_zflip(keypoint: Tuple[float, float, float, float], rows: int, cols: int, slices: int) Tuple[float, float, float, float][source]

Flip a keypoint along the z-axis.

Parameters:
  • keypoint – A keypoint (x, y, z, angle, scale).

  • rows – Image height.

  • cols – Image width.

  • slices – Image depth

Returns:

A keypoint (x, y, z, angle, scale).

longest_max_size(img: ndarray, max_size: int, interpolation: int) ndarray[source]

Rescale an image so that its longest side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:
  • img (np.ndarray) – an image

  • max_size (int) – the maximum side length of the image after resizing

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST)

pad(img: ndarray, min_height: int, min_width: int, min_depth: int, border_mode: int = 'constant', value: float | int = 0) ndarray[source]

Pad an image.

Parameters:
  • img (np.ndarray) – an image

  • min_height (int) – The minimum height to pad to, if applicable

  • min_width (int) – The minimum width to pad to, if applicable

  • min_depth (int) – The minimum depth to pad to, if applicable

  • border_mode (str) –

    Scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:

    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the value parameter (cval in scipy).

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    See https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html Default: constant.

  • value (int or float) – padding value if border_mode is “constant”.
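The five border modes listed above can be reproduced on a 1D sequence with plain Python, which makes the difference between reflect and mirror easy to see. This is an illustrative sketch of the extension patterns only, not how the library or scipy implements them; `extend_1d` is a hypothetical helper.

```python
def extend_1d(values, pad, mode="constant", cval=0):
    """Extend a sequence by `pad` items on each side using a border mode."""
    n = len(values)

    def source_index(i):
        # Map an out-of-range index back to a valid index of `values`.
        if mode == "nearest":
            return min(max(i, 0), n - 1)
        if mode == "wrap":
            return i % n
        if mode == "reflect":  # half-sample symmetric, edge item repeated
            i %= 2 * n
            return i if i < n else 2 * n - 1 - i
        if mode == "mirror":  # whole-sample symmetric, edge item not repeated
            if n == 1:
                return 0
            i %= 2 * n - 2
            return i if i < n else 2 * n - 2 - i
        raise ValueError(f"unknown border mode: {mode}")

    extended = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            extended.append(values[i])
        elif mode == "constant":
            extended.append(cval)
        else:
            extended.append(values[source_index(i)])
    return extended


reflect = "".join(extend_1d(list("abcd"), 4, "reflect"))
mirror = "".join(extend_1d(list("abcd"), 3, "mirror"))
```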

pad_with_params(img: ndarray, h_pad_top: int, h_pad_bottom: int, w_pad_left: int, w_pad_right: int, d_pad_front: int, d_pad_back: int, border_mode: str = 'constant', value: float | int = None) ndarray[source]

Pad an image.

Parameters:
  • img (np.ndarray) – an image

  • h_pad_top (int) – The number of pixels to pad on the top in the height dimension

  • h_pad_bottom (int) – The number of pixels to pad on the bottom in the height dimension

  • w_pad_left (int) – The number of pixels to pad on the left in the width dimension

  • w_pad_right (int) – The number of pixels to pad on the right in the width dimension

  • d_pad_front (int) – The number of pixels to pad on the front in the depth dimension

  • d_pad_back (int) – The number of pixels to pad on the back in the depth dimension

  • border_mode (str) –

    Scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:

    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the value parameter (cval in scipy).

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    See https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html Default: constant.

  • value (int or float) – padding value if border_mode is “constant”.
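One way to relate the minimum sizes accepted by pad to the explicit per-side amounts accepted by pad_with_params is to split the missing size across both sides of each dimension. The sketch below assumes centered padding with the odd pixel going to the trailing side; the library's actual split is not documented in this section, and `split_padding` is a hypothetical helper.

```python
from typing import Tuple


def split_padding(size: int, min_size: int) -> Tuple[int, int]:
    """Return (before, after) padding so that size grows to at least min_size."""
    if size >= min_size:
        return 0, 0  # already large enough, nothing to pad
    total = min_size - size
    before = total // 2
    return before, total - before


# Hypothetical 100 x 200 x 31 volume padded to at least 128 x 128 x 64.
h_pad_top, h_pad_bottom = split_padding(100, 128)
w_pad_left, w_pad_right = split_padding(200, 128)
d_pad_front, d_pad_back = split_padding(31, 64)
```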

py3round(number)[source]

Unified rounding in all Python versions.
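What "unified" means is not spelled out here. One plausible reading is that the result matches Python 3's built-in round, which rounds halves to the nearest even integer, whereas Python 2 rounded halves away from zero. The sketch below shows that half-to-even rule explicitly; `round_half_even` is a hypothetical stand-in and is not claimed to be the library's implementation.

```python
import math


def round_half_even(number: float) -> int:
    """Round to the nearest integer, sending exact halves to the even neighbor."""
    floor = math.floor(number)
    diff = number - floor
    if diff < 0.5:
        return floor
    if diff > 0.5:
        return floor + 1
    # Exactly halfway: pick whichever neighbor is even.
    return floor if floor % 2 == 0 else floor + 1


samples = [0.5, 1.5, 2.5, -0.5, -1.5, 2.4, 2.6]
results = [round_half_even(v) for v in samples]
```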

resize(img: ndarray, height: int, width: int, depth: int, interpolation: int = 1)[source]

Resizes an image

Parameters:
  • img (np.ndarray) – an image

  • height (int) – desired height of the output.

  • width (int) – desired width of the output.

  • depth (int) – desired depth of the output.

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Default: dicaugment.INTER_LINEAR

rotate(img: ndarray, angle: float, axes: str, crop_to_border: bool = False, interpolation: int = 1, border_mode: int = 'constant', value: float | int = 0)[source]

Rotates an image by angle degrees.

Parameters:
  • img – Target image.

  • angle – Angle of rotation in degrees.

  • axes – The axis of rotation. Must be one of {‘xy’, ‘xz’, ‘yz’}.

  • crop_to_border – If True, then the image is cropped or padded to fit the entire rotation. If False, then original image shape is maintained and some portions of the image may be cropped away. Default: False

  • interpolation – scipy interpolation method (e.g. dicaugment.INTER_NEAREST).

  • border_mode (str) –

    scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:

    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the value parameter (cval in scipy).

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html

    Default: constant

  • value – The fill value when border_mode = constant. Default: 0

Returns:

Image
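When crop_to_border is True, the output has to be large enough to hold the whole rotated image. The sketch below computes that bounding size for a rotation within a single plane, similar in spirit to the reshape option of scipy.ndimage.rotate. It is a geometric illustration under that assumption, and `rotated_bounds` is a hypothetical helper; the library may round differently.

```python
import math
from typing import Tuple


def rotated_bounds(height: int, width: int, angle: float) -> Tuple[int, int]:
    """Smallest (height, width) that contains a rectangle rotated by angle degrees."""
    theta = math.radians(angle)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    new_height = height * cos_t + width * sin_t
    new_width = height * sin_t + width * cos_t
    # Round first to guard against float noise such as 4.000000000000001.
    return math.ceil(round(new_height, 6)), math.ceil(round(new_width, 6))


quarter_turn = rotated_bounds(2, 4, 90)
diagonal = rotated_bounds(10, 10, 45)
```

With crop_to_border left at False, the output keeps the original shape instead, so the corners of the rotated image fall outside it and are cropped away.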

scale(img: ndarray, scale: float | Tuple[float], interpolation: int = 1) ndarray[source]

Scales an image.

Parameters:
  • img (np.ndarray) – an image

  • scale (float, tuple) – Scaling factor. If a tuple, then each dimension of image is scaled by respective element in tuple

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST). Default: dicaugment.INTER_LINEAR

shift_scale_rotate(img, angle, scale, dx, dy, dz, axes='xy', crop_to_border=False, interpolation=1, border_mode='constant', value=0)[source]

Applies an affine transform to an image

Parameters:
  • img (np.ndarray) – An image

  • angle (float) – an angle in degrees

  • scale (float) – the factor to scale the image by

  • dx (float) – shift factor for width

  • dy (float) – shift factor for height

  • dz (float) – shift factor for depth

  • axes (str) – the axes to rotate along

  • crop_to_border (bool) – If True, then the image is padded or cropped to fit the entire rotation. If False, then original image shape is maintained and some portions of the image may be cropped away. Note that any translations are applied after the image is reshaped.

  • interpolation – scipy interpolation method (e.g. dicaugment.INTER_NEAREST).

  • border_mode (str) –

    scipy parameter to determine how the input image is extended during convolution to maintain image shape. Must be one of the following:

    • reflect (d c b a | a b c d | d c b a): The input is extended by reflecting about the edge of the last pixel. This mode is also sometimes referred to as half-sample symmetric.

    • constant (k k k k | a b c d | k k k k): The input is extended by filling all values beyond the edge with the same constant value, defined by the value parameter (cval in scipy).

    • nearest (a a a a | a b c d | d d d d): The input is extended by replicating the last pixel.

    • mirror (d c b | a b c d | c b a): The input is extended by reflecting about the center of the last pixel. This mode is also sometimes referred to as whole-sample symmetric.

    • wrap (a b c d | a b c d | a b c d): The input is extended by wrapping around to the opposite edge.

    Reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html

    Default: constant

  • value – The fill value when border_mode = constant. Default: 0

smallest_max_size(img: ndarray, max_size: int, interpolation: int) ndarray[source]

Rescale an image so that its shortest side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:
  • img (np.ndarray) – an image

  • max_size (int) – the minimum side length of the image after resizing

  • interpolation (int) – scipy interpolation method (e.g. dicaugment.INTER_NEAREST)
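The difference between longest_max_size and smallest_max_size comes down to which side sets the scaling factor. The sketch below computes only the resulting shape, assuming that every dimension (including depth) counts as a side and that sizes are rounded to the nearest integer. Neither assumption is confirmed by this section, and `rescaled_shape` is a hypothetical helper.

```python
from typing import Callable, Sequence, Tuple


def rescaled_shape(
    shape: Sequence[int], max_size: int, reducer: Callable = max
) -> Tuple[int, ...]:
    """Scale every side by one factor so that reducer(shape) becomes max_size."""
    factor = max_size / reducer(shape)
    # Using a single factor for all sides preserves the aspect ratio.
    return tuple(max(1, round(dim * factor)) for dim in shape)


longest = rescaled_shape((512, 256, 128), 256, max)   # longest side -> 256
smallest = rescaled_shape((512, 256, 128), 256, min)  # shortest side -> 256
```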

transpose(img: ndarray) ndarray[source]

Transposes an image's dimensions

vflip(img: ndarray) ndarray[source]

Vertically flips an array

zflip(img: ndarray) ndarray[source]

Flips an array along the slice dimension
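The three image flips can be pictured as reversing one axis each. The sketch below uses nested lists indexed as volume[row][col][slice]; that axis order is an assumption, and the `sketch_` functions are hypothetical illustrations rather than the library's implementations.

```python
def sketch_vflip(volume):
    """Reverse the row axis (top and bottom swap)."""
    return volume[::-1]


def sketch_hflip(volume):
    """Reverse the column axis (left and right swap)."""
    return [row[::-1] for row in volume]


def sketch_zflip(volume):
    """Reverse the slice axis (front and back swap)."""
    return [[cell[::-1] for cell in row] for row in volume]


# A 2 x 2 x 2 volume indexed as volume[row][col][slice].
volume = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
flipped_all = sketch_zflip(sketch_hflip(sketch_vflip(volume)))
```

Applying all three, as in the last line, corresponds to the combined flip selected by d = -1 in the Flip transform.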