e key "boxes" containing a tensor of dim [nb_target_boxes, 4]. The target boxes are expected in format (center_x, center_y, w, h), normalized by the image size. r3