rn a [`~utils.ModelOutput`] instead of a plain tuple.

The bare ViT Model transformer outputting raw hidden-states without any specific head on top.
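
The difference between the two return styles can be sketched as follows. This is a minimal illustration, assuming the `transformers` and `torch` packages are installed. The tiny configuration values are made up for the example and do not correspond to any released checkpoint, and the weights are randomly initialized, so the hidden states themselves carry no meaning.

```python
import torch
from transformers import ViTConfig, ViTModel

# Tiny, randomly initialized configuration.
# These values are illustrative assumptions, not a released checkpoint.
config = ViTConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    image_size=32,
    patch_size=8,
    num_channels=3,
)
model = ViTModel(config).eval()

# One random 32x32 RGB "image" standing in for real preprocessed input.
pixel_values = torch.randn(1, 3, 32, 32)

with torch.no_grad():
    # ModelOutput: fields are accessed by name.
    as_output = model(pixel_values=pixel_values, return_dict=True)
    # Plain tuple: fields are accessed by position.
    as_tuple = model(pixel_values=pixel_values, return_dict=False)

# Sequence length is the number of patches plus one [CLS] token.
seq_len = (config.image_size // config.patch_size) ** 2 + 1

print(type(as_output).__name__, tuple(as_output.last_hidden_state.shape))
print(type(as_tuple).__name__, tuple(as_tuple[0].shape))
```

In this sketch the first tuple element holds the same raw hidden states as `as_output.last_hidden_state`. The bare model adds no task-specific head, so any classification layer has to be applied on top of these hidden states separately.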