---
datasets:
- imagenet-1k
pipeline_tag: image-classification
---
					
						
## Model Architecture Details

### Architecture Overview

- **Architecture**: ViT Tiny

### Configuration
					
						
| Attribute            | Value          |
|----------------------|----------------|
| Patch Size           | 16             |
| Image Size           | 224            |
| Num Layers           | 1              |
| Attention Heads      | 4              |
| Objective Function   | CrossEntropy   |
					
						
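As a rough illustration of what the table implies, the sketch below collects the listed values in a small Python dataclass and derives the patch grid from them. The class name `ViTTinyConfig` and its field names are hypothetical and are not taken from ViT-Prisma. The extra class token is an assumption based on the standard ViT design, since this card does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViTTinyConfig:
    """Hypothetical container for the values in the configuration table."""

    patch_size: int = 16
    image_size: int = 224
    num_layers: int = 1
    num_heads: int = 4

    @property
    def patches_per_side(self) -> int:
        # The image is cut into non-overlapping square patches.
        return self.image_size // self.patch_size

    @property
    def num_patches(self) -> int:
        return self.patches_per_side ** 2


config = ViTTinyConfig()
print(config.patches_per_side)  # 14
print(config.num_patches)       # 196
# Assumption: one extra class token, as in the standard ViT design.
print(config.num_patches + 1)   # 197
```

With a 224-pixel image and 16-pixel patches, the single transformer layer would therefore attend over 196 patch tokens, or 197 if a class token is used.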
					
						
### Performance

- **Validation Accuracy (Top 5)**: 0.33
- **Validation Accuracy (Top 1)**: 0.16
					
						
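For readers unfamiliar with the two metrics, here is a minimal sketch of how top-k accuracy is typically computed from per-class scores. The function `top_k_accuracy` and the toy scores are illustrative only; this is not ViT-Prisma's evaluation code.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example: 3 samples, 4 classes (made-up numbers).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true label 1 is ranked first
    [0.4, 0.3, 0.2, 0.1],  # true label 1 is ranked second
    [0.1, 0.2, 0.3, 0.4],  # true label 0 is ranked last
]
labels = [1, 1, 0]

top1 = top_k_accuracy(scores, labels, k=1)  # one of three samples is a hit
top2 = top_k_accuracy(scores, labels, k=2)  # two of three samples are hits
print(top1, top2)
```

Because the top-k set only grows with k, top-5 accuracy can never be lower than top-1 accuracy, which is consistent with the two figures reported above.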
					
						
### Additional Resources

The model was trained with the [ViT-Prisma](https://github.com/soniajoseph/ViT-Prisma) library.\
For detailed metrics, plots, and further analysis of the training process, refer to the [training report](https://wandb.ai/perceptual-alignment/Imagenet/reports/ViT-Small-Imagenet-training-report--Vmlldzo3MDk3MTM5).