Non-Hierarchical Transformers for Pedestrian Segmentation
Abstract: We propose a methodology to address the challenge of instance segmentation in autonomous systems, specifically targeting accessibility and inclusivity. Our approach utilizes a non-hierarchical Vision Transformer variant, EVA-02, combined with a Cascade Mask R-CNN mask head. Through fine-tuning on the AVA instance segmentation challenge dataset, we achieved a promising mean Average Precision (mAP) of 52.68\% on the test set. Our results demonstrate the efficacy of ViT-based architectures in enhancing vision capabilities and accommodating the unique needs of individuals with disabilities.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.