Hierarchical Vision Transformer using Shifted Windows
Discover how Swin Transformer improves the efficiency and accuracy
of vision transformers for image recognition.
by Lakshya Karwa
Introduction
1 Vision Transformers Overview
What are vision transformers, why are they important, and what are their limitations?
2 The Swin Transformer Problem
How does Swin Transformer address the limitations of traditional vision transformers?
A review of recent research and innovations in the field of vision transformers.
Key findings from the Swin Transformer model and how they address the problem.
Vision Transformers Overview
• Vision transformers: a type of neural network for visual recognition tasks that captures long-range dependencies in images, which is difficult for conventional CNNs
• Limitations?
➢ Computationally expensive
➢ Require more data for robust training
• Swin Transformer: brings CNN-style hierarchical feature maps to Transformers by merging image patches in deeper layers
• Complexity? Linear in image size, because self-attention is computed only within fixed-size local windows
• Recent innovations: use of attention mechanisms to improve efficiency, and unsupervised learning techniques
• Research highlight: high accuracy, lower time complexity, strong results
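The linear-complexity point above can be checked with a short calculation. The sketch below uses the cost formulas given in the Swin Transformer paper for global self-attention, 4hwC² + 2(hw)²C, and for window attention, 4hwC² + 2M²hwC, where h × w is the number of patch tokens, C the channel width, and M the window size. The concrete values (56 × 56 tokens, C = 96, M = 7) and the function names are illustrative assumptions, not part of the slides.

```python
def msa_flops(h, w, c):
    """Cost of global self-attention: every token attends to all h*w tokens."""
    tokens = h * w
    return 4 * tokens * c**2 + 2 * tokens**2 * c


def wmsa_flops(h, w, c, m):
    """Cost of window attention: each token attends only to the m*m tokens
    inside its own window, so the second term is linear in h*w."""
    tokens = h * w
    return 4 * tokens * c**2 + 2 * m**2 * tokens * c


# Illustrative sizes (assumed): double the token grid in each dimension.
small, large = (56, 56), (112, 112)
channels, window = 96, 7

global_growth = msa_flops(*large, channels) / msa_flops(*small, channels)
window_growth = wmsa_flops(*large, channels, window) / wmsa_flops(*small, channels, window)

print(f"4x more tokens -> global attention cost grows {global_growth:.1f}x")
print(f"4x more tokens -> window attention cost grows {window_growth:.1f}x")
```

With four times as many tokens, the window-attention cost grows by exactly four times, while the global-attention cost grows much faster because of its quadratic term.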
Vision Transformer Vs Swin Transformer
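One way to picture the difference: a standard vision transformer attends over all patches at once, while Swin restricts attention to local windows and shifts the window grid in alternating layers so that neighbouring windows exchange information. The following is a minimal NumPy sketch of that partitioning step only, assuming an (H, W, C) feature map whose sides are divisible by the window size. The helper names are hypothetical, and the attention computation and the mask that the real model applies to wrapped-around regions are omitted.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping windows.

    Returns an array of shape (num_windows, window_size, window_size, C).
    """
    H, W, C = x.shape
    assert H % window_size == 0 and W % window_size == 0
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)


def shifted_window_partition(x, window_size):
    """Cyclically shift the map by half a window, then partition it.

    The shift makes each new window straddle the borders of the old ones.
    """
    shift = window_size // 2
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)


# Toy 8x8 single-channel map whose values encode position (row * 8 + col).
feature_map = np.arange(8 * 8, dtype=np.float32).reshape(8, 8, 1)
regular = window_partition(feature_map, 4)
shifted = shifted_window_partition(feature_map, 4)

print(regular.shape)        # four 4x4 windows
print(shifted[0, :, :, 0])  # first shifted window starts at row 2, col 2
```

Attention would then be computed independently inside each window, which is what keeps the cost proportional to the number of windows rather than to the square of the number of patches.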
Swin Transformer Architecture
How does the Swin Transformer compare to other models in terms of accuracy and efficiency?
Evaluation of the Swin Transformer's performance on image classification on ImageNet and object detection on COCO.
What factors may affect the Swin Transformer's performance and how can they be addressed?
Application and Future Work
How can Swin Transformer be used in industries ranging from healthcare to advertising?
What are some potential areas for expanding the Swin Transformer model?
What are some possible research questions and areas for improvement?
Shortfalls
1 Hardware Requirements
2 Computational Resources
Are there any specific dataset and data size requirements when it comes to training and using the model?
Conclusion
Impact
How does the Swin Transformer contribute to the field of computer vision and artificial intelligence?
Future Potential
What are the potential benefits and implications of improved image recognition technology?