Self-paced course

Practical Computer Vision with PyTorch

Offered by Antonio Rueda-Toicen

Practical Computer Vision with PyTorch is a comprehensive, hands-on course for developers and practitioners who want to explore computer vision with PyTorch. It spans image classification, object detection, segmentation, and generative modeling. The course emphasizes implementation: participants work through coding demos and projects with industry-standard tools and libraries. By the end, they will be able to build and fine-tune computer-vision models for real-world applications.

Self-paced since May 21, 2025
Course language: English
English, German
Advanced, Big Data and AI, Data Science

Course information

Computer-vision technologies are transforming industries, driving innovation in healthcare, automotive, retail, and media. Effectively applying these techniques demands deep practical knowledge. This course offers a comprehensive introduction to modern computer-vision methods with PyTorch, beginning with convolutional neural networks (CNNs), moving to advanced architectures such as Vision Transformers (ViT), and exploring cutting-edge vision-language models like CLIP and Grounding DINO. Beyond technical implementation, participants will learn best practices for evaluating and fine-tuning models, ensuring proficiency in every stage of development.

Course Structure
Module 1: Foundations of Deep Learning for Vision
  • CNNs, optimization, metrics, and embeddings
  • Code demos: Zero-shot inference, visualization of CNN operations, data augmentation
Module 2: Advanced Techniques
  • Vision transformers, object detection, segmentation
  • Vision-language models and generative modeling techniques (CLIP, diffusion models)
  • Practical coding and advanced implementation demos
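As a rough illustration of the operation at the heart of the CNNs covered in Module 1, the sketch below slides a small kernel over a grayscale image and sums the element-wise products. It is not taken from the course's demos: the function name `cross_correlate2d`, the toy image, and the hand-picked edge kernel are all assumptions made for this example, and it uses plain Python lists rather than PyTorch tensors so that it runs without any dependencies. Convolutional layers in deep learning frameworks compute this same sliding sum (technically a cross-correlation), with learned rather than hand-picked kernel values.

```python
def cross_correlate2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out = []
    for top in range(img_h - k_h + 1):
        row = []
        for left in range(img_w - k_w + 1):
            acc = 0
            for i in range(k_h):
                for j in range(k_w):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical 4x5 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]

# Hand-picked 3x3 kernel that responds to left-to-right brightness increases.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = cross_correlate2d(image, vertical_edge)
print(feature_map)  # [[0, 27, 27], [0, 27, 27]]
```

The output is zero where the window covers only dark pixels and large where it straddles the dark-to-bright boundary, which is the sense in which such a kernel "detects" a vertical edge.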
Scope

The Practical Computer Vision with PyTorch course runs for two weeks with a total workload of approximately 8-10 hours. It includes video lectures and interactive coding demonstrations, each accompanied by multiple-choice assessments.

All learning materials (videos, coding demonstrations, self-assessments) are available at the course start. The Module-1 Homework will be released along with the course material, and the Module-2 Homework will be released at the end of the first week, giving learners two weeks to complete the course and submit their solutions.

Prerequisites
  • Intermediate AI/ML understanding
  • Proficiency in Python (writing classes/functions)

What participants will learn

  • Fundamental concepts of computer vision and applications
  • Building/training neural networks (CNNs, transformers)
  • Loss functions and optimization techniques
  • Performance evaluation metrics
  • Transfer learning and feature extraction techniques
  • Data augmentation and dataset curation methods
  • Specialized architectures: Vision Transformers, Mask R-CNN, etc.
  • Vision-language models and generative modeling techniques (CLIP, diffusion models)
  • Experiment tracking with Weights and Biases (wandb)
  • Image dataset curation with FiftyOne
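The list above does not say which evaluation metrics the course uses. As one example of the kind of metric commonly applied to object detection and segmentation, the sketch below computes intersection over union (IoU) for two axis-aligned bounding boxes. The function name `box_iou`, the `(x_min, y_min, x_max, y_max)` box format, and the sample boxes are assumptions chosen for illustration, and the code is plain Python rather than PyTorch.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    # Union = area A + area B - overlap (so the overlap is not counted twice).
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes.
predicted = (0, 0, 10, 10)
ground_truth = (5, 5, 15, 15)
iou = box_iou(predicted, ground_truth)  # overlap 25, union 175
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap; detection benchmarks typically count a prediction as correct when its IoU with a ground-truth box exceeds a chosen threshold.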

Who this course is for

  • Students with intermediate AI/ML knowledge
  • Practitioners interested in practical computer vision solutions
  • Developers exploring modern vision architectures (ViT, CLIP, etc.)
  • Hands-on learners comfortable with Python programming
  • AI engineers focusing on generative AI and transfer learning techniques

Learning material

  • Module 1: Foundations of Deep Learning for Vision

    This week covers fundamental concepts and core neural network topics necessary for practical computer vision using PyTorch. Students learn image representation, tensor operations, neural network basics, training methods, and convolutional architectures.
  • Module 2: Advanced Techniques and Specialized Models

    This week delves deeper into specialized techniques, optimization, interpretability, image embeddings, vision transformers, segmentation, detection, and image generation.

Enroll in this course

The course is free. Simply create a user account on openHPI and join the course!

Learners

  • Current (today): 2,320
  • At course end (May 21, 2025): 1,535
  • At course start (May 7, 2025): 1,243

Ratings

The course has an average rating of 4.27 stars from 220 votes.

Certificate requirements

  • The Record of Achievement is awarded to participants who earn at least 50% of the maximum number of points across all graded assignments.
  • The Confirmation of Participation is awarded to participants who have accessed at least 50% of the course material.
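The two 50% thresholds above can be checked with simple arithmetic. The sketch below is only an illustration of that rule, not openHPI's actual implementation: the function name `earned_certificates`, its parameters, and the sample numbers are hypothetical.

```python
def earned_certificates(points_earned, points_max, items_visited, items_total):
    """Return the certificates a learner qualifies for under the two 50% rules."""
    certificates = []
    # At least 50% of the maximum points across all graded assignments.
    # Integer comparison (2 * earned >= max) avoids floating-point rounding.
    if points_max > 0 and 2 * points_earned >= points_max:
        certificates.append("Record of Achievement")
    # At least 50% of the course material accessed.
    if items_total > 0 and 2 * items_visited >= items_total:
        certificates.append("Confirmation of Participation")
    return certificates


# Hypothetical learner: 24 of 40 graded points, 30 of 45 course items visited.
result = earned_certificates(24, 40, 30, 45)
print(result)  # ['Record of Achievement', 'Confirmation of Participation']
```

Because both thresholds are "at least 50%", a learner who scores exactly half of the points, or has opened exactly half of the material, still qualifies.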

More information is available in the certificate guidelines.

This course is offered by

Antonio Rueda-Toicen helps companies and individuals use artificial intelligence. He has experience developing and deploying machine learning models both in industry and academia. Currently, he is a researcher in the Artificial Intelligence and Intelligent Systems group at the Hasso Plattner Institut. He also works as an AI Engineer at Voxel51, where he leads workshops on practical computer vision skills. Antonio is a certified instructor of deep learning and generative models at NVIDIA's Deep Learning Institute.

Since 2019, Antonio has organized the Berlin Computer Vision Group meetup. He has delivered workshops to over 1,000 participants both in person and online. He mentors students at Berlin's Data Science Retreat, helping them transition into industry roles. He enjoys teaching computer vision, MLOps, and neural networks. As an engineer at HPI's AI Service Center, he co-founded the AI Maker Community to support open collaboration.

Antonio is pursuing a PhD at HPI. His focus is on vision-language models and representation learning. He holds degrees in computer science and bioengineering from Universidad Central de Venezuela. Antonio is passionate about making complex technology accessible and useful for everyone.