The goal of the bootcamp is to build strong data science skills, with an emphasis on machine learning techniques, that meet and exceed the demands of the data science world. In the process, we will develop good habits for working independently as data scientists and as members of productive data science teams.

June 7, 2023 - July 4, 2023
Language: English
Beginner, Data Science, Junior, Programming

Course information

Why is the topic relevant? Why is it on everyone's mind?

Having data science and machine learning skills can improve your chances of success, whether as an individual or as a business. Many industries offer their employees the opportunity to enroll in upskilling programs. In that way, domain experts can leverage the knowledge in their given field and seek higher roles in their company. As the demand for data science skills keeps rising, a rounded understanding of data science, applied in practice, can help widen your professional scope.

What will be taught in the course?

  • Answering the questions "What is data science?" and "Why is it relevant for your work?"
  • Data Analysis and making sense of the data you have through basic statistics (uni-, bi- and multivariate statistics)
  • Planning and executing an exploratory data analysis (EDA) for new data
  • The use of libraries such as NumPy, Pandas, and Matplotlib for interesting data visualizations
  • Leveraging the power of Machine Learning (ML) through KNN, Decision Trees and other techniques to quickly apply existing ML models
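To give a feel for the KNN technique named in the last point, here is a minimal sketch in plain Python. It is an illustration only and not taken from the course material: the function name `knn_predict` and the toy data are made up for this example, and in practice you would rely on an existing library such as scikit-learn rather than writing this yourself.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only and keep the labels of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among those neighbours is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, query=(2.0, 2.0), k=3))
```

The choice of `k` matters: a small `k` reacts strongly to individual points, while a larger `k` smooths the decision over more neighbours.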

How will it be taught?

  • Videos: We prepared various videos to teach you the basics of data science. Additionally, we made sure to embed the coding environment - JupyterLab - into our hands-on videos.
  • Self-tests & Quizzes: "Have you understood the concepts?" This question will be answered by our self-tests. You can repeat them and make sure you have a solid understanding before moving on to the exercises.
  • Exercises on the platform: Apply what you learned! As the old saying goes, you have to use it "to make it stick". By coding yourself, your practical knowledge will catch up with the theory you have understood. You don't have to install anything, as we provide you with the necessary tools on the platform!
  • Live-Streams: We will be live on YouTube and Vimeo on four Mondays (June 12, June 19, June 26 and July 3), starting at 4 pm. These sessions will be around 2 hours long, and the recordings will be made available in the course after the broadcast.

How is this course structured?

We organized the course into six weeks and two tracks:
The CORE track
- Week 1 of the course is all about the "Why". It outlines data science, its relevance, and its potential.
- Week 2 is about the "How". It covers different approaches to analyzing your data to find meaningful relations and perform a well-rounded EDA.
- Week 3 dives into various Machine Learning algorithms and techniques that are essential in any data science project.
- Week 4 is the final exam week. After week 4, the core track ends.

The PROJECT track
- Weeks 5 & 6 are where the real experience starts: you are invited to continue for two extra weeks and work on a real-life challenge with dedicated tools.
We encourage you to enroll for the full six-week experience to build a solid understanding and apply your new skills!

Here's a tentative timeline; deadlines and other dates might still be subject to change. [Image: timeline]

Who should take this course?

  • People with basic Python knowledge. That includes variables, conditional statements, while loops, and data structures.
  • People with domain knowledge who need to apply modern data analysis in their daily work.

What needs to be accomplished for the course certificate?

  • Jupyter notebook exercises: As outlined above, we give you plenty of opportunity to code.
  • Weekly exercises and challenges: Each week, there will be an assignment that counts toward the overall score you can achieve in this course.
  • Final assignment: At the end of the CORE track, a final assignment, similar to a final exam, will test your overall knowledge. Further information and study tips will be given throughout the course. We don't want to trick our learners - if you follow the material and watch the videos, you will be set up for success.
  • Exposure to real-life scenarios and datasets ("no easy data"): By that, we mean using actual datasets that we would also teach and code with outside of this online course. We are convinced that the actual datasets you find in your daily life are a better learning experience than squeaky-clean data that is already optimized. Don't worry - we will walk you through the processes, so that you will be able to clean and normalize data yourself.
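As a rough idea of what "clean and normalize" in the last point can mean, here is a small sketch using only the Python standard library. It is not taken from the course material: the function name `clean_and_normalize`, the choice of min-max scaling, and the sample column are assumptions made for this illustration, and a library such as Pandas would normally do this work for you.

```python
import math


def clean_and_normalize(raw_values):
    """Drop missing or non-numeric entries, then min-max scale to [0, 1]."""
    cleaned = []
    for value in raw_values:
        try:
            number = float(value)
        except (TypeError, ValueError):
            # Skip entries such as None, "" or "n/a" that cannot be parsed.
            continue
        if math.isnan(number):
            # Skip explicit not-a-number markers as well.
            continue
        cleaned.append(number)
    if not cleaned:
        return []
    low, high = min(cleaned), max(cleaned)
    if high == low:
        # All values are identical: avoid dividing by zero.
        return [0.0 for _ in cleaned]
    return [(number - low) / (high - low) for number in cleaned]


# Hypothetical messy column, as it might arrive from a CSV export.
raw_ages = ["23", "41", None, "n/a", " 35 ", "", "29"]
print(clean_and_normalize(raw_ages))
```

Dropping unparseable entries is only one possible choice; depending on the dataset, filling in a default or an average can be more appropriate.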

How much time is expected to be spent?

The workload for the course is approximately 5 to 7 hours per week, depending on prior knowledge. It is hard to give a precise number: some topics might be easier for you, while others might spark a desire to learn more, in which case the number could be higher. As we are aware that you take this course alongside existing work, studies or other commitments, we have balanced the time required accordingly.

We are looking forward to seeing you in the course!

What you'll learn

  • What Jupyter Notebooks are and how to use them for data science
  • Work with real-life datasets and apply NumPy, Pandas and Matplotlib
  • Use scikit-learn to create powerful ML models

Who this course is for

  • High School and College Students
  • Domain Experts

Course contents

  • Introduction - Course Overview and Housekeeping:

    In this first short section, we want to give you an overview of the course, learn more about you and start the bootcamp.
  • Week 1 (Intro to Data Science and Visualization):

    We start this week with an overview of the data science cycle, combined with the reasons and perspectives you need as a data scientist. Then, we introduce some general variable types and some visualization approaches. The week ends with an introduction to basic database queries.
  • Week 2 (EDA and Statistical Analysis):

    What is an EDA - and how can you apply it? We focus on this question, while giving you plenty of chances to use your new knowledge about fundamentals of statistical models.
  • Week 3 (Machine Learning):

    We reach the machine learning section of the course! You will learn what frequently used terms mean and how to enhance your data models with existing libraries.
  • Week 4 (Final Exam):

    This week is for your final exam: one assignment to finish in one sitting, testing your knowledge of the CORE track. After that, we continue with the PROJECT track in weeks 5 and 6.
  • Week 5 (Real-life Project):

    The chance to put it all together! You will work on a project for a week, combining various elements, commands and lessons you learned in the CORE track. This is the true test of this data science bootcamp.
  • Week 6 (Peer Review):

    You will review the project submissions of others who have worked on the same assignment as you during the previous week. Your feedback and evaluation will be used to grade the work of others. Only if you participate in both phases - submitting your own project and reviewing another project - will you get feedback and a grade on the PROJECT track.

Enroll me for this course

The course is free. Just register for an account on openHPI and take the course!
Learners enrolled: 3484

Certificate Requirements

  • Gain a Record of Achievement by earning at least 51% of the maximum number of points from all graded assignments.
  • Gain a Confirmation of Participation by completing at least 50% of the course material.

Find out more in the certificate guidelines.

This course is offered by

Mohamed Elhayany

Mohamed holds a Master's degree in Communication Technology Engineering from the University of Ulm. He is now a Ph.D. candidate and part of the openHPI research team with a focus on auto-assessment of programming exercises in MOOCs. Mohamed is currently working on integrating Jupyter notebooks with openHPI to provide supportive learning environments. In his leisure time, Mohamed likes to go to the gym, watch football and travel the world.

Hendrik Steinbeck

Hendrik is part of the openHPI research team with a focus on video-based learning. After earning his degree in the field of information systems, he worked in the banking and technology sector. Away from editing software, studio setups and curriculum design, he can be found at the local climbing gym.