Data Engineering und Data Science

Course information

The buzzwords artificial intelligence, data science, data engineering, and big data have dominated more than just the IT headlines for several years. In this course we want to give these terms basic substance and retrace the typical work steps of a data scientist. In particular, we look behind the scenes and examine the often laborious path data must travel before it can finally be used, for example to train models with machine learning. This path includes data acquisition, data cleaning, and data integration. We then learn how to gain new insights from this data, and also from text, using data mining and machine learning. The course concludes with a discussion of ethics and fairness in automated data analysis.

Target audience

Interested members of the public, practitioners, and bachelor's students

Course structure

Week 1: Big data and data science
Week 2: Data science applications and text mining
Week 3: Scalable data management
Week 4: Data preparation
Week 5: Information integration
Week 6: Statistics, data mining, machine learning
Week 7: Exam

Workload

The workload for this course corresponds to 2 ECTS credits.

Podcast recommendation

You can also learn more about data engineering in the current episode of the Neuland Podcast.

Note: This course is currently in self-study mode, in which you do not have access to the graded homework and exams. We can therefore only issue you a Confirmation of Participation.

Enroll me for this course

The course is free. Just register for an account on openHPI and take the course!

Learners

Current (today): 21,816
At course end (Feb 26, 2020): 14,654
At course start (Jan 08, 2020): 11,346

Certificate Requirements

Gain a Record of Achievement by earning at least 50% of the maximum number of points from all graded assignments.
Gain a Confirmation of Participation by completing at least 50% of the course material.

Find out more on our page for certificates and guidelines.

This course is offered by

Prof. Dr. Felix Naumann

Prof. Felix Naumann studied mathematics, economics, and computer science at the University of Technology in Berlin. After receiving his diploma (MA) in 1997, he completed his PhD thesis in the area of data quality at Humboldt University of Berlin in 2000. In 2001 and 2002 he worked at the IBM Almaden Research Center on data integration topics. From 2003 to 2006 he was an assistant professor for information integration, again at Humboldt University of Berlin. Since 2006 he has held the chair for information systems at the Hasso Plattner Institute at the University of Potsdam in Germany. He has been a visiting researcher at QCRI in Qatar, AT&T Research in New York, and IBM Research in California. His research interests include data profiling, data cleansing, and text mining. In addition to numerous program committee memberships for international conferences, he has organized several conferences in various roles, and he is editor-in-chief of the Information Systems journal and a trustee of the VLDB Endowment. More details are at https://hpi.de/naumann/people/felix-naumann.html.