Data and analytics are now key to making business decisions, driving research, and automating human tasks.  Many surveys rank data scientist as one of the best and highest-paid jobs in America.  This Data Science Hackathon & Workshop (DSHW) offers modular workshops and a hackathon that provide deeper hands-on experience for those with prior experience, along with a crash course on data science and machine learning for those with none.


The Data Science Hackathon & Workshop will be held on the afternoon of Saturday, August 17, 2019, from 1pm to 6pm, at the end of UKC.  Three parallel tracks will run: introductory-level workshops, intermediate-level workshops, and an instructional team hackathon competition.  You may choose to attend any workshop session or participate in the hackathon.


Planned hackathon and workshop schedule on Saturday, August 17, 2019:

Time \ Track | Introductory Data Science Workshops | Intermediate Data Science Workshops* | Data Science Hackathon*
1:00 – 3:15 PM | Introduction to Data Science | Data Visualization and Tableau* | Hackathon is an all-afternoon instructional team competition with physical workspace available during this time*
3:30 – 5:45 PM | Introduction to Machine Learning | Image Classification using Convolutional Neural Networks* | Hackathon (continued)

*Prior Python programming experience required.  Prior experience questions can be emailed to


Summary of Program and Registration:

  • For a flat add-on registration fee of $75 (see student discount below), participants may sign up for up to 2 workshop sessions of their choice, at any programming experience level, and/or enroll in the day-long hackathon.  That is a value of $37.50 per 2-hr workshop session. For late and onsite registration email us at

Workshop Sessions Information:

  • Each workshop is modular; registrants may attend any 2 of the workshops offered and may also participate in the optional hackathon.
  • All workshops will be fully interactive, with participants programming between mini-lectures spaced throughout the workshop.
  • Python programming language experience is required for the intermediate workshops and the hackathon.
  • The introductory workshops are the only sessions that require no prior programming experience.
  • All participants must bring their own laptops with pre-installed software (provided).  WiFi access will be made available.
  • Electronic certificates will be provided to those who complete a workshop and receive a satisfactory score on a final quiz assessing the participant's understanding of the topic.
  • “Introduction to Data Science” (Lead Instructor: Ahreum Amy Han)
    • You will get an overview of the data science process and learn the basics of exploratory data analysis, data cleaning, data storage, data import and export, linear regression, and correlation analysis in an interactive session.
  • “Introduction to Machine Learning” (Lead Instructor: DK Kim)
    • You will get an overview of various machine learning models, such as multivariate analysis, random forest, support vector machines, text mining, and neural networks, including both supervised and unsupervised methods, along with their applications, in an interactive session.
  • “Data Visualization and Tableau” (Lead Instructor: Jeho Park)
    • You will learn to use data visualization to communicate information clearly and efficiently, using statistical graphics, plots, information graphics, and other tools.  You will learn to use Tableau, a popular interactive data visualization tool, in a hands-on interactive session. Prior Tableau experience is not necessary, but Python programming experience is recommended.
  • “Image Classification using Convolutional Neural Networks” (Lead Instructor: Benjamin Lee)
    • You will learn how to build an image classifier using supervised machine learning and state-of-the-art convolutional neural networks, to automatically label images, for example for handwriting recognition or distinguishing cats from dogs.  Prior Python programming experience is required.


Hackathon Program Information:

  • Beginners are welcome, as the team hackathon project will also be instructional.
  • Experienced instructors, Albert Lee, Karl Kwon, and Benjamin Lee, will guide hackathon participants during the workshop time.
  • Teams will be formed before the event with balanced skill sets.
  • You are eligible to participate if you have at least one of the following skills:
    • Python ML experience
    • Data analysis experience, including in Excel
    • Data visualization experience
    • Strong presentation or problem-solving skills (not everyone needs to code)
  • All participants must bring their own laptops with pre-installed software (provided).
  • Certificates will be provided for completing the team presentation, and winners will be announced on the web.
  • Note that you must choose either the hackathon or the workshops, as they run in parallel.
  • The hackathon schedule is:
    • 1pm-2pm Hackathon Instructions and Dataset Release
    • 2pm-4pm Exploratory Data Analysis and Machine Learning Model Coding
    • 4pm-5pm Slides Preparation
    • 5pm-6pm Team Presentations


Registration Information:

  • DSHW registration is $75 for regular participants and $50 for students (graduate, professional, undergraduate).
  • UKC registration is required to add on the DSHW registration.
  • DSHW registration is part of the general UKC registration.
  • Late DSHW registration is available upon request.  Email at
  • Workshop session preferences for each workshop will be sent out after UKC registration.
  • A minimum of 4 sign-ups is required for each workshop to be held; workshops with fewer sign-ups are subject to cancellation.


Further questions can be emailed to


Instructors and Organizers:

Ahreum Amy Han (Chair, Instructor) – Business Data Analyst at IBM and Lecturer at Southern Illinois University at Carbondale: Ahreum Amy Han is a Statistician/Data Scientist currently with IBM and formerly with Allstate Corporation and Underwriters Laboratories (UL) Inc. Amy is currently a lecturer at Southern Illinois University at Carbondale in the School of Information Systems and Applied Technologies and a career advisory board member at the Department of Mathematical Science for graduate studies of applied mathematics and applied statistics at DePaul University. Amy Han earned her Bachelor of Science in General Mathematics and Master of Science in Applied Mathematics.


DK Kim (Co-Chair, Instructor) – Data Scientist at Edison Energy: DK Kim is currently a data scientist at Edison Energy in Boston, MA. He received his BS in Industrial Engineering with minors in Statistics and Mechanical Engineering from Texas Tech. He worked as a data analyst in healthcare prior to completing a Data Science coding bootcamp to learn Python. DK enjoys traveling, and his hobbies include loyalty and frequent flyer programs.



Jeho Park (Instructor) – Director of the Quantitative and Computing Lab at Claremont McKenna College: Jeho Park is the founding Director of the Quantitative and Computing Lab at Claremont McKenna College. He leads the center to assist students and faculty with quantitative, statistical, and computational skills through tutoring, workshops, and consultations. He also teaches high performance computing and data science courses. He received a Ph.D. in Engineering and Applied Mathematics/Computer Science from Claremont Graduate University. Dr. Park’s primary research and professional interests include Data Science, Data Analytics and Quantitative Methods, AI/Machine Learning, and High-Performance Computing.


Albert Lee (Instructor) – Bioinformatics Scientist at Myriad Women’s Health: Albert Lee is a Bioinformatics Scientist at Myriad Women’s Health, a genetic screening company that provides actionable information that guides critical health decisions for women and their families. He specializes in developing robust, scalable bioinformatics pipelines and analyzing large-scale genetic data using novel statistical methods and state-of-the-art scientific computing. Albert received his Ph.D. in Biomedical Informatics at Columbia University.


Karl Kwon (Instructor) – Cloud Data Engineer at Blackboard Insurance: Karl Kwon is a cloud data engineer specializing in data visualization at Blackboard Insurance in the New York City area. He loves data and the insights that can be found inside. He develops compelling, intuitive, and functional data visualization tools through statistical optimization and ML models. He holds a Ph.D. in Computer Science from the University of Houston, where he developed a powerful data visualization tool of scientific and academic careers called ScholarPlot. He also earned an M.S. in Computer Science and a B.S. in Software Engineering.


Benjamin Lee (Organizer, Instructor) – Senior Research Associate at Weill Cornell Medicine: Benjamin Lee is a Sr. Research Associate at Weill Cornell Medicine.  Ben is a researcher developing machine learning algorithms for cardiac medical imaging focusing on convolutional neural networks for 3D SPECT/PET/CT images for disease detection. Ben received his Ph.D. at the University of Michigan in Electrical Engineering specializing in image processing and image reconstruction and his B.S. from Cornell University.


Stella Chun (Organizer) – Biosciences Account Manager at Thermo Fisher Scientific: Stella Chun is a biosciences account manager in Life Sciences Solution at Thermo Fisher Scientific.  Stella has led several KSEA career development workshops and events.