This is for aspiring candidates of the AI Apprenticeship Programme (AIAP) by AI Singapore.
I have compiled a list of must-learn topics based on my own experience of doing Assessment 1’s multiple choice questions (MCQs) in June 2022 (Batch 11) and September 2022 (Batch 12). This study guide is not meant to be some kind of shortcut, but my intention is to help candidates focus their studies on those topics that seem to matter the most in Assessment 1.
This listing is non-exhaustive, meaning other new topics can appear in future, and in fact will likely do so.
Bear in mind that the format of Assessment 1 has changed before and can change again. As of Batch 12, the format is 15 MCQs to be answered within 40 minutes. The passing mark is 80%, or 12 out of 15 correct.
Topics to study
- Difference between regression and classification.
- Difference between supervised and unsupervised learning. Less important, but they have come up before: self-supervised learning and reinforcement learning.
- Difference between dummy variable encoding, ordinal encoding and one-hot encoding for handling categorical data. The appropriate uses of each. On a related note, geocoding.
- Data scaling, standardization and normalization. Their purpose is to put features on comparable scales for distance-based algorithms such as KNN and SVM, and for models that are trained by gradient descent or use regularization, such as linear regression, logistic regression and neural networks. (Tree-based algorithms including decision trees, random forests, XGBoost, LightGBM and CatBoost split on feature thresholds rather than distances, so they generally do not benefit from such preprocessing and are largely unaffected by it.)
- Know what issues might come up when dealing with differences in train and test datasets, e.g. different categorical values appearing under the same column.
- The appropriate uses of common ML algorithms such as linear regression, logistic regression, SVM, decision trees, gradient boosted decision trees, random forests, K-nearest neighbours, K-means clustering, and ensemble models.
- Difference between overfitting and underfitting, and what to do in each case. Related topics such as stratification, regularization, and model complexity.
- Difference between evaluation metrics such as accuracy; precision vs recall; F1 score; ROC AUC; mean squared error vs root mean squared error vs mean absolute error; true positives, false positives, true negatives and false negatives.
- Git concepts such as pull requests, merge conflicts, branches, commits, diff, checkout, clone, add, etc. and which commands do what. Experience in using GitHub for an actual project is highly recommended.
- Object-oriented programming (OOP) in Python. Be able to look at actual OOP code and understand what’s going on. Know about inheritance, instances, self, the __init__() method, parent class (superclass), and child class (subclass).
- Be familiar with the relatively new Python syntax for function annotations.
- Be able to look at a Python function or class and, given certain input arguments, know what the return value will be or what will be printed. Or know what arguments are needed to execute the function properly. Or what is the value of a certain variable in a function or method when the code is executed.
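To give a feel for the last three topics, here is a small practice snippet of my own. It is only a sketch and not an actual assessment question; the names `Animal`, `Bird`, `introduce` and `pet` are made up for illustration. Try to work out what gets printed before running it.

```python
# Hypothetical practice example (not from the actual assessment).

class Animal:
    """Parent class (superclass)."""

    def __init__(self, name: str, legs: int = 4) -> None:
        # self refers to the specific instance being created
        self.name = name
        self.legs = legs

    def describe(self) -> str:
        return f"{self.name} has {self.legs} legs"

    def speak(self) -> str:
        return "..."


class Bird(Animal):
    """Child class (subclass) that inherits from Animal."""

    def __init__(self, name: str) -> None:
        # Reuse the parent's __init__, but pass 2 instead of the default 4
        super().__init__(name, legs=2)

    def speak(self) -> str:
        # Overrides Animal.speak; describe() is inherited unchanged
        return "Tweet"


def introduce(animal: Animal, times: int = 1) -> str:
    # Annotations are hints only; Python does not enforce them at runtime
    return " ".join([animal.speak()] * times)


pet = Bird("Kiwi")               # pet is an instance of Bird
print(pet.describe())            # Kiwi has 2 legs
print(introduce(pet, times=2))   # Tweet Tweet
print(isinstance(pet, Animal))   # True
```

The things to notice: `Bird` never defines `describe()`, so the call falls through to the parent class; `speak()` is overridden in the child; and the annotations (`name: str`, `-> str`) document the expected types without changing how the code runs.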
Additional notes:
Assessment 1 does not cover SQL because SQL will be tested when you do Assessment 2.
These topics don’t seem like a lot, but they actually cover quite a broad range of knowledge. Some useful study material and courses I personally recommend:
- Google Machine Learning Crash Course
And its related Glossary.
- Datacamp career track: Machine Learning Scientist with Python
(Note: this is a paid course. You can sign up for AISG’s premium membership which comes with a discount code for Datacamp. This is actually cheaper than the full price of Datacamp.)
- AI200 by Heicoders Academy.
This course is an 8-week commitment and is pretty intensive. Most if not all of the fees can be covered by SkillsFuture. I have not met anyone who has done this course and not found it useful.
If you want to join a serious and committed AIAP study group, join my Discord server here and drop me a direct message on Discord. (If you are new to Discord, don’t just preview the server, but register an account with your email address.)
https://discord.gg/zuNUDB5TTM
Or drop me an email via my contact page.