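I'd like to ask the instructors and everyone here: when an interview tests the ML workflow, what are interviewers looking for, and which points do I need to hit to meet their expectations? I know the basic ML workflow is roughly as follows:
Data Exploration: check data types, look at summary statistics, examine distributions and correlations
Feature Engineering: outlier handling, missing values, encoding categorical variables (one-hot encoding; ordinal encoding)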
Standardization or normalization (if needed)
Feature Selection: 1. Completeness of features; 2. Between variables (correlation); 3. Between independent variables and the dependent variable (filter method; wrapper method; model-embedded method)
Model Selection: baseline selection [is the baseline there to check for overfitting or underfitting?], k-fold cross-validation (scoring on the held-out validation folds), tuning the selected model on the validation set
Model Validation: ROC AUC, precision-recall curve, recall, precision, F1 on the test set
Model Serving: online training? personalization? how often to update the model
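For concreteness, the Model Selection and Model Validation steps above could be sketched roughly as follows. This is only a minimal, standard-library illustration under assumptions that are not in the original post: the data is synthetic, the model is a toy 1-D k-nearest-neighbours classifier, and every name (`k_fold`, `knn_predict`, `mean_cv_f1`, and so on) is hypothetical. In practice a library such as scikit-learn would normally do this work. It shows one common way to keep the roles apart: a majority-class baseline as a reference score, k-fold cross-validation on the development data to pick a hyperparameter, and a single final evaluation on the untouched test set.

```python
import random

def k_fold(indices, k):
    """Yield (train, val) index lists; each index is in exactly one val fold."""
    folds = [indices[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, val

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def knn_predict(train_x, train_y, xs, k):
    """Toy 1-D k-NN classifier: majority vote among the k nearest points."""
    preds = []
    for x in xs:
        nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        preds.append(1 if 2 * votes > k else 0)
    return preds

def take(values, indices):
    return [values[i] for i in indices]

# Synthetic binary-classification data (an assumption for illustration only).
rng = random.Random(0)
X = [rng.gauss(0, 1) for _ in range(500)]
y = [1 if x + rng.gauss(0, 0.5) > 0.3 else 0 for x in X]

# Hold out a test set once; everything else is development data.
idx = list(range(len(X)))
rng.shuffle(idx)
test_idx, dev_idx = idx[:100], idx[100:]

def mean_cv_f1(k_neighbors):
    """Average validation-fold F1 over 5-fold CV on the development data."""
    scores = []
    for train_idx, val_idx in k_fold(dev_idx, 5):
        preds = knn_predict(take(X, train_idx), take(y, train_idx),
                            take(X, val_idx), k_neighbors)
        scores.append(precision_recall_f1(take(y, val_idx), preds)[2])
    return sum(scores) / len(scores)

# Model selection: choose the hyperparameter using validation folds only.
candidates = [1, 5, 15]  # odd values avoid tied votes
best_k = max(candidates, key=mean_cv_f1)

# Baseline: always predict the majority class seen in the development data.
dev_y = take(y, dev_idx)
majority = 1 if 2 * sum(dev_y) > len(dev_y) else 0
baseline_f1 = precision_recall_f1(take(y, test_idx), [majority] * len(test_idx))[2]

# Model validation: one final evaluation on the held-out test set.
test_pred = knn_predict(take(X, dev_idx), dev_y, take(X, test_idx), best_k)
test_precision, test_recall, test_f1 = precision_recall_f1(take(y, test_idx), test_pred)
print(f"best k={best_k}  baseline F1={baseline_f1:.2f}  test F1={test_f1:.2f}")
```

The test indices are never touched during tuning, so the final F1 is not biased by the hyperparameter search; the baseline score only gives a floor that the tuned model should beat.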
I'd also like to know whether the workflow listed above misses any points that interviewers want to test.