This article discusses the details of the training/test datasets and related meta data, the sample/scoring datasets and related meta data, and the model meta data used in deployment.
Example (a few records from the purchase dataset). The target variable is Purchased.
User ID | Gender | Age | EstimatedSalary | Purchased |
15624510 | Male | 19 | 19000 | 0 |
15810944 | Male | 35 | 20000 | 0 |
15668575 | Female | 26 | 43000 | 0 |
15603246 | Female | 27 | 57000 | 0 |
15804002 | Male | 19 | 76000 | 0 |
15728773 | Male | 27 | 58000 | 0 |
The fields in the model meta data must match the variables passed to the Python function adapter; otherwise, model consumption will not work.
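The matching requirement above can be checked before deployment. The following is a minimal sketch, assuming both field lists are available as plain Python lists; `compare_fields` is a hypothetical helper written for illustration, not part of any product API.

```python
def compare_fields(model_metadata_fields, adapter_fields):
    """Report fields that appear in one list but not in the other."""
    expected = set(model_metadata_fields)
    passed = set(adapter_fields)
    return {
        "missing_from_adapter": sorted(expected - passed),
        "not_in_model_metadata": sorted(passed - expected),
    }


# Field names taken from the purchase example.
model_metadata_fields = ["Age", "EstimatedSalary", "Purchased"]
adapter_fields = ["Age", "EstimatedSalary", "Purchased"]

result = compare_fields(model_metadata_fields, adapter_fields)
# The lists match when both difference lists are empty.
fields_match = not any(result.values())
print(result, fields_match)
```

If either list in the result is non-empty, the named fields are the ones to add or remove before the model is consumed.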
Setting 1
- Fields in the training/test data sets (no need for meta data): User ID, Gender, Age, EstimatedSalary, Purchased
- Note: The training/test datasets are used in the training script/notebook and may have all the variables/fields
- Fields in the sample data set and related meta data: User ID, Gender, Age, EstimatedSalary
- Note: No target variable, but may contain the remaining fields
- Fields in the scoring data set and related meta data: User ID, Gender, Age, EstimatedSalary
- Note: No target variable, but may contain the remaining fields
- Fields in the model meta data: Age, EstimatedSalary, Purchased
- Note: Only contains fields/variables in the model
- Note: You may need to manually remove unnecessary fields from the generated meta data
- Fields passed to the Python function adapter in the report: Age, EstimatedSalary, Purchased
- Note: Only contains fields/variables in the model
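In Setting 1, the generated meta data may list fields that the model does not use. The sketch below shows one way to prune them. It assumes the generated meta data covers all five fields of the purchase dataset and that each field is represented as a dict with a `name` key; this layout, the `type` values, and the `prune_metadata` helper are illustrative assumptions, not the actual meta data format.

```python
# Hypothetical shape for generated meta data: one dict per field.
# The "type" values are illustrative assumptions, not a real schema.
generated_metadata = [
    {"name": "User ID", "type": "integer"},
    {"name": "Gender", "type": "string"},
    {"name": "Age", "type": "integer"},
    {"name": "EstimatedSalary", "type": "integer"},
    {"name": "Purchased", "type": "integer"},
]

# Fields the model uses, including the target variable.
MODEL_FIELDS = ["Age", "EstimatedSalary", "Purchased"]


def prune_metadata(metadata, model_fields):
    """Keep only the entries whose field name is used by the model."""
    keep = set(model_fields)
    return [entry for entry in metadata if entry["name"] in keep]


model_metadata = prune_metadata(generated_metadata, MODEL_FIELDS)
print([entry["name"] for entry in model_metadata])
```

The original list is left unchanged, so the full field list stays available for the sample and scoring meta data, which may keep User ID and Gender in this setting.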
Setting 2
- Fields in the training/test data sets (no need for meta data): User ID, Gender, Age, EstimatedSalary, Purchased
- Note: The training/test datasets are used in the training script/notebook and may have all the variables/fields
- Fields in the sample data set and related meta data: Age, EstimatedSalary
- Note: No target variable, and contains only variables in the model
- Fields in the scoring data set and related meta data: Age, EstimatedSalary
- Note: No target variable, and contains only variables in the model
- Fields in the model meta data: Age, EstimatedSalary, Purchased
- Note: Only contains fields/variables in the model
- Fields passed to the Python function adapter in the report: Age, EstimatedSalary, Purchased
- Note: Only contains fields/variables in the model
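In Setting 2, the sample and scoring datasets contain only the model's input variables. A minimal sketch of that projection follows, using two records from the purchase example; `to_scoring_rows` is a hypothetical helper, and representing records as dicts is an assumption made for illustration.

```python
TARGET = "Purchased"
MODEL_FIELDS = ["Age", "EstimatedSalary", "Purchased"]

# Two records from the purchase example.
records = [
    {"User ID": 15624510, "Gender": "Male", "Age": 19,
     "EstimatedSalary": 19000, "Purchased": 0},
    {"User ID": 15668575, "Gender": "Female", "Age": 26,
     "EstimatedSalary": 43000, "Purchased": 0},
]


def to_scoring_rows(rows, model_fields, target):
    """Keep only the model's input variables (model fields minus the target)."""
    inputs = [field for field in model_fields if field != target]
    return [{field: row[field] for field in inputs} for row in rows]


scoring_rows = to_scoring_rows(records, MODEL_FIELDS, TARGET)
print(scoring_rows)
```

Deriving the input list from the model fields keeps the scoring dataset aligned with the model meta data if the model's variables change later.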