Date: 2021-09-08

Let’s say you’re working with multiple models and want to pick which one to invoke based on your application’s use case. SageMaker Multi-Model Endpoints (MME) are a scalable, cost-efficient way to do this: you can host thousands of models behind a single endpoint and specify which one to invoke on each request. The main constraint with this inference option is that all of the models must use the same framework, so all TensorFlow or all PyTorch, not a mixture of both. If you need to combine several frameworks, check out SageMaker Multi-Container Endpoints instead.

For this article, we’ll keep things simple and bring two custom TensorFlow models. We’ll walk through the example end to end and see how each model can be selected and invoked with a simple Boto3 API call. Before getting started, please make sure to read the Prerequisites/Setup section, as a decent amount of AWS and ML knowledge is necessary to fully understand this demonstration. If you would just like to grab the code, check out the following link.
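To make the "one endpoint, many models" idea concrete before the full walkthrough, here is a minimal sketch of what selecting a model per request could look like. It relies on the `TargetModel` parameter of the Boto3 SageMaker runtime `invoke_endpoint` call. The endpoint name, the artifact names (`model_a.tar.gz`, `model_b.tar.gz`), and the TensorFlow Serving style `{"instances": ...}` payload are assumptions for illustration, not values from this article.

```python
import json


def build_invoke_args(endpoint_name, target_model, payload):
    """Build keyword arguments for the SageMaker runtime invoke_endpoint call."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        # TargetModel picks which artifact the multi-model endpoint loads;
        # it is a path relative to the endpoint's S3 model prefix.
        "TargetModel": target_model,
        "Body": json.dumps(payload),
    }


def invoke_model(runtime_client, endpoint_name, target_model, payload):
    """Invoke one specific model behind a multi-model endpoint."""
    response = runtime_client.invoke_endpoint(
        **build_invoke_args(endpoint_name, target_model, payload)
    )
    # The response body is a stream; read and decode it as JSON.
    return json.loads(response["Body"].read())


def demo():
    """Illustrative only: needs AWS credentials and a deployed endpoint."""
    import boto3  # imported lazily so the helpers above work without it

    runtime = boto3.client("sagemaker-runtime")
    payload = {"instances": [[1.0, 2.0, 3.0]]}  # assumed input shape
    # Hypothetical endpoint and artifact names.
    for model in ("model_a.tar.gz", "model_b.tar.gz"):
        print(model, invoke_model(runtime, "my-mme-endpoint", model, payload))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub; only `demo()` actually reaches AWS.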