I’m trying to build a public-facing web app that lets my users run inference against roughly ten available models. My initial thought was a basic front-end webpage that communicates with a REST API server running on an EC2 instance. Since I started planning this out in more detail, though, I’ve found a lot of information about various AWS products; they seem interesting, but it’s all pretty over my head.
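To make the initial idea concrete, here is a minimal sketch of the kind of REST server I have in mind, using only Python’s standard library. The model names (`echo`, `length`), the `/models/<name>/predict` route, and port 8080 are placeholders I made up for illustration; real models would replace the stand-in functions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical registry: model name -> callable taking a list of inputs.
# Real models would be loaded from disk or S3; these are stand-ins.
MODELS = {
    "echo": lambda inputs: inputs,
    "length": lambda inputs: [len(str(x)) for x in inputs],
}


def predict(model_name, inputs):
    """Return (status_code, response_body) for one inference request."""
    model = MODELS.get(model_name)
    if model is None:
        return 404, {"error": f"unknown model: {model_name}"}
    if not isinstance(inputs, list):
        return 400, {"error": "'inputs' must be a list"}
    return 200, {"model": model_name, "outputs": model(inputs)}


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumed route: /models/<name>/predict
        parts = self.path.strip("/").split("/")
        if len(parts) != 3 or parts[0] != "models" or parts[2] != "predict":
            self._send(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            body = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self._send(400, {"error": "invalid JSON"})
            return
        if not isinstance(body, dict):
            self._send(400, {"error": "request body must be a JSON object"})
            return
        status, response = predict(parts[1], body.get("inputs"))
        self._send(status, response)

    def _send(self, status, payload):
        # Serialize the payload and write a complete JSON response.
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8080):
    """Start the server; blocks until the process is interrupted."""
    HTTPServer(("0.0.0.0", port), InferenceHandler).serve_forever()
```

Calling `serve()` on the instance would start it, and the front end would POST JSON such as `{"inputs": ["ab", "cde"]}` to `/models/length/predict`. Everything runs in one process on one machine, so this only illustrates the request flow, not a scalable deployment.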
I initially came to the site because I heard about Elastic Inference. After researching it more, I get the impression that Amazon is encouraging people to use Inferentia2 instead. I realize that I could just use an EC2 instance, but I don’t know how well that will scale if this app becomes popular. I’ve also read a bit about SageMaker, API Gateway, and even “serverless” options like Lambda, but I don’t know whether those would integrate well with the low-cost inference products that AWS offers.
Any advice on setting this kind of thing up?