This code sample provides an end-to-end solution that manages the lifecycle of ML models deployed to PCs and mobile devices.
A web application built and deployed with AWS Amplify provides users with a UI to analyze a picture using an ML model (PyTorch Vision MobileNet V2). This pre-trained model, available on SageMaker JumpStart, is fine-tuned on a custom dataset. Once a prediction completes, inference metrics and the input image are pushed to the cloud.
Inference is performed on device, in the browser, in JavaScript using ONNX Runtime Web. ONNX Runtime Web can run on both CPU and GPU. On the CPU side, WebAssembly is used to execute the model at near-native speed. For GPU acceleration, ONNX Runtime Web leverages WebGL, a popular standard for accessing GPU capabilities.
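To make this concrete, below is a minimal TypeScript sketch (not taken from the sample) of creating an ONNX Runtime Web session that prefers the WebGL backend and falls back to WebAssembly, then running a classification. The model URL and the `input`/`output` tensor names are assumptions; they depend on how the model was exported.

```ts
import * as ort from 'onnxruntime-web';

// Create a session that tries the WebGL (GPU) backend first and falls back
// to the WebAssembly (CPU) backend.
async function createSession(modelUrl: string): Promise<ort.InferenceSession> {
  return ort.InferenceSession.create(modelUrl, {
    executionProviders: ['webgl', 'wasm'],
  });
}

// Run a single classification. MobileNet V2 expects a 1x3x224x224 float
// tensor (NCHW); the 'input'/'output' names here are assumptions.
async function classify(
  session: ort.InferenceSession,
  pixels: Float32Array,
): Promise<Float32Array> {
  const input = new ort.Tensor('float32', pixels, [1, 3, 224, 224]);
  const results = await session.run({ input });
  return results['output'].data as Float32Array;
}
```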
There are benefits to doing on-device and in-browser inference:
- Speed: inference is performed on the client, with models optimized to run on less powerful hardware
- Privacy: since the data never leaves the device for inference, it is a safer way to run predictions
- Offline: if the internet connection is lost, the model can still perform inference
- Cost: you can reduce cloud serving costs by offloading inference to the browser
This code sample sends inference metrics and the input image back to the cloud. If your scenario requires privacy or offline operation, you can disable this feature by commenting out lines 134 and 135 in the frontend.
See the following compatibility list for supported operating systems and browsers.
The idea is to use Amazon SageMaker to fine-tune an existing pre-trained model on a custom dataset, export the model to the ONNX format, and deploy it on a device. Each user accesses the web application, hosted in AWS Amplify, through the web browser on their laptop or mobile device, takes a picture, and performs image classification using the ML model. The web application collects metrics from the predictions, as well as the input image used for inference, and sends them to an Amazon API Gateway endpoint. AWS Lambda functions process the data and ingest it into Amazon CloudWatch Logs and an Amazon Simple Storage Service (Amazon S3) bucket. This data can then be visualized in an Amazon CloudWatch dashboard.
- A developer pushes some changes to the code repository containing the application code
- Through Amplify, a build is triggered and, once successful, the application is deployed
- Users can access the application from their devices using a web browser
- A data scientist uses SageMaker Studio to fine-tune a pre-trained ML model from Amazon SageMaker JumpStart
- The model is registered in the SageMaker Model Registry with a new version, awaiting approval
- A QA engineer manually validates the model version
- When a model is approved, an Amazon EventBridge rule triggers a new deployment (see the rule sketch after this list)
- The model is exported to the ONNX format and/or optimized for the target device
- The model is stored in an Amazon Simple Storage Service (Amazon S3) bucket
- On their mobile device, the user uses the application to authenticate to the cloud through Amazon Cognito
- The application sends a request through an Amazon API Gateway endpoint to check whether a new version of the model is available. If so, a presigned S3 URL is generated (see the presigned URL sketch after this list). Authenticated through Cognito, the application downloads the model from S3 to a local directory
- The model is unpacked and the application loads a new ONNX Runtime inference session with the new model (see the session reload sketch after this list)
- The user takes a picture with their mobile device camera and loads the image through the mobile app
- The application runs a prediction based on the acquired image
- Application logs are captured and published to an API Gateway endpoint. Input images are also uploaded to an S3 bucket
- A Lambda function parses the application logs and the parsed data is ingested into Amazon CloudWatch Logs (see the log-ingestion sketch after this list)
- A data scientist can access the CloudWatch dashboard and visualize information about the predictions, as well as the storage path of the input images in S3
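For reference, the EventBridge rule in the deployment step typically matches SageMaker Model Registry approval events. The following CDK TypeScript sketch shows one way to express it; the construct name and target function are hypothetical, not the sample's actual code.

```ts
import * as cdk from 'aws-cdk-lib';
import * as events from 'aws-cdk-lib/aws-events';
import * as targets from 'aws-cdk-lib/aws-events-targets';
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Trigger a (hypothetical) deployment function whenever a model package in
// the SageMaker Model Registry transitions to 'Approved'.
export function addModelApprovalRule(
  scope: cdk.Stack,
  deployFn: lambda.IFunction,
): events.Rule {
  return new events.Rule(scope, 'ModelApprovedRule', {
    eventPattern: {
      source: ['aws.sagemaker'],
      detailType: ['SageMaker Model Package State Change'],
      detail: { ModelApprovalStatus: ['Approved'] },
    },
    targets: [new targets.LambdaFunction(deployFn)],
  });
}
```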
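The model-download step relies on a presigned S3 URL. Below is a hedged sketch of a Lambda handler that produces one with the AWS SDK for JavaScript v3; the environment variable names and the response shape are assumptions.

```ts
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

const s3 = new S3Client({});

// Return a time-limited GET URL for the latest model artifact so the
// browser can download it directly from S3.
export const handler = async () => {
  const command = new GetObjectCommand({
    Bucket: process.env.MODEL_BUCKET!, // hypothetical bucket name variable
    Key: process.env.MODEL_KEY!,       // hypothetical key for the latest model
  });
  const modelUrl = await getSignedUrl(s3, command, { expiresIn: 900 }); // 15 minutes
  return { statusCode: 200, body: JSON.stringify({ modelUrl }) };
};
```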
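On the client, reloading the runtime session after a model update could look like the sketch below: fetch the model through the presigned URL and create a new ONNX Runtime Web session from the raw bytes. Function and variable names are illustrative.

```ts
import * as ort from 'onnxruntime-web';

let session: ort.InferenceSession | undefined;

// Download the new model via the presigned URL and swap in a fresh
// inference session; creating it from bytes works directly for a
// single-file ONNX model.
async function refreshModel(presignedUrl: string): Promise<void> {
  const response = await fetch(presignedUrl);
  const bytes = new Uint8Array(await response.arrayBuffer());
  session = await ort.InferenceSession.create(bytes, {
    executionProviders: ['webgl', 'wasm'],
  });
}
```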
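Finally, the log-ingestion Lambda conceptually parses the metrics payload and forwards it to CloudWatch Logs. A minimal sketch, assuming a JSON request body and pre-created log group/stream names passed via environment variables:

```ts
import {
  CloudWatchLogsClient,
  PutLogEventsCommand,
} from '@aws-sdk/client-cloudwatch-logs';
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

const logs = new CloudWatchLogsClient({});

// Parse the metrics posted by the web application and write them as a log
// event; the group and stream names here are assumptions.
export const handler = async (
  event: APIGatewayProxyEvent,
): Promise<APIGatewayProxyResult> => {
  const metrics = JSON.parse(event.body ?? '{}');
  await logs.send(new PutLogEventsCommand({
    logGroupName: process.env.LOG_GROUP!,
    logStreamName: process.env.LOG_STREAM!,
    logEvents: [{ timestamp: Date.now(), message: JSON.stringify(metrics) }],
  }));
  return { statusCode: 200, body: JSON.stringify({ status: 'ok' }) };
};
```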
Follow the instructions in the dedicated README to deploy the backend stack.
Once the backend is deployed, follow the instructions in the dedicated README to deploy the frontend stack.
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
This solution collects anonymous operational metrics to help AWS improve the quality and features of the solution. Data collection is subject to the AWS Privacy Policy (https://aws.amazon.com/privacy/). To opt out of this feature, simply remove the tag(s) starting with “uksb-” or “SO” from the description(s) in any CloudFormation templates or CDK TemplateOptions (app.py).