Design Status

This is a prototype.

Purpose

This is a tool that automatically monitors text for toxicity. It's powered by BERT, the language model created by Google.

This is a cloud-based microservice. Text such as forum comments and customer feedback can be sent to the model automatically for analysis. The model responds with a toxicity probability score.

API Endpoint: http://139.59.50.72/predict
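
The request and response formats are not documented here, so the following Python snippet is only a rough sketch of how a client might call the endpoint. It assumes the service accepts a POST with a JSON body containing a "text" field and replies with JSON containing a "toxic_probability" field between 0 and 1. Both field names, and the helper function names, are hypothetical and should be checked against the published notebook.

```python
import json
import urllib.request

# Endpoint taken from this document.
API_URL = "http://139.59.50.72/predict"


def build_request(text, url=API_URL):
    """Build the HTTP request for one piece of text.

    ASSUMPTION: the service expects a POST with a JSON body of the
    form {"text": "..."}. The field name "text" is hypothetical.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_score(body):
    """Extract the toxicity probability from a response body.

    ASSUMPTION: the response is JSON with a numeric field named
    "toxic_probability". The field name is hypothetical.
    """
    data = json.loads(body)
    score = float(data["toxic_probability"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return score


def get_toxicity(text, timeout=10):
    """Send text to the service and return the parsed score."""
    request = build_request(text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_score(response.read().decode("utf-8"))


# Example call (only works while the demo service is reachable):
#     print(get_toxicity("Thanks for the helpful answer."))
```

Keeping request building and response parsing in separate functions means that only those two functions need to change if the real field names differ from the ones assumed here.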

Limitations

- The model's prediction accuracy has not been field tested. Its prediction speed also needs to be optimized.

- To improve reliability, this prototype runs on two load-balanced servers, each with 2 vCPUs and 4 GB of RAM. This backend setup has not been stress tested.

- This demo site will be live until mid-May 2020.

Published Design

The design code and the step-by-step process used to fine-tune the model have been published on Kaggle. You can find the open-source notebook here.

Dataset Licence

The model was fine-tuned using data made available during the Kaggle Jigsaw Multilingual Toxic Comment Classification competition. The data is licensed for any use.