How do you ensure horizontal and vertical scalability?
Screena processes each request with multiple threads. The response time per request (latency) depends on the resources of the server on which Screena is installed. To reduce latency, we can increase the number of CPUs on that server (vertical scaling). For higher volumes, we can also use GPU processing.
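As a rough illustration of how extra CPUs affect latency, the sketch below applies Amdahl's law, a general model for multi-threaded work. It is not a measurement of Screena: the function name and the `parallel_fraction` value are hypothetical, and the real figure would have to be obtained by benchmarking.

```python
def estimated_latency_ms(single_cpu_latency_ms: float,
                         parallel_fraction: float,
                         cpus: int) -> float:
    """Estimate per-request latency on `cpus` CPUs using Amdahl's law.

    parallel_fraction is the share of the work that can be spread across
    threads (an assumed value, not a Screena figure).
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    # The serial part does not shrink when CPUs are added.
    serial_ms = single_cpu_latency_ms * (1.0 - parallel_fraction)
    # The parallel part is divided across the available CPUs.
    parallel_ms = single_cpu_latency_ms * parallel_fraction / cpus
    return serial_ms + parallel_ms


if __name__ == "__main__":
    # Hypothetical request: 100 ms on one CPU, 90% of the work parallelizable.
    for n in (1, 2, 4, 8):
        print(n, "CPUs:", round(estimated_latency_ms(100.0, 0.9, n), 2), "ms")
```

Under this model the gain from each additional CPU shrinks as the serial part of the request starts to dominate, which is one reason to test on the target hardware before sizing a server.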
To increase the number of transactions processed per second (throughput), we can increase the number of VMs behind the load balancer (horizontal scaling, for example with an auto-scaling group in AWS). Because each VM handles its own share of the traffic, the number of requests per second can be increased easily without impacting the average response time.
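For capacity planning, the number of VMs can be estimated from the target throughput and the throughput one VM sustains at an acceptable latency. The following is a minimal sketch under stated assumptions: the function name, the `headroom` parameter, and the example figures are hypothetical and do not come from Screena benchmarks.

```python
import math


def vms_needed(target_rps: float, per_vm_rps: float, headroom: float = 0.0) -> int:
    """Estimate how many VMs to place behind the load balancer.

    target_rps: requests per second the whole deployment must absorb.
    per_vm_rps: requests per second one VM sustains (measured, assumed here).
    headroom:   fraction of each VM's capacity kept free for traffic spikes.
    """
    if target_rps < 0:
        raise ValueError("target_rps must not be negative")
    if per_vm_rps <= 0:
        raise ValueError("per_vm_rps must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = per_vm_rps * (1.0 - headroom)
    # Round up, and always keep at least one VM running.
    return max(1, math.ceil(target_rps / usable_rps))


if __name__ == "__main__":
    # Hypothetical figures: 1000 requests/s target, 50 requests/s per VM.
    print(vms_needed(1000, 50))
    print(vms_needed(1000, 50, headroom=0.5))
```

The result can serve as a starting point for the minimum and maximum size of the auto-scaling group, which then adjusts the actual VM count to the observed load.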