# How do you ensure horizontal and vertical scalability?

Screena uses multiple threads per request. The response time per request (**latency**) depends on the resources of the server on which Screena is installed. To reduce latency, we can increase the **number of CPUs** (vertical scaling). For higher volumes, we can also use **GPU processing**.

To increase the number of transactions processed per second (**throughput**), we can increase the **number of VMs** behind the load balancer (an **auto-scaling** group in AWS), which is horizontal scaling. The number of requests handled per second can therefore be increased easily without affecting the average response time.
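
As a rough illustration of the relationship between latency, throughput, and the number of VMs, here is a back-of-the-envelope sketch. It assumes that each VM handles a fixed number of requests concurrently and that the load balancer spreads traffic evenly, so latency stays constant as VMs are added. The function names and all figures are hypothetical; they are not Screena measurements or part of any Screena API.

```python
import math


def estimate_throughput(num_vms: int, concurrent_per_vm: int, latency_ms: float) -> float:
    """Estimate requests per second for a pool of identical VMs.

    Assumption: each VM serves `concurrent_per_vm` requests at a time and
    each request takes `latency_ms` milliseconds, regardless of pool size.
    """
    per_vm_rps = concurrent_per_vm * 1000 / latency_ms
    return num_vms * per_vm_rps


def vms_needed(target_rps: float, concurrent_per_vm: int, latency_ms: float) -> int:
    """Smallest number of VMs that reaches `target_rps` under the same assumption."""
    per_vm_rps = concurrent_per_vm * 1000 / latency_ms
    return math.ceil(target_rps / per_vm_rps)


# Hypothetical figures: 8 concurrent requests per VM, 200 ms per request.
print(estimate_throughput(num_vms=4, concurrent_per_vm=8, latency_ms=200))
print(vms_needed(target_rps=500, concurrent_per_vm=8, latency_ms=200))
```

Under these assumptions, throughput grows linearly with the number of VMs while the per-request latency is unchanged. Real capacity planning should rely on load tests against the actual deployment.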
