ANN

Achieving Scalable Performance on Commodity Hardware for Large Model Training

Training large-scale artificial intelligence models is a defining computational challenge, and it is often seen as the preserve of those with unrestricted access to state-of-the-art supercomputing infrastructure. Recent work, however, has shown that strong performance is achievable under resource constraints and on heterogeneous or restricted hardware. The key is a combination of software optimization, algorithmic innovation, and a deep understanding of the underlying system architecture. This post covers the technical strategies that enable high-throughput training, focusing on parallelism, communication, and low-precision arithmetic.

Loss Landscape

A very nice WebGL application for visualizing the loss landscapes of some common ANNs. It currently features ResNet-20 (short/no-short), ResNet-56 (short/no-short), VGG-16, and DenseNet-121.

http://www.telesens.co/loss-landscape-viz/viewer.html

More info: