Spectral tensor layers in distributed deep learning
This technology is a distributed deep learning method that uses a spectral tensor layer to form parallel branches that are trained independently and then combined into the overall network, eliminating communication overhead between nodes.
Unmet Need: Reduced communication overhead in distributed deep learning
Distributing the training of large neural networks across multiple nodes improves efficiency and reduces processing time. Typically, training uses Stochastic Gradient Descent (SGD), which requires repeated rounds of communication between nodes to exchange gradients or model parameters, driving up overall communication costs.
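For illustration, the sketch below simulates synchronous data-parallel SGD on a toy least-squares problem; the gradient-averaging step is where a real deployment would perform a network all-reduce on every iteration. The problem sizes, learning rate, and shard layout are illustrative assumptions, not part of the original description.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, n_local, p = 4, 64, 8

# A shared ground-truth model and one private data shard per worker.
w_true = rng.normal(size=p)
shards = []
for _ in range(n_workers):
    A = rng.normal(size=(n_local, p))
    b = A @ w_true + 0.1 * rng.normal(size=n_local)
    shards.append((A, b))

w = np.zeros(p)
for step in range(200):
    # Each worker computes a gradient on its own shard.
    grads = [A.T @ (A @ w - b) / n_local for A, b in shards]
    # Synchronization point: in a real cluster this average is an
    # all-reduce over the network, repeated every step -- the
    # communication overhead this technology aims to remove.
    w -= 0.05 * np.mean(grads, axis=0)

print("distance to w_true:", np.linalg.norm(w - w_true))
```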
The Technology: Elimination of communication costs via spectral tensor layer
This technology is a distributed learning method that uses a spectral tensor layer to eliminate communication overhead. The data is represented in tensor form and passed through a spectral (linear) transformation that splits the network into a series of parallel branches. Each branch is trained independently, with no communication between nodes, and the branches are then ensembled to form the overall network. The result is a communication-free approach to distributed deep learning that also reduces storage and yields parallel speedup.
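The sketch below illustrates the decoupling idea on a toy tensor regression, assuming the spectral transform is a discrete Fourier transform along the third tensor mode (as in t-product based tensor algebra): each frequency slice becomes an independent branch that can be trained on its own node with no gradient exchange, and the branch outputs are ensembled by inverting the transform. The dimensions, linear branch models, and training loop are illustrative assumptions, not the inventors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, d = 256, 8, 4                  # samples, features, third-mode depth

# Toy data tensor and ground-truth filter; the target is the t-product
# (circular convolution along the third mode) of X with w_true.
X = rng.normal(size=(n, p, d))
w_true = rng.normal(size=(p, d))
Y = np.zeros((n, d))
for j in range(d):
    for m in range(d):
        Y[:, j] += X[:, :, m] @ w_true[:, (j - m) % d]
Y += 0.1 * rng.normal(size=(n, d))

# The DFT along the third mode block-diagonalizes the t-product, so
# each frequency slice becomes an independent linear problem.
Xf = np.fft.fft(X, axis=2)
Yf = np.fft.fft(Y, axis=1)

# Each branch trains on its own slice with plain gradient descent;
# no gradients or models are exchanged between branches.
weights = []
for k in range(d):
    Xk, Yk = Xf[:, :, k], Yf[:, k]
    wk = np.zeros(p, dtype=complex)
    for _ in range(500):
        wk -= 0.1 * Xk.conj().T @ (Xk @ wk - Yk) / n
    weights.append(wk)

# Ensemble the branch outputs and invert the spectral transform.
Yf_hat = np.stack([Xf[:, :, k] @ weights[k] for k in range(d)], axis=1)
Y_hat = np.fft.ifft(Yf_hat, axis=1).real
print("relative error:", np.linalg.norm(Y_hat - Y) / np.linalg.norm(Y))
```

The point of the sketch is that the spectral transform itself, rather than any communication protocol, is what makes the branches independent.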
This technology has been validated on multiple independent datasets.
Applications:
- Large language models
- Parallel and distributed implementations
- Federated learning
- Image classification
Advantages:
- Elimination of communication overhead
- Network compression
- Computation reduction
- Parallel speedup
- High accuracy
Lead Inventor:
Patent Information:
Patent Pending
Related Publications:
Tech Ventures Reference:
IR CU24166
Licensing Contact: Greg Maskel
