Spectral tensor layers in distributed deep learning

This technology is a distributed deep learning method that uses a spectral tensor layer to split a neural network into parallel branches that can be trained without inter-node communication and then ensembled into the overall network.

Unmet Need: Reduced communication overhead in distributed deep learning

Distributing deep learning across multiple nodes improves efficiency and reduces processing time when training large neural networks. Training typically uses Stochastic Gradient Descent (SGD), which requires repeated rounds of communication between nodes to exchange gradients or model parameters, driving up overall communication costs.
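To make the cost concrete, below is a minimal sketch (illustrative, not from the source) of synchronous data-parallel SGD, simulated in NumPy on a toy least-squares model. The gradient averaging at the end of each step is, on a real cluster, an all-reduce sent over the network on every iteration; that recurring exchange is the communication overhead described above. All names and sizes here are hypothetical.

```python
import numpy as np

# Toy synchronous data-parallel SGD on a linear least-squares model.
# Each "worker" holds a shard of the data; every step, all workers must
# average their gradients (an all-reduce) before applying the update.
rng = np.random.default_rng(0)
K = 4                                   # number of simulated workers
X = rng.normal(size=(800, 10))          # features
w_true = rng.normal(size=10)
y = X @ w_true + 0.01 * rng.normal(size=800)

shards = np.array_split(np.arange(800), K)   # one data shard per worker
w = np.zeros(10)                             # model replicated on all workers
lr = 0.1

for step in range(200):
    # Local compute: each worker uses only its own shard.
    grads = []
    for s in shards:
        r = X[s] @ w - y[s]
        grads.append(X[s].T @ r / len(s))
    # Communication: average gradients across workers. On a real cluster
    # this line is network traffic incurred every single step.
    g = np.mean(grads, axis=0)
    w -= lr * g                              # identical update everywhere

print("error:", np.linalg.norm(w - w_true))
```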

The Technology: Elimination of communication costs via spectral tensor layer

This technology is a distributed learning method that uses a spectral tensor layer to eliminate communication overhead. The data is represented in tensor form and passed through a linear spectral transformation that decomposes the network into a series of parallel branches. Each branch is trained independently, eliminating inter-node communication, and the branches are then ensembled to form the overall network. The result is a communication-free approach to distributed deep learning that also reduces storage and delivers parallel speedup.
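The listing does not specify the exact spectral transformation, so the sketch below assumes one plausible reading: a DFT along one tensor mode (as in t-product-style tensor layers), which block-diagonalizes a tensor-linear layer so that each frequency slice becomes an independent branch. The tensor sizes, the synthetic circular-shift task, and the closed-form per-branch fit are all illustrative assumptions, not the patented method.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, N = 8, 16, 512           # slice size, number of branches, samples

# Synthetic regression task: inputs and targets as (N, m, n) tensors.
X = rng.normal(size=(N, m, n))
Y = 1.5 * np.roll(X, 2, axis=2) + 0.01 * rng.normal(size=(N, m, n))

# Spectral transform: a DFT along the last tensor mode block-diagonalizes
# a tensor-linear layer into n independent frequency branches.
Xf = np.fft.fft(X, axis=2)     # (N, m, n), complex
Yf = np.fft.fft(Y, axis=2)

# Each branch j fits its own m x m weight matrix on frequency slice j.
# Branches share no parameters and exchange no gradients, so each could
# live on a different node with zero training communication.
W = np.empty((n, m, m), dtype=complex)
for j in range(n):
    A, B = Xf[:, :, j], Yf[:, :, j]               # (N, m) each
    W[j], *_ = np.linalg.lstsq(A, B, rcond=None)  # independent branch fit

# Inference: run all branches, invert the spectral transform, and assemble
# (ensemble) the branch outputs into the overall network's output.
def predict(x):                # x: (m, n)
    xf = np.fft.fft(x, axis=1)                  # to the spectral domain
    yf = np.stack([xf[:, j] @ W[j] for j in range(n)], axis=1)
    return np.fft.ifft(yf, axis=1).real         # back to the data domain

err = np.linalg.norm(predict(X[0]) - Y[0]) / np.linalg.norm(Y[0])
print(f"relative error on one sample: {err:.3f}")
```

Because the transform block-diagonalizes the layer, the branches are mathematically independent; under this reading, that independence is what removes gradient exchange during training, while each node stores only its own slice of the weights.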

This technology has been validated on multiple independent datasets.

Applications:

  • Large language models
  • Parallel and distributed implementations
  • Federated learning
  • Image classification

Advantages:

  • Elimination of communication overhead
  • Network compression
  • Computation reduction
  • Parallel speedup
  • High accuracy

Lead Inventor:

Xiaodong Wang, Ph.D.

Patent Information:

Patent Pending

Tech Ventures Reference:

CU24166

Quick Facts:

Tags: Data communication, Deep learning, Linear map, Speedup, Stochastic gradient descent, Tensor
Inventors: Xiaodong Wang, Xiaoyang Liu
Manager: Greg Maskel
Departments: Electrical Engineering
Divisions: Fu Foundation School of Engineering and Applied Science (SEAS)
Reference Number: CU24166
Release Date: 2026-03-06