Spectral tensor layers in distributed deep learning
This technology is a distributed deep learning method that uses a spectral tensor layer to form parallel branches that are trained independently and then combined into the overall network, eliminating communication overhead between nodes.
Unmet Need: Reduced communication overhead in distributed deep learning
Distributing the training of large neural networks across multiple nodes improves efficiency and reduces processing time. Typically, training uses Stochastic Gradient Descent (SGD), which requires repeated rounds of communication between nodes to exchange gradients or model parameters, driving up overall communication costs.
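For illustration, the sketch below simulates synchronous data-parallel SGD on a toy least-squares problem; the gradient-averaging step is where a real deployment would perform a network all-reduce on every iteration. The problem sizes, learning rate, and shard layout are illustrative assumptions, not part of the original description.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, n_local, p = 4, 64, 8

# A shared ground-truth model and one private data shard per worker.
w_true = rng.normal(size=p)
shards = []
for _ in range(n_workers):
    A = rng.normal(size=(n_local, p))
    b = A @ w_true + 0.1 * rng.normal(size=n_local)
    shards.append((A, b))

w = np.zeros(p)
for step in range(200):
    # Each worker computes a gradient on its own shard.
    grads = [A.T @ (A @ w - b) / n_local for A, b in shards]
    # Synchronization point: in a real cluster this average is an
    # all-reduce over the network, repeated every step -- the
    # communication overhead this technology aims to remove.
    w -= 0.05 * np.mean(grads, axis=0)

print("distance to w_true:", np.linalg.norm(w - w_true))
```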
The Technology: Elimination of communication costs via spectral tensor layer
This technology is a distributed learning method that uses a spectral tensor layer to eliminate communication overhead. The data is represented in tensor form and passed through a spectral (linear) transformation that splits the network into a series of parallel branches. Each branch is trained independently, with no communication between nodes, and the branches are then ensembled to form the overall network. The result is a communication-free approach to distributed deep learning that also reduces storage and yields parallel speedup.
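The sketch below illustrates the decoupling idea on a toy tensor regression, assuming the spectral transform is a discrete Fourier transform along the third tensor mode (as in t-product based tensor algebra): each frequency slice becomes an independent branch that can be trained on its own node with no gradient exchange, and the branch outputs are ensembled by inverting the transform. The dimensions, linear branch models, and training loop are illustrative assumptions, not the inventors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, d = 256, 8, 4                  # samples, features, third-mode depth

# Toy data tensor and ground-truth filter; the target is the t-product
# (circular convolution along the third mode) of X with w_true.
X = rng.normal(size=(n, p, d))
w_true = rng.normal(size=(p, d))
Y = np.zeros((n, d))
for j in range(d):
    for m in range(d):
        Y[:, j] += X[:, :, m] @ w_true[:, (j - m) % d]
Y += 0.1 * rng.normal(size=(n, d))

# The DFT along the third mode block-diagonalizes the t-product, so
# each frequency slice becomes an independent linear problem.
Xf = np.fft.fft(X, axis=2)
Yf = np.fft.fft(Y, axis=1)

# Each branch trains on its own slice with plain gradient descent;
# no gradients or models are exchanged between branches.
weights = []
for k in range(d):
    Xk, Yk = Xf[:, :, k], Yf[:, k]
    wk = np.zeros(p, dtype=complex)
    for _ in range(500):
        wk -= 0.1 * Xk.conj().T @ (Xk @ wk - Yk) / n
    weights.append(wk)

# Ensemble the branch outputs and invert the spectral transform.
Yf_hat = np.stack([Xf[:, :, k] @ weights[k] for k in range(d)], axis=1)
Y_hat = np.fft.ifft(Yf_hat, axis=1).real
print("relative error:", np.linalg.norm(Y_hat - Y) / np.linalg.norm(Y))
```

The point of the sketch is that the spectral transform itself, rather than any communication protocol, is what makes the branches independent.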
This technology has been validated on multiple independent datasets.
Applications:
- Large language models
- Parallel and distributed implementations
- Federated learning
- Image classification
Advantages:
- Elimination of communication overhead
- Network compression
- Computation reduction
- Parallel speedup
- High accuracy
Lead Inventor:
Patent Information:
Patent Pending
Related Publications:
Tech Ventures Reference:
IR CU24166
Licensing Contact: Greg Maskel
