SuperNeurons: Dynamic GPU Memory Management for Training Deep Neural Networks

Problem

Memory optimization for training a single DNN on a single GPU

Method

  • Liveness Analysis — keep only tensors that will still be used in the future; everything else is freed immediately (first sketch after this list).
  • Unified Tensor Pool — handles GPU-CPU memory transfers with a Least Recently Used (LRU) tensor replacement policy and a tensor cache; only tensors of computation-intensive layers such as conv are offloaded to CPU (second sketch below).
  • Cost-Aware Re-computation — nothing is recomputed as long as usage stays below peak memory; once it would exceed the peak, everything except conv (pool, activation, ...) is dropped and recomputed during the backward pass (third sketch below).
  • Selective Convolution Algorithm — picks the convolution algorithm based on the gap between current usage and peak memory (e.g., FFT-based convolution requires extra workspace) (fourth sketch below).
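
A minimal liveness-analysis sketch in Python, assuming a linear layer schedule; the dict-based layer records and the `free_fn` callback are illustrative placeholders, not SuperNeurons' actual API. A tensor is freed as soon as the last layer that reads it has run:

```python
def last_use_steps(layers):
    """Map each tensor name to the index of the last layer that reads it."""
    last = {}
    for step, layer in enumerate(layers):
        for name in layer["inputs"]:
            last[name] = step
    return last

def run_with_liveness(layers, free_fn):
    last = last_use_steps(layers)
    for step, layer in enumerate(layers):
        # ... execute the layer here ...
        for name in layer["inputs"]:
            if last[name] == step:  # no future layer reads this tensor
                free_fn(name)       # release its GPU memory now

layers = [
    {"name": "conv1", "inputs": ["x"]},
    {"name": "relu1", "inputs": ["conv1_out"]},
    {"name": "conv2", "inputs": ["relu1_out", "x"]},  # reuse of x keeps it alive
]
run_with_liveness(layers, free_fn=lambda name: print("free", name))
# frees conv1_out after step 1; frees relu1_out and x after step 2
```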
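A toy version of the unified tensor pool, assuming an OrderedDict-backed LRU; `UnifiedTensorPool` and its methods are invented names for illustration, not the paper's code. Offloadable (conv) tensors are evicted to CPU in least-recently-used order when the GPU budget is exceeded, and faulted back in on access:

```python
from collections import OrderedDict

class UnifiedTensorPool:
    """Toy GPU/CPU tensor pool with LRU offloading (illustrative only)."""

    def __init__(self, gpu_budget_bytes):
        self.budget = gpu_budget_bytes
        self.gpu = OrderedDict()  # name -> (size, offloadable); front = LRU
        self.cpu = {}             # tensors offloaded to host memory
        self.used = 0

    def put(self, name, size, offloadable):
        self._reserve(size)
        self.gpu[name] = (size, offloadable)
        self.used += size

    def access(self, name):
        if name in self.cpu:                 # fault: copy back to GPU
            size, offloadable = self.cpu.pop(name)
            self.put(name, size, offloadable)
        self.gpu.move_to_end(name)           # mark as most recently used

    def _reserve(self, size):
        # Evict least-recently-used *offloadable* tensors (conv tensors)
        # until the new tensor fits; non-offloadable ones stay resident,
        # so the budget can still be exceeded if everything is pinned.
        for victim in list(self.gpu):
            if self.used + size <= self.budget:
                break
            vsize, offloadable = self.gpu[victim]
            if offloadable:
                self.cpu[victim] = self.gpu.pop(victim)
                self.used -= vsize

pool = UnifiedTensorPool(gpu_budget_bytes=100)
pool.put("conv1_out", 60, offloadable=True)
pool.put("fc1_out", 50, offloadable=False)  # offloads conv1_out to CPU
pool.access("conv1_out")                    # faults conv1_out back in
```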
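A sketch of the cost-aware recomputation rule as stated above; the layer records and `peak_budget` parameter are hypothetical. Cheap layers (pool, activation, ...) are marked for recomputation only while the running footprint exceeds the peak-memory budget, and conv outputs are always kept:

```python
CHEAP_TO_RECOMPUTE = {"pool", "relu", "batchnorm", "dropout"}  # not conv

def plan_recomputation(layers, peak_budget):
    """Return 'keep' or 'recompute' per layer output (illustrative sketch)."""
    plan, footprint = {}, 0
    for layer in layers:
        footprint += layer["out_bytes"]
        drop = footprint > peak_budget and layer["type"] in CHEAP_TO_RECOMPUTE
        plan[layer["name"]] = "recompute" if drop else "keep"
        if drop:                       # dropped now, rebuilt in backward
            footprint -= layer["out_bytes"]
    return plan

layers = [
    {"name": "conv1", "type": "conv", "out_bytes": 4},
    {"name": "relu1", "type": "relu", "out_bytes": 4},
    {"name": "pool1", "type": "pool", "out_bytes": 2},
]
print(plan_recomputation(layers, peak_budget=6))
# {'conv1': 'keep', 'relu1': 'recompute', 'pool1': 'keep'}
```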
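A sketch of the selective convolution choice; the algorithm list and workspace sizes are made up (real numbers would come from workspace queries such as cuDNN's). The fastest algorithm whose extra workspace fits in the headroom between current usage and peak memory is picked:

```python
# Candidates ordered fastest-first, with hypothetical extra workspace needs.
CONV_ALGOS = [
    ("fft",           512 * 2**20),  # fastest, large extra workspace
    ("winograd",      128 * 2**20),
    ("implicit_gemm", 0),            # slowest, no extra workspace
]

def pick_conv_algo(current_bytes, peak_bytes):
    headroom = peak_bytes - current_bytes  # memory left before the peak
    for name, workspace in CONV_ALGOS:
        if workspace <= headroom:
            return name                    # fastest algorithm that fits
    return "implicit_gemm"                 # fallback needing no workspace

print(pick_conv_algo(current_bytes=7 * 2**30, peak_bytes=8 * 2**30))  # fft
```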