Enhance the CPU Power of PaddlePaddle #7561

@luotao1

Description

PaddlePaddle should enhance its CPU performance for several reasons:

  • For companies, the CPU cloud clusters in use are far larger and more numerous than the GPU clusters.
  • For individual users, most laptops have only a CPU and no discrete GPU.
  • Some workloads, such as RNNs and sparse updates, are better suited to the CPU.

Thus, in line with PaddlePaddle's 10 focus areas for 2018, the following CPU-related work should be considered:

  1. Fluid framework (discussed with @jacquesqiao): Add an automatic data-format transform mechanism for different libraries and hardware. Add new Fluid ops backed by the MKL-DNN/MKL libraries.
  2. Multi-device, multi-thread (discussed with @jacquesqiao): Use the full capacity of the CPU when multi-threading on both the MKL and OpenBLAS libraries (see the CPU-training sketch after this list).
  3. Distribution (discussed with @typhoonzero): Improve distributed training with LARS. Improve distributed training with LARS #6811
  4. Benchmarking (discussed with @dzhwinter): CPU benchmarking is part of the overall benchmarking effort. Provide both training and inference benchmarks across the different math libraries, and compare them with other DL frameworks.
  5. Documentation: Enhance the documentation and select some models to show users 1) when and how to choose the most efficient math library, such as the applicable range of the MKL-DNN library; and 2) how to use the full capacity of the CPU.
  6. Training (discussed with @wanghaoshuang): With better documentation and reference models, user training becomes much easier.
  7. NLP support (discussed with @lcy-seso): Enhance the CPU performance of NLP ops (mainly RNN/LSTM/GRU) based on their specific workloads.
  8. Speech support (discussed with @kuke): Enhance the CPU performance of speech ops (mainly RNN/CNN) based on their specific workloads.
  9. Image support (discussed with @qingqing01): Enhance the CPU performance of image ops (mainly CNN and detection) based on their specific workloads.
  10. Inference (discussed with @Xreki): Enhance the CPU performance of the forward computation of inference ops based on their specific workloads. Deployment should be easy: no need to change or configure the inference API in order to use MKL-DNN-related operators (see the inference sketch below).
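
As a rough illustration of items 2 and 5, the sketch below shows how a user can already cap math-library threading and train a small network entirely on CPU. It is a minimal sketch, not a proposed design: OMP_NUM_THREADS and MKL_NUM_THREADS are the standard OpenMP/MKL environment variables (also honoured by an OpenMP build of OpenBLAS), the toy network and thread count are made up for illustration, and the import path may differ between releases (e.g. paddle.v2.fluid in older versions).

```python
import os

# Cap math-library threading before the framework (and MKL/OpenBLAS) is loaded.
# The value "4" is only an example; it should match the physical cores available.
os.environ.setdefault("OMP_NUM_THREADS", "4")
os.environ.setdefault("MKL_NUM_THREADS", "4")

import numpy as np
import paddle.fluid as fluid

# A toy regression network: one fully connected layer trained on random data.
x = fluid.layers.data(name="x", shape=[13], dtype="float32")
y = fluid.layers.data(name="y", shape=[1], dtype="float32")
pred = fluid.layers.fc(input=x, size=1)
loss = fluid.layers.mean(fluid.layers.square_error_cost(input=pred, label=y))
fluid.optimizer.SGD(learning_rate=0.01).minimize(loss)

place = fluid.CPUPlace()                       # run everything on CPU
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())

for _ in range(10):
    x_data = np.random.rand(32, 13).astype("float32")
    y_data = np.random.rand(32, 1).astype("float32")
    exe.run(fluid.default_main_program(),
            feed={"x": x_data, "y": y_data},
            fetch_list=[loss])
```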
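For item 10, the goal is that existing CPU inference code keeps working unchanged while MKL-DNN kernels are selected automatically on supported hardware. The sketch below uses the existing fluid.io.load_inference_model API; the model directory name is hypothetical, and nothing MKL-DNN-specific appears in user code.

```python
import numpy as np
import paddle.fluid as fluid

place = fluid.CPUPlace()                       # plain CPU deployment target
exe = fluid.Executor(place)

# "fit_a_line.model" is a hypothetical directory holding a model saved with
# fluid.io.save_inference_model; no MKL-DNN configuration is needed here.
[infer_prog, feed_names, fetch_targets] = fluid.io.load_inference_model(
    "fit_a_line.model", exe)

x_data = np.random.rand(1, 13).astype("float32")   # one dummy sample
results = exe.run(infer_prog,
                  feed={feed_names[0]: x_data},
                  fetch_list=fetch_targets)
print(results[0])   # MKL-DNN kernels, if built in, should be used transparently
```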

Any suggestions are welcome!
