This is an issue with our NLP online service: when running inference, memory usage stays at about 6 GB, which is considerably larger than actually needed.
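To back up the ~6 GB figure, a minimal Python sketch using the standard-library `resource` module can report the process's peak resident set size around an inference call (this is a diagnostic sketch, not part of the service; it assumes a Linux host, where `ru_maxrss` is reported in kilobytes):

```python
import resource


def peak_rss_gib() -> float:
    """Return this process's peak resident set size in GiB.

    Note: on Linux, ru_maxrss is in kilobytes; on macOS it is in bytes,
    so the conversion below would need adjusting there.
    """
    ru_maxrss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return ru_maxrss_kb / (1024 ** 2)


# Call this before and after an inference request to see how much
# memory the model run actually accounts for.
print(f"Peak RSS so far: {peak_rss_gib():.2f} GiB")
```

Logging this value before and after a request helps distinguish memory genuinely needed by the model from memory retained by caches or fragmentation.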