I'm running the lemmatizer that's part of swe-pipeline on a very limited online host. It only gives me 500 MB of RAM, into which I have to cram all my NLP work.
Here's a small test script that just loads the lemmatizer into memory and uses psutil to measure the memory used:
import os
import psutil

def memory_usage_psutil():
    # Return the resident set size (RSS) of this process in MB
    process = psutil.Process(os.getpid())
    return process.memory_info().rss / float(2 ** 20)

if __name__ == '__main__':
    print("Base memory usage: %.2f MB" % memory_usage_psutil())
    import lemmatize
    lemmatizer = lemmatize.SUCLemmatizer()
    lemmatizer.load('swe-pipeline/suc-saldo.lemmas')
    print("Lemmatize memory usage: %.2f MB" % memory_usage_psutil())
To reproduce, save the script in the efselab root directory, install psutil with pip install psutil, and run it:
(efselab) ~/Projects/efselab $ python test.py
Base memory usage: 9.10 MB
Lemmatize memory usage: 492.65 MB
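As a sanity check on the psutil numbers, the same figure can be cross-checked with the standard library's resource module, which reports the peak RSS without any third-party dependency. This is a minimal sketch, not part of the original test script; note that ru_maxrss is in kilobytes on Linux but in bytes on macOS:

```python
import sys
import resource

def peak_memory_mb():
    # ru_maxrss: peak resident set size of this process
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes, macOS reports bytes
    divisor = 1024 ** 2 if sys.platform == 'darwin' else 1024
    return peak / float(divisor)

if __name__ == '__main__':
    print("Peak memory usage: %.2f MB" % peak_memory_mb())
```

Because ru_maxrss is a high-water mark rather than the current usage, it is a good fit here: it captures any transient allocation spike during lemmatizer loading that a single psutil snapshot taken afterwards might miss.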