
Conversation

Contributor

driade commented May 25, 2017

Unsetting the variable frees the memory on each pass through the loop. Really useful for large datasets.

I've run the following code as a test for the issue (PHP 5.6).

User::truncate();

foreach (range(1, 100000) as $i) {
    User::create([
        'email'    => uniqid(),
        'name'     => 'foo',
        'password' => 'foo',
    ]);
}

User::chunk(5000, function ($users) {

    echo round(memory_get_peak_usage() / 1024 / 1024) . "\n";

});

Peak without this commit: 39 MB
Peak with this commit: 29 MB

If this is right, I think there could be more candidates for the same optimization. Then again, maybe it's just a micro-optimization, since the memory is only wasted for a short period of time within the loop.
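The effect can be reproduced outside Laravel. Here is a minimal plain-PHP sketch (the helper names are mine, not from the PR): reassigning a large variable keeps two copies alive at the peak, while unsetting it first releases the old copy before the new one is built.

```php
<?php

// Builds a new large array on every iteration without releasing the old one
// first: while array_fill() constructs the new value, the previous $chunk is
// still referenced, so two copies coexist at the peak.
function peakWithoutUnset(): int
{
    $chunk = [];
    foreach (range(1, 5) as $i) {
        $chunk = array_fill(0, 100000, $i);
    }
    return memory_get_peak_usage();
}

// Same loop, but the old chunk is dropped before the next one is built, so
// only one copy is alive at any moment.
function peakWithUnset(): int
{
    $chunk = [];
    foreach (range(1, 5) as $i) {
        unset($chunk);
        $chunk = array_fill(0, 100000, $i);
    }
    return memory_get_peak_usage();
}

// Run the unset variant first: memory_get_peak_usage() is monotonic within a
// process, so the order of the two calls matters for the comparison.
$lower  = peakWithUnset();
$higher = peakWithoutUnset();
echo round($lower / 1024 / 1024) . " MB vs " . round($higher / 1024 / 1024) . " MB\n";
```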

Thanks.

Contributor

vlakoff commented May 30, 2017

As the variable is reassigned, wouldn't just letting the GC do its job achieve better execution time?

I mean, the memory is not wasted if there is enough memory available; it's there to be used and to reduce collection cycles.

Contributor Author

driade commented May 30, 2017

As far as I understand, there is a moment in the function's lifetime when two instances of the collection $results exist: after one iteration of the loop ends and before the reassignment on the next one, and likewise on every subsequent iteration.

You can try the example above, or any other with the same structure, with big collections or variables, which makes the difference appear clearly.

I've found this article as an example: http://blog.preinheimer.com/index.php?/archives/354-Memory-usage-in-PHP.html

Contributor

vlakoff commented May 30, 2017

Yes, I noticed that too: just before $results is reassigned, both result sets are reachable, so the peak size is doubled. I guess this does make a difference in some situations, including yours, so the code change seems justified.

Thanks for the article link, I got inspired by the sentence "This isn’t critical, except when it is."
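For context, the shape of the change can be sketched in plain PHP (a hypothetical fetchPage() stands in for the query here; this is not Laravel's actual Builder::chunk): the unset() releases the previous page before the next one is fetched, so the two pages never coexist.

```php
<?php

// Hypothetical stand-in for fetching one "page" of a large result set;
// returns an empty array once the data is exhausted.
function fetchPage(int $page, int $size): array
{
    return $page < 3 ? array_fill(0, $size, $page) : [];
}

// Chunk-style loop. Without the unset(), the just-processed page would still
// be referenced while the next one is built, briefly doubling the footprint.
function processInChunks(int $size, callable $callback): void
{
    $page = 0;
    $results = fetchPage($page, $size);

    while (count($results) > 0) {
        $callback($results);

        unset($results);                       // release the old page first...
        $results = fetchPage(++$page, $size);  // ...then fetch the next one
    }
}

processInChunks(100000, function (array $rows) {
    echo count($rows) . " rows, peak " . round(memory_get_peak_usage() / 1024 / 1024) . " MB\n";
});
```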

Contributor Author

driade commented May 30, 2017

I'd add a note about why this is justified. In other situations it could be a micro-optimization, but in this case the function is clearly expected to handle a lot of memory... after all, that's the reason someone would use chunk() in the first place ;)
