728640 bytes of page_cgroup At the same time, system performance for various workloads is unaffected: 100G sparse file cat, 4G physical memory, 10 runs, to test for code bloat in the traditional LRU handling and kswapd & direct reclaim paths, without/with the memory controller configured in vanilla: 71.603(0.207) seconds patched: 71.640(0.156) seconds vanilla: 79.558(0.288) seconds patched: 77.233(0.147) seconds 100G sparse file cat in 1G memory cgroup, 10 runs, to test for code bloat in the traditional memory cgroup LRU handling and reclaim path vanilla: 96.844(0.281) seconds patched: 94.454(0.311) seconds 4 unlimited memcgs running kbuild -j32 each, 4G physical memory, 500M swap on SSD, 10 runs, to test for regressions in kswapd & direct reclaim using per-memcg LRU lists with multiple memcgs and multiple allocators within each memcg vanilla: 717.722(1.440) seconds [ 69720.100(11600.835) majfaults ] patched: 714.106(2.313) seconds [ 71109.300(14886.186) majfaults ] 16 unlimited memcgs running kbuild, 1900M hierarchical limit, 500M swap on SSD, 10 runs, to test for regressions in hierarchical memcg setups vanilla: 2742.058(1.992) seconds [ 26479.600(1736.737) majfaults ] patched: 2743.267(1.214) seconds [ 27240.700(1076.063) majfaults ] This patch: There are currently two different implementations of iterating over a memory cgroup hierarchy tree. Consolidate them into one worker function and base the convenience looping-macros on top of it. Signed-off-by: Johannes Weiner Reviewed-by: KAMEZAWA Hiroyuki Reviewed-by: Michal Hocko Reviewed-by: Kirill A. Shutemov Cc: Daisuke Nishimura Cc: Balbir Singh Cc: Ying Han Cc: Greg Thelen Cc: Michel Lespinasse Cc: Rik van Riel Cc: Minchan Kim Cc: Christoph Hellwig Cc: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds µ†fÈ•ˆ