Discussion:
[PATCH 0/4] (CMA_AGGRESSIVE) Make CMA memory be more aggressive about allocation
Hui Zhu
2014-10-16 03:35:47 UTC
Permalink
In the fallbacks table of page_alloc.c, MIGRATE_CMA is the fallback of
MIGRATE_MOVABLE.
MIGRATE_MOVABLE will use MIGRATE_CMA when it doesn't have a free page of
the order the kernel wants.

On a system that runs a lot of user-space programs, for instance an
Android board, most memory is in MIGRATE_MOVABLE and already allocated.
Before __rmqueue_fallback can get memory from MIGRATE_CMA, the oom_killer
will kill a task to release memory when the kernel wants MIGRATE_UNMOVABLE
memory, because the fallbacks of MIGRATE_UNMOVABLE are MIGRATE_RECLAIMABLE
and MIGRATE_MOVABLE.
This state is odd: MIGRATE_CMA has a lot of free memory, but the kernel
kills tasks to release memory.

This patch series adds a new option, CMA_AGGRESSIVE, to make CMA memory
more aggressive about allocation.
If CMA_AGGRESSIVE is enabled, when the kernel calls __rmqueue to get pages
from MIGRATE_MOVABLE and the conditions allow, MIGRATE_CMA will be
allocated as MIGRATE_MOVABLE first. If MIGRATE_CMA doesn't have enough
pages for the allocation, it goes back to allocating memory from
MIGRATE_MOVABLE.
Then the memory of MIGRATE_MOVABLE can be kept for MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE, which do not have the MIGRATE_CMA fallback.
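
To make the decision order concrete, here is a small self-contained sketch
in user-space C (an illustration only, not the kernel diff; the names mirror
the patches later in this thread, and the numbers are invented for the
example):

/* Illustration of the CMA_AGGRESSIVE decision added to __rmqueue().
 * Plain user-space C so it can be compiled and run standalone. */
#include <stdio.h>
#include <stdbool.h>

enum migratetype { MIGRATE_UNMOVABLE, MIGRATE_MOVABLE, MIGRATE_CMA };

static bool cma_aggressive_switch = true;  /* sysctl vm.cma-aggressive-switch */
static long cma_alloc_counter;             /* > 0 while cma_alloc() runs */
static long cma_aggressive_free_min = 500; /* sysctl vm.cma-aggressive-free-min */
static long nr_free_cma_pages = 2048;      /* NR_FREE_CMA_PAGES in the kernel */

/* Decide which free list __rmqueue() should try first. */
static enum migratetype pick_first(enum migratetype requested, unsigned int order)
{
	if (cma_aggressive_switch &&
	    requested == MIGRATE_MOVABLE &&
	    cma_alloc_counter == 0 &&
	    nr_free_cma_pages > cma_aggressive_free_min + (1L << order))
		return MIGRATE_CMA;   /* serve movable requests from CMA first */
	return requested;             /* otherwise keep the old behaviour */
}

int main(void)
{
	printf("order-0 movable request is tried from %s first\n",
	       pick_first(MIGRATE_MOVABLE, 0) == MIGRATE_CMA ?
	       "MIGRATE_CMA" : "MIGRATE_MOVABLE");
	return 0;
}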

Pavel Machek
2014-10-18 22:15:25 UTC
Permalink
Hi!
Post by Hui Zhu
Add a CMA_AGGRESSIVE config option that depends on CMA to the Linux kernel config.
Add CMA_AGGRESSIVE_PHY_MAX, CMA_AGGRESSIVE_FREE_MIN and CMA_AGGRESSIVE_SHRINK,
which depend on CMA_AGGRESSIVE.
If the physical memory size (not including CMA memory) in bytes is less than or
equal to CMA_AGGRESSIVE_PHY_MAX, the CMA aggressive switch (sysctl
vm.cma-aggressive-switch) will be turned on.
Ok...

Do I understand it correctly that there is some problem with
hibernation not working on machines with big CMA areas...?

But adding 4 config options the end user has no chance to set right can
not be the best solution, can it?
Post by Hui Zhu
+config CMA_AGGRESSIVE_PHY_MAX
+ hex "Physical memory size in bytes that automatically turns on the CMA aggressive switch"
+ depends on CMA_AGGRESSIVE
+ default 0x40000000
+ help
+ If the physical memory size (not including CMA memory) in bytes is less
+ than or equal to this value, the CMA aggressive switch will be turned on.
+ After Linux boots, the sysctl "vm.cma-aggressive-switch" can control
+ the CMA AGGRESSIVE switch.
For example... how am I expected to figure out the right value to place here?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
朱辉
2014-10-22 05:44:24 UTC
Permalink
Post by Pavel Machek
Hi!
Post by Hui Zhu
Add a CMA_AGGRESSIVE config option that depends on CMA to the Linux kernel config.
Add CMA_AGGRESSIVE_PHY_MAX, CMA_AGGRESSIVE_FREE_MIN and CMA_AGGRESSIVE_SHRINK,
which depend on CMA_AGGRESSIVE.
If the physical memory size (not including CMA memory) in bytes is less than or
equal to CMA_AGGRESSIVE_PHY_MAX, the CMA aggressive switch (sysctl
vm.cma-aggressive-switch) will be turned on.
Ok...
Do I understand it correctly that there is some problem with
hibernation not working on machines with big CMA areas...?
No, these patches address the issue that most of the CMA memory is still
not allocated when lowmemorykiller or oom_killer begins to kill tasks.
Post by Pavel Machek
But adding 4 config options the end user has no chance to set right can
not be the best solution, can it?
Post by Hui Zhu
+config CMA_AGGRESSIVE_PHY_MAX
+ hex "Physical memory size in bytes that automatically turns on the CMA aggressive switch"
+ depends on CMA_AGGRESSIVE
+ default 0x40000000
+ help
+ If the physical memory size (not including CMA memory) in bytes is less
+ than or equal to this value, the CMA aggressive switch will be turned on.
+ After Linux boots, the sysctl "vm.cma-aggressive-switch" can control
+ the CMA AGGRESSIVE switch.
For example... how am I expected to figure out the right value to place here?
I agree with that. I will update this config to be set automatically in the next version.

Thanks,
Hui
Post by Pavel Machek
Pavel
Hui Zhu
2014-10-16 03:35:49 UTC
Permalink
Function shrink_all_memory tries to free `nr_to_reclaim' pages of memory.
The CMA_AGGRESSIVE_SHRINK function will call this function to free
`nr_to_reclaim' pages of memory. It needs a different scan_control from the
current caller, hibernate_preallocate_memory.

If `hibernation' is true, the caller is hibernate_preallocate_memory.
If not, the caller is the CMA allocation function.

Signed-off-by: Hui Zhu <***@xiaomi.com>
---
include/linux/swap.h | 3 ++-
kernel/power/snapshot.c | 2 +-
mm/vmscan.c | 19 +++++++++++++------
3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 37a585b..9f2cb43 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -335,7 +335,8 @@ extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
gfp_t gfp_mask, bool noswap,
struct zone *zone,
unsigned long *nr_scanned);
-extern unsigned long shrink_all_memory(unsigned long nr_pages);
+extern unsigned long shrink_all_memory(unsigned long nr_pages,
+ bool hibernation);
extern int vm_swappiness;
extern int remove_mapping(struct address_space *mapping, struct page *page);
extern unsigned long vm_total_pages;
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 791a618..a00fc35 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -1657,7 +1657,7 @@ int hibernate_preallocate_memory(void)
* NOTE: If this is not done, performance will be hurt badly in some
* test cases.
*/
- shrink_all_memory(saveable - size);
+ shrink_all_memory(saveable - size, true);

/*
* The number of saveable pages in memory was too high, so apply some
diff --git a/mm/vmscan.c b/mm/vmscan.c
index dcb4707..fdcfa30 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3404,7 +3404,7 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
wake_up_interruptible(&pgdat->kswapd_wait);
}

-#ifdef CONFIG_HIBERNATION
+#if defined CONFIG_HIBERNATION || defined CONFIG_CMA_AGGRESSIVE
/*
* Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
* freed pages.
@@ -3413,22 +3413,29 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
* LRU order by reclaiming preferentially
* inactive > active > active referenced > active mapped
*/
-unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
+unsigned long shrink_all_memory(unsigned long nr_to_reclaim, bool hibernation)
{
struct reclaim_state reclaim_state;
struct scan_control sc = {
.nr_to_reclaim = nr_to_reclaim,
- .gfp_mask = GFP_HIGHUSER_MOVABLE,
.priority = DEF_PRIORITY,
- .may_writepage = 1,
.may_unmap = 1,
.may_swap = 1,
- .hibernation_mode = 1,
};
struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
struct task_struct *p = current;
unsigned long nr_reclaimed;

+ if (hibernation) {
+ sc.hibernation_mode = 1;
+ sc.may_writepage = 1;
+ sc.gfp_mask = GFP_HIGHUSER_MOVABLE;
+ } else {
+ sc.hibernation_mode = 0;
+ sc.may_writepage = !laptop_mode;
+ sc.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_HIGHMEM;
+ }
+
p->flags |= PF_MEMALLOC;
lockdep_set_current_reclaim_state(sc.gfp_mask);
reclaim_state.reclaimed_slab = 0;
@@ -3442,7 +3449,7 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)

return nr_reclaimed;
}
-#endif /* CONFIG_HIBERNATION */
+#endif /* CONFIG_HIBERNATION || CONFIG_CMA_AGGRESSIVE */

/* It's optimal to keep kswapds on the same CPUs as their memory, but
not required for correctness. So if the last cpu in a node goes
--
1.9.1
Rafael J. Wysocki
2014-10-16 08:45:21 UTC
Permalink
[CC list trimmed]
Post by Hui Zhu
Function shrink_all_memory tries to free `nr_to_reclaim' pages of memory.
The CMA_AGGRESSIVE_SHRINK function will call this function to free
`nr_to_reclaim' pages of memory. It needs a different scan_control from the
current caller, hibernate_preallocate_memory.
If `hibernation' is true, the caller is hibernate_preallocate_memory.
If not, the caller is the CMA allocation function.
---
include/linux/swap.h | 3 ++-
kernel/power/snapshot.c | 2 +-
mm/vmscan.c | 19 +++++++++++++------
3 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 37a585b..9f2cb43 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -335,7 +335,8 @@ extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
gfp_t gfp_mask, bool noswap,
struct zone *zone,
unsigned long *nr_scanned);
-extern unsigned long shrink_all_memory(unsigned long nr_pages);
+extern unsigned long shrink_all_memory(unsigned long nr_pages,
+ bool hibernation);
extern int vm_swappiness;
extern int remove_mapping(struct address_space *mapping, struct page *page);
extern unsigned long vm_total_pages;
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 791a618..a00fc35 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -1657,7 +1657,7 @@ int hibernate_preallocate_memory(void)
* NOTE: If this is not done, performance will be hurt badly in some
* test cases.
*/
- shrink_all_memory(saveable - size);
+ shrink_all_memory(saveable - size, true);
Instead of doing this, can you please define

__shrink_all_memory()

that will take the appropriate struct scan_control as an argument and
then define two wrappers around that, one for hibernation and one for CMA?

The way you did it opens a field for bugs caused by passing a wrong value
as the second argument.
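
For illustration, a condensed user-space sketch of the structure being asked
for: one worker that consumes a prepared scan_control, plus one thin wrapper
per caller (the field values here are placeholders; the real settings are the
ones in the patch above):

#include <stdio.h>

struct scan_control {
	unsigned long nr_to_reclaim;
	int hibernation_mode;
	int may_writepage;
};

/* Common worker: consumes a fully initialized scan_control. */
static unsigned long __shrink_all_memory(const struct scan_control *sc)
{
	printf("reclaim %lu pages (hibernation_mode=%d)\n",
	       sc->nr_to_reclaim, sc->hibernation_mode);
	return sc->nr_to_reclaim;	/* pretend everything was reclaimed */
}

/* Wrapper for the hibernation caller; keeps the old interface. */
static unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
{
	struct scan_control sc = {
		.nr_to_reclaim = nr_to_reclaim,
		.hibernation_mode = 1,
		.may_writepage = 1,
	};
	return __shrink_all_memory(&sc);
}

/* Wrapper for the CMA caller, carrying its own reclaim parameters. */
static unsigned long shrink_all_memory_for_cma(unsigned long nr_to_reclaim)
{
	struct scan_control sc = {
		.nr_to_reclaim = nr_to_reclaim,
		.hibernation_mode = 0,
		.may_writepage = 0,
	};
	return __shrink_all_memory(&sc);
}

int main(void)
{
	shrink_all_memory(128);
	shrink_all_memory_for_cma(64);
	return 0;
}

With this shape, each caller can only get its own scan_control, so there is no
flag argument to pass wrongly.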
Post by Hui Zhu
/*
* The number of saveable pages in memory was too high, so apply some
diff --git a/mm/vmscan.c b/mm/vmscan.c
index dcb4707..fdcfa30 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3404,7 +3404,7 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
wake_up_interruptible(&pgdat->kswapd_wait);
}
-#ifdef CONFIG_HIBERNATION
+#if defined CONFIG_HIBERNATION || defined CONFIG_CMA_AGGRESSIVE
/*
* Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
* freed pages.
@@ -3413,22 +3413,29 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
* LRU order by reclaiming preferentially
* inactive > active > active referenced > active mapped
*/
-unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
+unsigned long shrink_all_memory(unsigned long nr_to_reclaim, bool hibernation)
{
struct reclaim_state reclaim_state;
struct scan_control sc = {
.nr_to_reclaim = nr_to_reclaim,
- .gfp_mask = GFP_HIGHUSER_MOVABLE,
.priority = DEF_PRIORITY,
- .may_writepage = 1,
.may_unmap = 1,
.may_swap = 1,
- .hibernation_mode = 1,
};
struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
struct task_struct *p = current;
unsigned long nr_reclaimed;
+ if (hibernation) {
+ sc.hibernation_mode = 1;
+ sc.may_writepage = 1;
+ sc.gfp_mask = GFP_HIGHUSER_MOVABLE;
+ } else {
+ sc.hibernation_mode = 0;
+ sc.may_writepage = !laptop_mode;
+ sc.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_HIGHMEM;
+ }
+
p->flags |= PF_MEMALLOC;
lockdep_set_current_reclaim_state(sc.gfp_mask);
reclaim_state.reclaimed_slab = 0;
@@ -3442,7 +3449,7 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
return nr_reclaimed;
}
-#endif /* CONFIG_HIBERNATION */
+#endif /* CONFIG_HIBERNATION || CONFIG_CMA_AGGRESSIVE */
/* It's optimal to keep kswapds on the same CPUs as their memory, but
not required for correctness. So if the last cpu in a node goes
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

朱辉
2014-10-17 06:18:39 UTC
Permalink
Post by Rafael J. Wysocki
[CC list trimmed]
Post by Hui Zhu
Function shrink_all_memory tries to free `nr_to_reclaim' pages of memory.
The CMA_AGGRESSIVE_SHRINK function will call this function to free
`nr_to_reclaim' pages of memory. It needs a different scan_control from the
current caller, hibernate_preallocate_memory.
If `hibernation' is true, the caller is hibernate_preallocate_memory.
If not, the caller is the CMA allocation function.
---
include/linux/swap.h | 3 ++-
kernel/power/snapshot.c | 2 +-
mm/vmscan.c | 19 +++++++++++++------
3 files changed, 16 insertions(+), 8 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 37a585b..9f2cb43 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -335,7 +335,8 @@ extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
gfp_t gfp_mask, bool noswap,
struct zone *zone,
unsigned long *nr_scanned);
-extern unsigned long shrink_all_memory(unsigned long nr_pages);
+extern unsigned long shrink_all_memory(unsigned long nr_pages,
+ bool hibernation);
extern int vm_swappiness;
extern int remove_mapping(struct address_space *mapping, struct page *page);
extern unsigned long vm_total_pages;
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 791a618..a00fc35 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -1657,7 +1657,7 @@ int hibernate_preallocate_memory(void)
* NOTE: If this is not done, performance will be hurt badly in some
* test cases.
*/
- shrink_all_memory(saveable - size);
+ shrink_all_memory(saveable - size, true);
Instead of doing this, can you please define
__shrink_all_memory()
that will take the appropriate struct scan_control as an argument and
then define two wrappers around that, one for hibernation and one for CMA?
The way you did it opens a field for bugs caused by passing a wrong value
as the second argument.
Thanks, Rafael.
I will update the patch according to your comments.

Best,
Hui
Post by Rafael J. Wysocki
Post by Hui Zhu
/*
* The number of saveable pages in memory was too high, so apply some
diff --git a/mm/vmscan.c b/mm/vmscan.c
index dcb4707..fdcfa30 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3404,7 +3404,7 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
wake_up_interruptible(&pgdat->kswapd_wait);
}
-#ifdef CONFIG_HIBERNATION
+#if defined CONFIG_HIBERNATION || defined CONFIG_CMA_AGGRESSIVE
/*
* Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
* freed pages.
@@ -3413,22 +3413,29 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
* LRU order by reclaiming preferentially
* inactive > active > active referenced > active mapped
*/
-unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
+unsigned long shrink_all_memory(unsigned long nr_to_reclaim, bool hibernation)
{
struct reclaim_state reclaim_state;
struct scan_control sc = {
.nr_to_reclaim = nr_to_reclaim,
- .gfp_mask = GFP_HIGHUSER_MOVABLE,
.priority = DEF_PRIORITY,
- .may_writepage = 1,
.may_unmap = 1,
.may_swap = 1,
- .hibernation_mode = 1,
};
struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
struct task_struct *p = current;
unsigned long nr_reclaimed;
+ if (hibernation) {
+ sc.hibernation_mode = 1;
+ sc.may_writepage = 1;
+ sc.gfp_mask = GFP_HIGHUSER_MOVABLE;
+ } else {
+ sc.hibernation_mode = 0;
+ sc.may_writepage = !laptop_mode;
+ sc.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_HIGHMEM;
+ }
+
p->flags |= PF_MEMALLOC;
lockdep_set_current_reclaim_state(sc.gfp_mask);
reclaim_state.reclaimed_slab = 0;
@@ -3442,7 +3449,7 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
return nr_reclaimed;
}
-#endif /* CONFIG_HIBERNATION */
+#endif /* CONFIG_HIBERNATION || CONFIG_CMA_AGGRESSIVE */
/* It's optimal to keep kswapds on the same CPUs as their memory, but
not required for correctness. So if the last cpu in a node goes
Hui Zhu
2014-10-17 09:28:04 UTC
Permalink
Update this patch according to the comments from Rafael.

Function shrink_all_memory_for_cma tries to free `nr_to_reclaim' pages of
memory. The CMA aggressive shrink function will call this function to free
`nr_to_reclaim' pages of memory.

Signed-off-by: Hui Zhu <***@xiaomi.com>
---
mm/vmscan.c | 58 +++++++++++++++++++++++++++++++++++++++++++---------------
1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index dcb4707..658dc8d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3404,6 +3404,28 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
wake_up_interruptible(&pgdat->kswapd_wait);
}

+#if defined CONFIG_HIBERNATION || defined CONFIG_CMA_AGGRESSIVE
+static unsigned long __shrink_all_memory(struct scan_control *sc)
+{
+ struct reclaim_state reclaim_state;
+ struct zonelist *zonelist = node_zonelist(numa_node_id(), sc->gfp_mask);
+ struct task_struct *p = current;
+ unsigned long nr_reclaimed;
+
+ p->flags |= PF_MEMALLOC;
+ lockdep_set_current_reclaim_state(sc->gfp_mask);
+ reclaim_state.reclaimed_slab = 0;
+ p->reclaim_state = &reclaim_state;
+
+ nr_reclaimed = do_try_to_free_pages(zonelist, sc);
+
+ p->reclaim_state = NULL;
+ lockdep_clear_current_reclaim_state();
+ p->flags &= ~PF_MEMALLOC;
+
+ return nr_reclaimed;
+}
+
#ifdef CONFIG_HIBERNATION
/*
* Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
@@ -3415,7 +3437,6 @@ void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx)
*/
unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
{
- struct reclaim_state reclaim_state;
struct scan_control sc = {
.nr_to_reclaim = nr_to_reclaim,
.gfp_mask = GFP_HIGHUSER_MOVABLE,
@@ -3425,24 +3446,31 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
.may_swap = 1,
.hibernation_mode = 1,
};
- struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
- struct task_struct *p = current;
- unsigned long nr_reclaimed;
-
- p->flags |= PF_MEMALLOC;
- lockdep_set_current_reclaim_state(sc.gfp_mask);
- reclaim_state.reclaimed_slab = 0;
- p->reclaim_state = &reclaim_state;

- nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
+ return __shrink_all_memory(&sc);
+}
+#endif /* CONFIG_HIBERNATION */

- p->reclaim_state = NULL;
- lockdep_clear_current_reclaim_state();
- p->flags &= ~PF_MEMALLOC;
+#ifdef CONFIG_CMA_AGGRESSIVE
+/*
+ * Try to free `nr_to_reclaim' of memory, system-wide, for CMA aggressive
+ * shrink function.
+ */
+void shrink_all_memory_for_cma(unsigned long nr_to_reclaim)
+{
+ struct scan_control sc = {
+ .nr_to_reclaim = nr_to_reclaim,
+ .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_HIGHMEM,
+ .priority = DEF_PRIORITY,
+ .may_writepage = !laptop_mode,
+ .may_unmap = 1,
+ .may_swap = 1,
+ };

- return nr_reclaimed;
+ __shrink_all_memory(&sc);
}
-#endif /* CONFIG_HIBERNATION */
+#endif /* CONFIG_CMA_AGGRESSIVE */
+#endif /* CONFIG_HIBERNATION || CONFIG_CMA_AGGRESSIVE */

/* It's optimal to keep kswapds on the same CPUs as their memory, but
not required for correctness. So if the last cpu in a node goes
--
1.9.1

PINTU KUMAR
2014-10-18 04:50:08 UTC
Permalink
Hi,



----- Original Message -----
Sent: Friday, 17 October 2014 2:58 PM
Subject: [PATCH v2 2/4] (CMA_AGGRESSIVE) Add new function shrink_all_memory_for_cma
Update this patch according to the comments from Rafael.
Function shrink_all_memory_for_cma tries to free `nr_to_reclaim' pages of
memory. The CMA aggressive shrink function will call this function to free
`nr_to_reclaim' pages of memory.
Instead, we could simply have shrink_cma_memory(nr_to_reclaim).
Some time back I already proposed to have shrink_memory for CMA here:
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-January/143103.html

Now, I am working on another solution that uses shrink_all_memory().
This can be helpful even for non-CMA cases as well, to bring back
higher-order pages quickly. I will post the patches by next week.
---
mm/vmscan.c | 58 +++++++++++++++++++++++++++++++++++++++++++---------------
1 file changed, 43 insertions(+), 15 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index dcb4707..658dc8d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3404,6 +3404,28 @@ void wakeup_kswapd(struct zone *zone, int order, enum
zone_type classzone_idx)
wake_up_interruptible(&pgdat->kswapd_wait);
}
+#if defined CONFIG_HIBERNATION || defined CONFIG_CMA_AGGRESSIVE
+static unsigned long __shrink_all_memory(struct scan_control *sc)
+{
+ struct reclaim_state reclaim_state;
+ struct zonelist *zonelist = node_zonelist(numa_node_id(), sc->gfp_mask);
+ struct task_struct *p = current;
+ unsigned long nr_reclaimed;
+
+ p->flags |= PF_MEMALLOC;
+ lockdep_set_current_reclaim_state(sc->gfp_mask);
+ reclaim_state.reclaimed_slab = 0;
+ p->reclaim_state = &reclaim_state;
+
+ nr_reclaimed = do_try_to_free_pages(zonelist, sc);
+
+ p->reclaim_state = NULL;
+ lockdep_clear_current_reclaim_state();
+ p->flags &= ~PF_MEMALLOC;
+
+ return nr_reclaimed;
+}
+
#ifdef CONFIG_HIBERNATION
/*
* Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
@@ -3415,7 +3437,6 @@ void wakeup_kswapd(struct zone *zone, int order, enum
zone_type classzone_idx)
*/
unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
{
- struct reclaim_state reclaim_state;
struct scan_control sc = {
.nr_to_reclaim = nr_to_reclaim,
.gfp_mask = GFP_HIGHUSER_MOVABLE,
@@ -3425,24 +3446,31 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
.may_swap = 1,
.hibernation_mode = 1,
};
- struct zonelist *zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
- struct task_struct *p = current;
- unsigned long nr_reclaimed;
-
- p->flags |= PF_MEMALLOC;
- lockdep_set_current_reclaim_state(sc.gfp_mask);
- reclaim_state.reclaimed_slab = 0;
- p->reclaim_state = &reclaim_state;
- nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
+ return __shrink_all_memory(&sc);
+}
+#endif /* CONFIG_HIBERNATION */
- p->reclaim_state = NULL;
- lockdep_clear_current_reclaim_state();
- p->flags &= ~PF_MEMALLOC;
+#ifdef CONFIG_CMA_AGGRESSIVE
+/*
+ * Try to free `nr_to_reclaim' of memory, system-wide, for CMA aggressive
+ * shrink function.
+ */
+void shrink_all_memory_for_cma(unsigned long nr_to_reclaim)
+{
+ struct scan_control sc = {
+ .nr_to_reclaim = nr_to_reclaim,
+ .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_HIGHMEM,
+ .priority = DEF_PRIORITY,
+ .may_writepage = !laptop_mode,
+ .may_unmap = 1,
+ .may_swap = 1,
+ };
- return nr_reclaimed;
+ __shrink_all_memory(&sc);
}
-#endif /* CONFIG_HIBERNATION */
+#endif /* CONFIG_CMA_AGGRESSIVE */
+#endif /* CONFIG_HIBERNATION || CONFIG_CMA_AGGRESSIVE */
/* It's optimal to keep kswapds on the same CPUs as their memory, but
not required for correctness. So if the last cpu in a node goes
--
1.9.1
Hui Zhu
2014-10-16 03:35:50 UTC
Permalink
Add cma_alloc_counter, cma_aggressive_switch, cma_aggressive_free_min and
cma_aggressive_shrink_switch.

cma_aggressive_switch is the switch for all CMA_AGGRESSIVE functionality. It
can be controlled by sysctl vm.cma-aggressive-switch.

cma_aggressive_free_min can be controlled by sysctl
"vm.cma-aggressive-free-min". If the number of CMA free pages is smaller than
this sysctl value, CMA_AGGRESSIVE will not work in the page allocation code.

cma_aggressive_shrink_switch can be controlled by sysctl
"vm.cma-aggressive-shrink-switch". If sysctl "vm.cma-aggressive-shrink-switch"
is true and the free normal memory size is smaller than the size to be
allocated, a memory shrink with function shrink_all_memory is done before a
driver allocates pages from CMA.

When the Linux kernel tries to reserve a custom contiguous area, the value of
cma_alloc_counter is increased, and CMA_AGGRESSIVE will not work in the page
allocation code. After the reserve function returns, the value of
cma_alloc_counter is decreased.

Signed-off-by: Hui Zhu <***@xiaomi.com>
---
include/linux/cma.h | 7 +++++++
kernel/sysctl.c | 27 +++++++++++++++++++++++++++
mm/cma.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 88 insertions(+)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 0430ed0..df96abf 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -15,6 +15,13 @@

struct cma;

+#ifdef CONFIG_CMA_AGGRESSIVE
+extern atomic_t cma_alloc_counter;
+extern int cma_aggressive_switch;
+extern unsigned long cma_aggressive_free_min;
+extern int cma_aggressive_shrink_switch;
+#endif
+
extern phys_addr_t cma_get_base(struct cma *cma);
extern unsigned long cma_get_size(struct cma *cma);

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4aada6d..646929e2 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -92,6 +92,10 @@
#include <linux/nmi.h>
#endif

+#ifdef CONFIG_CMA_AGGRESSIVE
+#include <linux/cma.h>
+#endif
+

#if defined(CONFIG_SYSCTL)

@@ -1485,6 +1489,29 @@ static struct ctl_table vm_table[] = {
.mode = 0644,
.proc_handler = proc_doulongvec_minmax,
},
+#ifdef CONFIG_CMA_AGGRESSIVE
+ {
+ .procname = "cma-aggressive-switch",
+ .data = &cma_aggressive_switch,
+ .maxlen = sizeof(int),
+ .mode = 0600,
+ .proc_handler = proc_dointvec,
+ },
+ {
+ .procname = "cma-aggressive-free-min",
+ .data = &cma_aggressive_free_min,
+ .maxlen = sizeof(unsigned long),
+ .mode = 0600,
+ .proc_handler = proc_doulongvec_minmax,
+ },
+ {
+ .procname = "cma-aggressive-shrink-switch",
+ .data = &cma_aggressive_shrink_switch,
+ .maxlen = sizeof(int),
+ .mode = 0600,
+ .proc_handler = proc_dointvec,
+ },
+#endif
{ }
};

diff --git a/mm/cma.c b/mm/cma.c
index 963bc4a..566ed5f 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -33,6 +33,7 @@
#include <linux/log2.h>
#include <linux/cma.h>
#include <linux/highmem.h>
+#include <linux/swap.h>

struct cma {
unsigned long base_pfn;
@@ -127,6 +128,27 @@ err:
return -EINVAL;
}

+#ifdef CONFIG_CMA_AGGRESSIVE
+/* The counter for the dma_alloc_from_contiguous and
+ dma_release_from_contiguous. */
+atomic_t cma_alloc_counter = ATOMIC_INIT(0);
+
+/* Switch of CMA_AGGRESSIVE. */
+int cma_aggressive_switch __read_mostly;
+
+/* If the number of CMA free pages is smaller than this value, CMA_AGGRESSIVE
+ will not work. */
+#ifdef CONFIG_CMA_AGGRESSIVE_FREE_MIN
+unsigned long cma_aggressive_free_min __read_mostly =
+ CONFIG_CMA_AGGRESSIVE_FREE_MIN;
+#else
+unsigned long cma_aggressive_free_min __read_mostly = 500;
+#endif
+
+/* Switch of CMA_AGGRESSIVE shrink. */
+int cma_aggressive_shrink_switch __read_mostly;
+#endif
+
static int __init cma_init_reserved_areas(void)
{
int i;
@@ -138,6 +160,22 @@ static int __init cma_init_reserved_areas(void)
return ret;
}

+#ifdef CONFIG_CMA_AGGRESSIVE
+ cma_aggressive_switch = 0;
+#ifdef CONFIG_CMA_AGGRESSIVE_PHY_MAX
+ if (memblock_phys_mem_size() <= CONFIG_CMA_AGGRESSIVE_PHY_MAX)
+#else
+ if (memblock_phys_mem_size() <= 0x40000000)
+#endif
+ cma_aggressive_switch = 1;
+
+ cma_aggressive_shrink_switch = 0;
+#ifdef CONFIG_CMA_AGGRESSIVE_SHRINK
+ if (cma_aggressive_switch)
+ cma_aggressive_shrink_switch = 1;
+#endif
+#endif
+
return 0;
}
core_initcall(cma_init_reserved_areas);
@@ -312,6 +350,11 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
unsigned long bitmap_maxno, bitmap_no, bitmap_count;
struct page *page = NULL;
int ret;
+#ifdef CONFIG_CMA_AGGRESSIVE
+ int free = global_page_state(NR_FREE_PAGES)
+ - global_page_state(NR_FREE_CMA_PAGES)
+ - totalreserve_pages;
+#endif

if (!cma || !cma->count)
return NULL;
@@ -326,6 +369,13 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
bitmap_maxno = cma_bitmap_maxno(cma);
bitmap_count = cma_bitmap_pages_to_bits(cma, count);

+#ifdef CONFIG_CMA_AGGRESSIVE
+ atomic_inc(&cma_alloc_counter);
+ if (cma_aggressive_switch && cma_aggressive_shrink_switch
+ && free < count)
+ shrink_all_memory(count - free, false);
+#endif
+
for (;;) {
mutex_lock(&cma->lock);
bitmap_no = bitmap_find_next_zero_area(cma->bitmap,
@@ -361,6 +411,10 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
start = bitmap_no + mask + 1;
}

+#ifdef CONFIG_CMA_AGGRESSIVE
+ atomic_dec(&cma_alloc_counter);
+#endif
+
pr_debug("%s(): returned %p\n", __func__, page);
return page;
}
--
1.9.1
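
As a usage note, here is a hypothetical user-space snippet that flips the
proposed switch at run time. The /proc/sys/vm path follows from the procname
fields in the sysctl table above (mode 0600, so it needs root) and assumes a
kernel built with CONFIG_CMA_AGGRESSIVE and this series applied:

#include <stdio.h>

int main(void)
{
	/* Path derived from the "cma-aggressive-switch" procname above. */
	FILE *f = fopen("/proc/sys/vm/cma-aggressive-switch", "w");

	if (!f) {
		perror("cma-aggressive-switch");
		return 1;
	}
	fputs("1\n", f);	/* 1 = enable, 0 = disable */
	fclose(f);
	return 0;
}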

Hui Zhu
2014-10-17 09:30:05 UTC
Permalink
Update this patch according to the comments from Rafael.

Add cma_alloc_counter, cma_aggressive_switch, cma_aggressive_free_min and
cma_aggressive_shrink_switch.

cma_aggressive_switch is the switch for all CMA_AGGRESSIVE functionality. It
can be controlled by sysctl vm.cma-aggressive-switch.

cma_aggressive_free_min can be controlled by sysctl
"vm.cma-aggressive-free-min". If the number of CMA free pages is smaller than
this sysctl value, CMA_AGGRESSIVE will not work in the page allocation code.

cma_aggressive_shrink_switch can be controlled by sysctl
"vm.cma-aggressive-shrink-switch". If sysctl "vm.cma-aggressive-shrink-switch"
is true and the free normal memory size is smaller than the size to be
allocated, a memory shrink with function shrink_all_memory_for_cma is done
before a driver allocates pages from CMA.

When the Linux kernel tries to reserve a custom contiguous area, the value of
cma_alloc_counter is increased, and CMA_AGGRESSIVE will not work in the page
allocation code. After the reserve function returns, the value of
cma_alloc_counter is decreased.

Signed-off-by: Hui Zhu <***@xiaomi.com>
---
include/linux/cma.h | 7 +++++++
kernel/sysctl.c | 27 +++++++++++++++++++++++++++
mm/cma.c | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 88 insertions(+)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 0430ed0..df96abf 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -15,6 +15,13 @@

struct cma;

+#ifdef CONFIG_CMA_AGGRESSIVE
+extern atomic_t cma_alloc_counter;
+extern int cma_aggressive_switch;
+extern unsigned long cma_aggressive_free_min;
+extern int cma_aggressive_shrink_switch;
+#endif
+
extern phys_addr_t cma_get_base(struct cma *cma);
extern unsigned long cma_get_size(struct cma *cma);

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4aada6d..646929e2 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -92,6 +92,10 @@
#include <linux/nmi.h>
#endif

+#ifdef CONFIG_CMA_AGGRESSIVE
+#include <linux/cma.h>
+#endif
+

#if defined(CONFIG_SYSCTL)

@@ -1485,6 +1489,29 @@ static struct ctl_table vm_table[] = {
.mode = 0644,
.proc_handler = proc_doulongvec_minmax,
},
+#ifdef CONFIG_CMA_AGGRESSIVE
+ {
+ .procname = "cma-aggressive-switch",
+ .data = &cma_aggressive_switch,
+ .maxlen = sizeof(int),
+ .mode = 0600,
+ .proc_handler = proc_dointvec,
+ },
+ {
+ .procname = "cma-aggressive-free-min",
+ .data = &cma_aggressive_free_min,
+ .maxlen = sizeof(unsigned long),
+ .mode = 0600,
+ .proc_handler = proc_doulongvec_minmax,
+ },
+ {
+ .procname = "cma-aggressive-shrink-switch",
+ .data = &cma_aggressive_shrink_switch,
+ .maxlen = sizeof(int),
+ .mode = 0600,
+ .proc_handler = proc_dointvec,
+ },
+#endif
{ }
};

diff --git a/mm/cma.c b/mm/cma.c
index 963bc4a..1cf341c 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -33,6 +33,7 @@
#include <linux/log2.h>
#include <linux/cma.h>
#include <linux/highmem.h>
+#include <linux/swap.h>

struct cma {
unsigned long base_pfn;
@@ -127,6 +128,27 @@ err:
return -EINVAL;
}

+#ifdef CONFIG_CMA_AGGRESSIVE
+/* The counter for the dma_alloc_from_contiguous and
+ dma_release_from_contiguous. */
+atomic_t cma_alloc_counter = ATOMIC_INIT(0);
+
+/* Switch of CMA_AGGRESSIVE. */
+int cma_aggressive_switch __read_mostly;
+
+/* If the number of CMA free pages is smaller than this value, CMA_AGGRESSIVE
+ will not work. */
+#ifdef CONFIG_CMA_AGGRESSIVE_FREE_MIN
+unsigned long cma_aggressive_free_min __read_mostly =
+ CONFIG_CMA_AGGRESSIVE_FREE_MIN;
+#else
+unsigned long cma_aggressive_free_min __read_mostly = 500;
+#endif
+
+/* Switch of CMA_AGGRESSIVE shrink. */
+int cma_aggressive_shrink_switch __read_mostly;
+#endif
+
static int __init cma_init_reserved_areas(void)
{
int i;
@@ -138,6 +160,22 @@ static int __init cma_init_reserved_areas(void)
return ret;
}

+#ifdef CONFIG_CMA_AGGRESSIVE
+ cma_aggressive_switch = 0;
+#ifdef CONFIG_CMA_AGGRESSIVE_PHY_MAX
+ if (memblock_phys_mem_size() <= CONFIG_CMA_AGGRESSIVE_PHY_MAX)
+#else
+ if (memblock_phys_mem_size() <= 0x40000000)
+#endif
+ cma_aggressive_switch = 1;
+
+ cma_aggressive_shrink_switch = 0;
+#ifdef CONFIG_CMA_AGGRESSIVE_SHRINK
+ if (cma_aggressive_switch)
+ cma_aggressive_shrink_switch = 1;
+#endif
+#endif
+
return 0;
}
core_initcall(cma_init_reserved_areas);
@@ -312,6 +350,11 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
unsigned long bitmap_maxno, bitmap_no, bitmap_count;
struct page *page = NULL;
int ret;
+#ifdef CONFIG_CMA_AGGRESSIVE
+ int free = global_page_state(NR_FREE_PAGES)
+ - global_page_state(NR_FREE_CMA_PAGES)
+ - totalreserve_pages;
+#endif

if (!cma || !cma->count)
return NULL;
@@ -326,6 +369,13 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
bitmap_maxno = cma_bitmap_maxno(cma);
bitmap_count = cma_bitmap_pages_to_bits(cma, count);

+#ifdef CONFIG_CMA_AGGRESSIVE
+ atomic_inc(&cma_alloc_counter);
+ if (cma_aggressive_switch && cma_aggressive_shrink_switch
+ && free < count)
+ shrink_all_memory_for_cma(count - free);
+#endif
+
for (;;) {
mutex_lock(&cma->lock);
bitmap_no = bitmap_find_next_zero_area(cma->bitmap,
@@ -361,6 +411,10 @@ struct page *cma_alloc(struct cma *cma, int count, unsigned int align)
start = bitmap_no + mask + 1;
}

+#ifdef CONFIG_CMA_AGGRESSIVE
+ atomic_dec(&cma_alloc_counter);
+#endif
+
pr_debug("%s(): returned %p\n", __func__, page);
return page;
}
--
1.9.1

Hui Zhu
2014-10-16 03:35:51 UTC
Permalink
If the page allocation function __rmqueue tries to get pages from
MIGRATE_MOVABLE and the conditions (cma_aggressive_switch, cma_alloc_counter,
cma_aggressive_free_min) allow, MIGRATE_CMA will be allocated as
MIGRATE_MOVABLE first.

Signed-off-by: Hui Zhu <***@xiaomi.com>
---
mm/page_alloc.c | 42 +++++++++++++++++++++++++++++++-----------
1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 736d8e1..87bc326 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -65,6 +65,10 @@
#include <asm/div64.h>
#include "internal.h"

+#ifdef CONFIG_CMA_AGGRESSIVE
+#include <linux/cma.h>
+#endif
+
/* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
static DEFINE_MUTEX(pcp_batch_high_lock);
#define MIN_PERCPU_PAGELIST_FRACTION (8)
@@ -1189,20 +1193,36 @@ static struct page *__rmqueue(struct zone *zone, unsigned int order,
{
struct page *page;

-retry_reserve:
+#ifdef CONFIG_CMA_AGGRESSIVE
+ if (cma_aggressive_switch
+ && migratetype == MIGRATE_MOVABLE
+ && atomic_read(&cma_alloc_counter) == 0
+ && global_page_state(NR_FREE_CMA_PAGES) > cma_aggressive_free_min
+ + (1 << order))
+ migratetype = MIGRATE_CMA;
+#endif
+retry:
page = __rmqueue_smallest(zone, order, migratetype);

- if (unlikely(!page) && migratetype != MIGRATE_RESERVE) {
- page = __rmqueue_fallback(zone, order, migratetype);
+ if (unlikely(!page)) {
+#ifdef CONFIG_CMA_AGGRESSIVE
+ if (migratetype == MIGRATE_CMA) {
+ migratetype = MIGRATE_MOVABLE;
+ goto retry;
+ }
+#endif
+ if (migratetype != MIGRATE_RESERVE) {
+ page = __rmqueue_fallback(zone, order, migratetype);

- /*
- * Use MIGRATE_RESERVE rather than fail an allocation. goto
- * is used because __rmqueue_smallest is an inline function
- * and we want just one call site
- */
- if (!page) {
- migratetype = MIGRATE_RESERVE;
- goto retry_reserve;
+ /*
+ * Use MIGRATE_RESERVE rather than fail an allocation.
+ * goto is used because __rmqueue_smallest is an inline
+ * function and we want just one call site
+ */
+ if (!page) {
+ migratetype = MIGRATE_RESERVE;
+ goto retry;
+ }
}
}
--
1.9.1

Joonsoo Kim
2014-10-24 05:28:50 UTC
Permalink
Post by Hui Zhu
If the page allocation function __rmqueue tries to get pages from
MIGRATE_MOVABLE and the conditions (cma_aggressive_switch, cma_alloc_counter,
cma_aggressive_free_min) allow, MIGRATE_CMA will be allocated as
MIGRATE_MOVABLE first.
---
mm/page_alloc.c | 42 +++++++++++++++++++++++++++++++-----------
1 file changed, 31 insertions(+), 11 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 736d8e1..87bc326 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -65,6 +65,10 @@
#include <asm/div64.h>
#include "internal.h"
+#ifdef CONFIG_CMA_AGGRESSIVE
+#include <linux/cma.h>
+#endif
+
/* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
static DEFINE_MUTEX(pcp_batch_high_lock);
#define MIN_PERCPU_PAGELIST_FRACTION (8)
@@ -1189,20 +1193,36 @@ static struct page *__rmqueue(struct zone *zone, unsigned int order,
{
struct page *page;
+#ifdef CONFIG_CMA_AGGRESSIVE
+ if (cma_aggressive_switch
+ && migratetype == MIGRATE_MOVABLE
+ && atomic_read(&cma_alloc_counter) == 0
+ && global_page_state(NR_FREE_CMA_PAGES) > cma_aggressive_free_min
+ + (1 << order))
+ migratetype = MIGRATE_CMA;
+#endif
I don't get why cma_alloc_counter should be tested.
When a CMA allocation is in progress, the pageblock is isolated so that pages
on that pageblock cannot be allocated. Why should we prevent aggressive
allocation in this case?

Thanks.

Weijie Yang
2014-10-16 05:13:57 UTC
Permalink
Post by Hui Zhu
In the fallbacks table of page_alloc.c, MIGRATE_CMA is the fallback of
MIGRATE_MOVABLE.
MIGRATE_MOVABLE will use MIGRATE_CMA when it doesn't have a free page of
the order the kernel wants.
On a system that runs a lot of user-space programs, for instance an
Android board, most memory is in MIGRATE_MOVABLE and already allocated.
Before __rmqueue_fallback can get memory from MIGRATE_CMA, the oom_killer
will kill a task to release memory when the kernel wants MIGRATE_UNMOVABLE
memory, because the fallbacks of MIGRATE_UNMOVABLE are MIGRATE_RECLAIMABLE
and MIGRATE_MOVABLE.
This state is odd: MIGRATE_CMA has a lot of free memory, but the kernel
kills tasks to release memory.
I'm not very clear on this description; what issue are you trying to solve?
Making MIGRATE_CMA the fallback of the desired MIGRATE_UNMOVABLE?
Post by Hui Zhu
This patch series adds a new option, CMA_AGGRESSIVE, to make CMA memory
more aggressive about allocation.
If CMA_AGGRESSIVE is enabled, when the kernel calls __rmqueue to get pages
from MIGRATE_MOVABLE and the conditions allow, MIGRATE_CMA will be
allocated as MIGRATE_MOVABLE first. If MIGRATE_CMA doesn't have enough
pages for the allocation, it goes back to allocating memory from
MIGRATE_MOVABLE.
I don't think so. That will cause MIGRATE_CMA to be depleted prematurely, and
when a user (such as a camera) wants CMA memory, it will not get the wanted
memory.
Post by Hui Zhu
Then the memory of MIGRATE_MOVABLE can be kept for MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE, which do not have the MIGRATE_CMA fallback.
I don't think this is the root cause of the OOM.
But I am interested in the CMA shrinker idea; I will follow this mail.

Thanks for your work; adding some test data would be better.
Laura Abbott
2014-10-16 08:55:46 UTC
Permalink
Post by Hui Zhu
In the fallbacks table of page_alloc.c, MIGRATE_CMA is the fallback of
MIGRATE_MOVABLE.
MIGRATE_MOVABLE will use MIGRATE_CMA when it doesn't have a free page of
the order the kernel wants.
On a system that runs a lot of user-space programs, for instance an
Android board, most memory is in MIGRATE_MOVABLE and already allocated.
Before __rmqueue_fallback can get memory from MIGRATE_CMA, the oom_killer
will kill a task to release memory when the kernel wants MIGRATE_UNMOVABLE
memory, because the fallbacks of MIGRATE_UNMOVABLE are MIGRATE_RECLAIMABLE
and MIGRATE_MOVABLE.
This state is odd: MIGRATE_CMA has a lot of free memory, but the kernel
kills tasks to release memory.
This patch series adds a new option, CMA_AGGRESSIVE, to make CMA memory
more aggressive about allocation.
If CMA_AGGRESSIVE is enabled, when the kernel calls __rmqueue to get pages
from MIGRATE_MOVABLE and the conditions allow, MIGRATE_CMA will be
allocated as MIGRATE_MOVABLE first. If MIGRATE_CMA doesn't have enough
pages for the allocation, it goes back to allocating memory from
MIGRATE_MOVABLE.
Then the memory of MIGRATE_MOVABLE can be kept for MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE, which do not have the MIGRATE_CMA fallback.
It's good to see another proposal to fix CMA utilization. Do you have
any data about the success rate of CMA contiguous allocation after
this patch series? I played around with a similar approach of using
CMA for MIGRATE_MOVABLE allocations and found that although utilization
did increase, contiguous allocations failed at a higher rate and were
much slower. I see what this series is trying to do with avoiding
allocation from CMA pages when a contiguous allocation is in progress.
My concern is that there would still be problems with contiguous
allocation after all the MIGRATE_MOVABLE fallback has happened.

Thanks,
Laura
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

朱辉
2014-10-17 07:44:26 UTC
Permalink
Post by Laura Abbott
Post by Hui Zhu
In the fallbacks table of page_alloc.c, MIGRATE_CMA is the fallback of
MIGRATE_MOVABLE.
MIGRATE_MOVABLE will use MIGRATE_CMA when it doesn't have a free page of
the order the kernel wants.
On a system that runs a lot of user-space programs, for instance an
Android board, most memory is in MIGRATE_MOVABLE and already allocated.
Before __rmqueue_fallback can get memory from MIGRATE_CMA, the oom_killer
will kill a task to release memory when the kernel wants MIGRATE_UNMOVABLE
memory, because the fallbacks of MIGRATE_UNMOVABLE are MIGRATE_RECLAIMABLE
and MIGRATE_MOVABLE.
This state is odd: MIGRATE_CMA has a lot of free memory, but the kernel
kills tasks to release memory.
This patch series adds a new option, CMA_AGGRESSIVE, to make CMA memory
more aggressive about allocation.
If CMA_AGGRESSIVE is enabled, when the kernel calls __rmqueue to get pages
from MIGRATE_MOVABLE and the conditions allow, MIGRATE_CMA will be
allocated as MIGRATE_MOVABLE first. If MIGRATE_CMA doesn't have enough
pages for the allocation, it goes back to allocating memory from
MIGRATE_MOVABLE.
Then the memory of MIGRATE_MOVABLE can be kept for MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE, which do not have the MIGRATE_CMA fallback.
It's good to see another proposal to fix CMA utilization.
Thanks Laura.

Post by Laura Abbott
Do you have
any data about the success rate of CMA contiguous allocation after
this patch series? I played around with a similar approach of using
CMA for MIGRATE_MOVABLE allocations and found that although utilization
did increase, contiguous allocations failed at a higher rate and were
much slower. I see what this series is trying to do with avoiding
allocation from CMA pages when a contiguous allocation is progress.
My concern is that there would still be problems with contiguous
allocation after all the MIGRATE_MOVABLE fallback has happened.
I did some tests with the cma_alloc_counter and cma-aggressive-shrink on an
Android board that has 1G of memory. I ran some apps to make free CMA
close to the value of cma_aggressive_free_min (500 pages). A driver then
began to request CMA more than 10 times; each time, it requested more
than 3000 pages.

I don't have established numbers for that because it is really hard to
get a failure. I think the success rate is over 95% at least.

And I think maybe the isolation failures are related to the page alloc and
free code. Maybe letting zone->lock protect more code can handle this issue.

Thanks,
Hui
Post by Laura Abbott
Thanks,
Laura
Peter Hurley
2014-10-22 12:01:54 UTC
Permalink
Post by Laura Abbott
Post by Hui Zhu
In the fallbacks table of page_alloc.c, MIGRATE_CMA is the fallback of
MIGRATE_MOVABLE.
MIGRATE_MOVABLE will use MIGRATE_CMA when it doesn't have a free page of
the order the kernel wants.
On a system that runs a lot of user-space programs, for instance an
Android board, most memory is in MIGRATE_MOVABLE and already allocated.
Before __rmqueue_fallback can get memory from MIGRATE_CMA, the oom_killer
will kill a task to release memory when the kernel wants MIGRATE_UNMOVABLE
memory, because the fallbacks of MIGRATE_UNMOVABLE are MIGRATE_RECLAIMABLE
and MIGRATE_MOVABLE.
This state is odd: MIGRATE_CMA has a lot of free memory, but the kernel
kills tasks to release memory.
This patch series adds a new option, CMA_AGGRESSIVE, to make CMA memory
more aggressive about allocation.
If CMA_AGGRESSIVE is enabled, when the kernel calls __rmqueue to get pages
from MIGRATE_MOVABLE and the conditions allow, MIGRATE_CMA will be
allocated as MIGRATE_MOVABLE first. If MIGRATE_CMA doesn't have enough
pages for the allocation, it goes back to allocating memory from
MIGRATE_MOVABLE.
Then the memory of MIGRATE_MOVABLE can be kept for MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE, which do not have the MIGRATE_CMA fallback.
It's good to see another proposal to fix CMA utilization. Do you have
any data about the success rate of CMA contiguous allocation after
this patch series? I played around with a similar approach of using
CMA for MIGRATE_MOVABLE allocations and found that although utilization
did increase, contiguous allocations failed at a higher rate and were
much slower. I see what this series is trying to do with avoiding
allocation from CMA pages when a contiguous allocation is in progress.
My concern is that there would still be problems with contiguous
allocation after all the MIGRATE_MOVABLE fallback has happened.
What impact does this series have on x86 platforms now that CMA is the
backup allocator for all iommu dma allocations?

Regards,
Peter Hurley

朱辉
2014-10-23 00:40:57 UTC
Permalink
Post by Peter Hurley
Post by Laura Abbott
Post by Hui Zhu
In the fallbacks table of page_alloc.c, MIGRATE_CMA is the fallback of
MIGRATE_MOVABLE.
MIGRATE_MOVABLE will use MIGRATE_CMA when it doesn't have a free page of
the order the kernel wants.
On a system that runs a lot of user-space programs, for instance an
Android board, most memory is in MIGRATE_MOVABLE and already allocated.
Before __rmqueue_fallback can get memory from MIGRATE_CMA, the oom_killer
will kill a task to release memory when the kernel wants MIGRATE_UNMOVABLE
memory, because the fallbacks of MIGRATE_UNMOVABLE are MIGRATE_RECLAIMABLE
and MIGRATE_MOVABLE.
This state is odd: MIGRATE_CMA has a lot of free memory, but the kernel
kills tasks to release memory.
This patch series adds a new option, CMA_AGGRESSIVE, to make CMA memory
more aggressive about allocation.
If CMA_AGGRESSIVE is enabled, when the kernel calls __rmqueue to get pages
from MIGRATE_MOVABLE and the conditions allow, MIGRATE_CMA will be
allocated as MIGRATE_MOVABLE first. If MIGRATE_CMA doesn't have enough
pages for the allocation, it goes back to allocating memory from
MIGRATE_MOVABLE.
Then the memory of MIGRATE_MOVABLE can be kept for MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE, which do not have the MIGRATE_CMA fallback.
It's good to see another proposal to fix CMA utilization. Do you have
any data about the success rate of CMA contiguous allocation after
this patch series? I played around with a similar approach of using
CMA for MIGRATE_MOVABLE allocations and found that although utilization
did increase, contiguous allocations failed at a higher rate and were
much slower. I see what this series is trying to do with avoiding
allocation from CMA pages when a contiguous allocation is in progress.
My concern is that there would still be problems with contiguous
allocation after all the MIGRATE_MOVABLE fallback has happened.
What impact does this series have on x86 platforms now that CMA is the
backup allocator for all iommu dma allocations?
They will not affect driver CMA memory allocation.

Thanks,
Hui
Post by Peter Hurley
Regards,
Peter Hurley
Joonsoo Kim
2014-10-24 05:25:53 UTC
Permalink
Post by Hui Zhu
In the fallbacks table of page_alloc.c, MIGRATE_CMA is the fallback of
MIGRATE_MOVABLE.
MIGRATE_MOVABLE will use MIGRATE_CMA when it doesn't have a free page of
the order the kernel wants.
On a system that runs a lot of user-space programs, for instance an
Android board, most memory is in MIGRATE_MOVABLE and already allocated.
Before __rmqueue_fallback can get memory from MIGRATE_CMA, the oom_killer
will kill a task to release memory when the kernel wants MIGRATE_UNMOVABLE
memory, because the fallbacks of MIGRATE_UNMOVABLE are MIGRATE_RECLAIMABLE
and MIGRATE_MOVABLE.
This state is odd: MIGRATE_CMA has a lot of free memory, but the kernel
kills tasks to release memory.
This patch series adds a new option, CMA_AGGRESSIVE, to make CMA memory
more aggressive about allocation.
If CMA_AGGRESSIVE is enabled, when the kernel calls __rmqueue to get pages
from MIGRATE_MOVABLE and the conditions allow, MIGRATE_CMA will be
allocated as MIGRATE_MOVABLE first. If MIGRATE_CMA doesn't have enough
pages for the allocation, it goes back to allocating memory from
MIGRATE_MOVABLE.
Then the memory of MIGRATE_MOVABLE can be kept for MIGRATE_UNMOVABLE and
MIGRATE_RECLAIMABLE, which do not have the MIGRATE_CMA fallback.
Hello,

I did some work similar to this.
Please refer to the following links.

https://lkml.org/lkml/2014/5/28/64
https://lkml.org/lkml/2014/5/28/57

And, aggressive allocation should be postponed until the freepage counting
bug is fixed, because aggressive allocation enlarges the possibility of the
problem occurring. I tried to fix that bug, too. See the following link.

https://lkml.org/lkml/2014/10/23/90

Thanks.