This post shows how to recover values stored on the stack using the crash utility.
Let me explain a scenario:
Speculation: a process named Binder_5 is in the runqueue, holding cgroup_mutex and looping forever.
Below is the do-while loop that I suspect:
File: kernel/cgroup.c
Function: cgroup_attach_task
Kernel version: 3.10.17
Figure 1:
The backtrace of the Binder_5 task is as follows:
Figure 2:
So, from the do-while loop, task_cgroup_from_root was called and the process (Binder_5) was then preempted, probably because a higher-priority task became runnable.
To confirm that we are stuck in this loop forever, we need to check the if condition (marked with a red box in Figure 1). Let's take the objdump of vmlinux and see how to find the values of ent.cgrp and cgrp. Both will be stored on the stack.
Figure 3 shows a partial objdump of the function cgroup_attach_task (the corresponding C code is shown in Figure 1). It was produced with the following command:
$ arm-none-linux-gnueabi-objdump -D vmlinux > vmlinux_objdump.txt
Figure 3:
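A full disassembly of vmlinux is a very large text file, so it can help to pull out just the function of interest. Below is a minimal Python sketch of that, assuming the usual objdump layout in which each function starts with a header line of the form "address <symbol>:". The helper name extract_function and the sample addresses are made up for illustration; only the symbol names come from this post.

```python
import re

# objdump starts each function with a header such as "c0091000 <name>:".
HEADER = re.compile(r"^[0-9a-f]+ <([^>]+)>:$")

def extract_function(dump_text, name):
    """Return the disassembly lines of one function from objdump output."""
    out = []
    inside = False
    for line in dump_text.splitlines():
        m = HEADER.match(line)
        if m:
            if inside:
                break  # reached the header of the next function
            inside = (m.group(1) == name)
        if inside:
            out.append(line)
    return "\n".join(out).rstrip()

# Illustrative sample only: the addresses are invented, not from the real dump.
SAMPLE = """\
c0091000 <task_cgroup_from_root>:
c0091000:\te1a0c00d \tmov\tip, sp
c0091004:\te92dd830 \tpush\t{r4, r5, fp, ip, lr, pc}

c0091100 <cgroup_attach_task>:
c0091100:\te1a0c00d \tmov\tip, sp
"""

result = extract_function(SAMPLE, "task_cgroup_from_root")
print(result)
```

On the real file, the same function could be applied to the contents of vmlinux_objdump.txt instead of SAMPLE.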
Now we need to find the values of r5 and r0. If they are the same, my speculation about the infinite loop is correct. Tough job ahead :-). Let's have fun.
Finding the value of r5:
If we are lucky, the value of r5 will have been pushed onto the stack on entry to the function task_cgroup_from_root.
Here is the objdump:
Figure 4:
Yes, it is pushed onto the stack.
Figure 5 below shows how to find the value of r5.
The left part of Figure 5 was created by printing the stack frames of the Binder_5 task with the crash tool.
The command used:
crash> bt -f 2077
where 2077 is the PID of the Binder_5 process.
Figure 5:
push {r4, r5, fp, ip, lr, pc}
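In the above instruction, pc is stored at the highest address on the stack, followed by lr, and so on down to r4 at the lowest address.
This is because the ARM Linux stack is full descending, and a push stores lower-numbered registers at lower addresses.
So the value of r5 is 0xed306c00.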
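The slot order for the push instruction above can be worked out mechanically. Here is a small Python sketch that does so, relying on the ARM rule that a multi-register push places the lowest-numbered register at the lowest address. The helper name push_layout and the stack pointer value are hypothetical and not taken from the dump.

```python
# Register numbers on 32-bit ARM: fp is r11, ip is r12, lr is r14, pc is r15.
REG_NUM = {"r4": 4, "r5": 5, "fp": 11, "ip": 12, "lr": 14, "pc": 15}

def push_layout(regs, sp_after_push):
    """Map each pushed register to the stack address holding its value.

    sp_after_push is the stack pointer once the push has completed,
    i.e. the lowest address written by the instruction.
    """
    ordered = sorted(regs, key=REG_NUM.__getitem__)
    return {reg: sp_after_push + 4 * i for i, reg in enumerate(ordered)}

# Hypothetical stack pointer after the push, for illustration only.
SP_AFTER_PUSH = 0xED1A3D00

layout = push_layout(["r4", "r5", "fp", "ip", "lr", "pc"], SP_AFTER_PUSH)

# Print from the highest address down, the way the stack is usually drawn.
for reg, addr in sorted(layout.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{addr:#010x}  {reg}")
```

With this layout, r5 sits one word above the final stack pointer and fp two words above it, which is how the saved values are picked out of the frame dump.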
Finding the value of r0:
From Figure 3, r0 is also stored at location fp - 84 (marked with a red box). So if we look at that location, we get the value r0 had when task_cgroup_from_root last returned.
For this, we first need to find the value of fp. It is stored in the frame of task_cgroup_from_root, the same way r5 was.
The figure below shows how we found the value of fp.
The left side of Figure 6 shows the backtrace with frames, the same as in Figure 5. This time I used Notepad++.
Figure 6:
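The fp - 84 lookup can also be scripted instead of counting words by hand in an editor. The Python sketch below assumes the raw stack words appear as lines of the form "address: word word word word" in hex, which is an assumption about the dump format. The fp value and all addresses are hypothetical; only 0xed306c00 comes from this post.

```python
import re

# Assumed dump format: "address: word word word word", all in hex.
# Addresses and most values here are invented for illustration.
SAMPLE_DUMP = """\
    ed1a3da0: 00000000 ed306c00 c0a01234 00000001
    ed1a3db0: ed1a3e04 c0098765 00000000 00000000
"""

LINE = re.compile(r"^\s*([0-9a-fA-F]{8}):\s+(.*)$")
WORD = re.compile(r"\b[0-9a-fA-F]{8}\b")

def parse_words(dump):
    """Return {address: 32-bit word} for every word in the dump text."""
    words = {}
    for line in dump.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # skip lines that are not raw stack data
        base = int(m.group(1), 16)
        for i, tok in enumerate(WORD.findall(m.group(2))):
            words[base + 4 * i] = int(tok, 16)
    return words

words = parse_words(SAMPLE_DUMP)

FP = 0xED1A3DF8   # hypothetical frame pointer of cgroup_attach_task
slot = FP - 84    # the location where r0 was saved, per Figure 3
saved_r0 = words[slot]
print(f"{slot:#010x}: {saved_r0:#010x}")
```

In a crash session, the rd command on the computed address should show the same word without any parsing.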
So the values of r0 and r5 are the same, and the speculation of an infinite loop is confirmed.
This was a bug in kernel 3.10 and was fixed in a later commit.
Thanks,
Arun