Crash log snippet of binder crash,
Before you start make sure the ramdump and vmlinux matches by using strings command,
strings vmlinux | grep "Linux version"
[49081.080352] Unable to handle kernel NULL pointer dereference at
virtual address 00000030
[49081.088500] pgd = d748c000
[49081.091186] [00000030] *pgd=00000000
[49081.094787] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[49081.100067] kona_fb: die notifier invoked
[49081.104064] kona_display_crash_image:359 image_idx=4
[49081.109863] post_async:540 AXIPV has received the same buffer
which is used
[49081.116790] post_async:541 It is likely that we see display
tearing
[49081.123046] post_async:542 Preventing a potential lock-up by
signalling
[49081.129669] Modules linked in: [last unloaded: bcmdhd_4330]
[49081.135253] CPU: 1 Tainted:
G W (3.4.5+
#1)
[49081.140655] PC is at binder_get_node+0x8/0x34
[49081.144989] LR is at binder_thread_write+0x147c/0x239c
[49081.150115] pc : [] lr :
[] psr: 20000053
[49081.150146] sp : f43a3e10 ip : f8f004fc fp :
00000000
[49081.161560] r10: f8f004c4 r9 : 73622a85 r8 :
00000064
[49081.166778] r7 : f43a2000 r6 : f8f00460 r5 :
411e36f8 r4 : c0a69260
[49081.173278] r3 : 00000010 r2 : 411e3858 r1 : 411e36f8 r0 : f32c7a40
[49081.179779] Flags: nzCv IRQs on FIQs off Mode
SVC_32 ISA ARM Segment user
[49081.186981] Control: 10c53c7d Table: 9948c04a DAC:
00000015
Disassemble of binder_get_node,
crash> dis binder_get_node
0xc03dc4a8 :
ldr r3, [r0, #12]
0xc03dc4ac :
b 0xc03dc4cc
0xc03dc4b0 : ldr
r2, [r3, #32]
0xc03dc4b4
:
sub r0, r3, #16
0xc03dc4b8
:
cmp r1, r2
0xc03dc4bc
:
ldrcc r3, [r3, #8]
0xc03dc4c0
:
bcc 0xc03dc4cc
0xc03dc4c4
:
bxls lr
0xc03dc4c8
:
ldr r3, [r3, #4]
0xc03dc4cc
:
cmp r3, #0
0xc03dc4d0 :
bne 0xc03dc4b0
0xc03dc4d4
:
mov r0, r3
0xc03dc4d8
:
bx lr
crash>
ldr r3,[r0, #12]
From crash log above, r0 = f32c7a40
To confirm read memory 0xf32c7a40 + 0xc using crash,
crash> rd f32c7a4c
f32c7a4c:
00008000
....
crash>
But r3 is 0x10. Which is not matching with 0x8000. So this means
r0 has changed (since ldr r3,[r0, #12]). Proves that crash not happened
in first instance. It had looped for some time
C code binder_get_node,
static struct binder_node *binder_get_node(struct binder_proc
*proc,
void
__user *ptr)
{
struct rb_node *n =
proc->nodes.rb_node;
struct binder_node
*node;
while (n) {
node = rb_entry(n, struct binder_node, rb_node);
if (ptr < node->ptr)
n = n->rb_left;
else if (ptr > node->ptr)
n = n->rb_right;
else
return node;
}
return NULL;
}
Back trace from crash log,
[49081.971740] [] (binder_get_node+0x8/0x34) from
[] (binder_thread_write+0x147c/0x239c)
[49081.981262] []
(binder_thread_write+0x147c/0x239c) from []
(binder_ioctl+0x37c/0x974)
[49081.990814] [] (binder_ioctl+0x37c/0x974) from
[] (do_vfs_ioctl+0x4d8/0x54c)
[49081.999572] [] (do_vfs_ioctl+0x4d8/0x54c) from
[] (sys_ioctl+0x34/0x54)
[49082.007904] [] (sys_ioctl+0x34/0x54) from
[] (ret_fast_syscall+0x0/0x48)
Since the back trace was showing a sys_ioctl we iterated on
stack to pick up the struct file * associated with.
Here is the hit,
crash> struct file f37cbef0
struct file {
f_u = {
fu_list = {
next = 0xf37cb550,
prev = 0xf2a0def0
},
fu_rcuhead = {
next = 0xf37cb550,
func = 0xf2a0def0
}
},
f_path = {
mnt = 0xf5015e58,
dentry = 0xf3cc5d08
},
f_op = 0xc064dc0c,
----sniped---
f_version = 0,
private_data = 0xf2e9ca00,
f_ep_links = {
next = 0xf37cbf74,
prev = 0xf37cbf74
},
f_tfile_llink = {
next = 0xf37cbf7c,
prev = 0xf37cbf7c
},
f_mapping = 0xf4090ad0
}
f_op symbol says it is binder_fops,
crash> sym 0xc064dc0c
c064dc0c (r) binder_fops
crash>
From code(drivers/staging/android/binder.c +3012) we figured out
that private_data is binder_proc
crash> struct binder_proc 0xf2e9ca00
struct binder_proc {
proc_node = {
next = 0xf3514200,
pprev = 0xc0a69c84
},
threads = {
rb_node = 0xf2a10c84
},
nodes = {
rb_node = 0xf2d1a7d0
},
refs_by_desc = {
rb_node = 0xf3126284
},
refs_by_node = {
From the analysis of the code, we understand that crash happened
while traversing the rb tree.
So tried to parse the rb tree using crash,
crash> tree -t rbtree -N 0xf2d1a7d0 -p
f2d1a7d0
position: root
f2a00ad0
position: root/l
f2a90e50
position: root/r
f32c7a50
position: root/r/r
10
position: root/r/r/l
tree: invalid kernel virtual address: 18 type: "rb_node
rb_left"
crash>
This shows that rb tress has a corrupted address to next left
node.
0x10 which is the same value in r3 during crash.
And crash happened because of address 0x30.
[49081.080352] Unable to handle kernel NULL pointer dereference at
virtual address 00000030
This explains why crashed.
0xc03dc4b0 : ldr
r2, [r3, #32]
R3 + #32 = 0x10 + 0x20 = 0x30
Dumping the corrupted struct rb_node from memory,
crash> struct rb_node f32c7a50
struct rb_node {
rb_parent_color = 44100,
rb_right = 0x2,
rb_left = 0x10
}
crash>
This shows the rb tree is corrupted. Because rb_left and
rb_right should have a valid kernel address or 0.
Dumping the memory from f32c7a00, 100 elements to inspect the
memory,
crash> rd f32c7a00 100
f32c7a00: f32c79fc 00000000 00000000 00000000
.y,.............
f32c7a10: 00000000 00000000 00000000 00000000
................
f32c7a20: 00000000 00000000 00000000 00000000
................
f32c7a30: 00000000 00000000 00000000 00000000
................
f32c7a40: 01bb5e6b c03eed84 f4727600 00008000
k^....>..vr.....
f32c7a50: 0000ac44 00000002 00000010 f5c10000
D...............
f32c7a60: 00010000 b7c10000 00000000 00000000
................
f32c7a70: 411e3858 411e3840 00001fe5 00000002 X8.A@8.A........
f32c7a80: f32c7a7c 00000001 00000000 00000000 |z,.............
f32c7a90: 00000000 00000000 00000000 00000000
................
f32c7aa0: 00000000 00000000 00000000 00000000
................
f32c7ab0: 00000000 00000000 00000000 00000000
................
f32c7ac0: 00000000 00000000 f32c7ac8 f32c7ac8
.........z,..z,.
f32c7ad0: 00000000 00000000 00000010 00000001
................
f32c7ae0: 00000001 00000000 00000001 00000001
................
f32c7af0: 00000000 00000000 00000000 f3faf330
............0...
f32c7b00: f3a6c000 00000000 00000000 00000000
................
f32c7b10: 00000000 00000000 00000000 00000000
................
f32c7b20: 00000000 00000000 00000000 00000000
................
f32c7b30: 00000000 00000000 00000000 00000000
................
f32c7b40: 006aefa7 f32c7b44 f32c7b44 00000003
..j.D{,.D{,.....
f32c7b50: f2bf1450 f3bef7d0 00000000 00000000
P...............
f32c7b60: f3854bdc 00000001 00000000 00000000
.K..............
f32c7b70: 54d4c670 54d3f828 00001fe5 f32c7b7c
p..T(..T....|{,.
f32c7b80: f32c7b7c 00000000 00000000 00000000
|{,.............
Dumping with symbols,
crash> rd -s f32c7a00 100
f32c7a00: __crc_v4l2_norm_to_name+482177 00000000 00000000
00000000
f32c7a10: 00000000 00000000 00000000 00000000
f32c7a20: 00000000 00000000 00000000 00000000
f32c7a30: 00000000 00000000 00000000 00000000
f32c7a40: 01bb5e6b AUDIO_DRIVER_InterruptPeriodCB __crc_bmp18x_disable+1706299
00008000
f32c7a50: 0000ac44 00000002 00000010
__crc_generic_segment_checks+42732
f32c7a60: 00010000 b7c10000 00000000 00000000
f32c7a70: 411e3858 411e3840 00001fe5 00000002
f32c7a80: __crc_v4l2_norm_to_name+482305 00000001 00000000
00000000
f32c7a90: 00000000 00000000 00000000 00000000
f32c7aa0: 00000000 00000000 00000000 00000000
f32c7ab0: 00000000 00000000 00000000 00000000
f32c7ac0: 00000000 00000000 __crc_v4l2_norm_to_name+482381
__crc_v4l2_norm_to_name+482381
f32c7ad0: 00000000 00000000 00000010 00000001
f32c7ae0: 00000001 00000000 00000001 00000001
f32c7af0: 00000000 00000000 00000000
__crc_qdisc_class_hash_grow+482810
f32c7b00: __crc_rt_mutex_trylock+115850 00000000 00000000
00000000
f32c7b10: 00000000 00000000 00000000 00000000
f32c7b20: 00000000 00000000 00000000 00000000
f32c7b30: 00000000 00000000 00000000 00000000
f32c7b40: 006aefa7 __crc_v4l2_norm_to_name+482505
__crc_v4l2_norm_to_name+482505 00000003
f32c7b50: __crc_usb_get_descriptor+260561
__crc_bdi_set_max_ratio+149137 00000000 00000000
f32c7b60: __crc_snd_interval_ratnum+776842 00000001 00000000
00000000
f32c7b70: 54d4c670 54d3f828 00001fe5
__crc_v4l2_norm_to_name+482561
f32c7b80: __crc_v4l2_norm_to_name+482561 00000000 00000000
00000000
Somebody has over written from f32c7a40
This causes rb_tree corruption.
Surprisingly there is an audio function address in the corrupted
memory as highlighted above.
From the kernel logs we can see audio and rpc errors just before
the crash.
Eg:
[48902.141143] WARNING: at
/build/REL_sdb-common-android-jb-4.2.2_428843/android_hawaii_garnet_edn010/kernel/hawaii/3.4.5/sound/caph_hawaii/caph_pcm.c:248
PcmHwFree+0x8c/0xdc()
[48902.157012] ERROR timeout waiting for STOP REQ.t=30000 ret=0
[48902.162719] Modules linked in: [last unloaded: bcmdhd_4330]
[48902.168579] [] (unwind_backtrace+0x0/0x11c)
from [] (warn_slowpath_common+0x4c/0x64)
[48902.178192] [] (warn_slowpath_common+0x4c/0x64)
from [] (warn_slowpath_fmt+0x2c/0x3c)
[48902.187957] [] (warn_slowpath_fmt+0x2c/0x3c)
from [] (PcmHwFree+0x8c/0xdc)
[48902.196716] [] (PcmHwFree+0x8c/0xdc) from
[] (snd_pcm_release_substream+0x44/0xa4)
[48902.206512] []
(snd_pcm_release_substream+0x44/0xa4) from []
(snd_pcm_release+0x28/0x6c)
[48902.216735] [] (snd_pcm_release+0x28/0x6c) from
[] (fput+0x10c/0x214)
[48902.224914] [] (fput+0x10c/0x214) from
[] (filp_close+0x70/0x7c)
[48902.232879] [] (filp_close+0x70/0x7c) from
[] (put_files_struct+0x90/0xec)
[48902.241668] [] (put_files_struct+0x90/0xec)
from [] (do_exit+0x280/0x7e0)
[48902.250671] [] (do_exit+0x280/0x7e0) from
[] (do_group_exit+0x88/0xb4)
[48902.259338] [] (do_group_exit+0x88/0xb4) from
[] (get_signal_to_deliver+0x510/0x568)
[48902.269470] [] (get_signal_to_deliver+0x510/0x568)
from [] (do_signal+0xc4/0x518)
[48902.278808] [] (do_signal+0xc4/0x518) from
[] (do_notify_resume+0x18/0x50)
[48902.287567] [] (do_notify_resume+0x18/0x50)
from [] (work_pending+0x24/0x28)
Conclusion is one possible suspect for corruption can be Audio.
8 comments:
nice write-up Arun. can you please let me know which tool did you use where crash> prompt is coming??
hey..I think I found it..crash tool right?? I never use this one.
Awesome post and its very informative for beginners;keep up the good work.
Thanks
Awesome post and its very informative for beginners ..
Are you running it on Android, ARM based?
Can you show me how to compile crash for ARM?
64 bit arm,
make target=arm64
32 bit arm,
make target=arm
HI Arun,
Interesting article.
For using crash urility you need vmcore right? Please let me know how you enabled vmcore in android.
Regards,
Ravi
Post a Comment