Friday, May 9, 2014

Tuesday, April 29, 2014

Interpreting values in stack using crash utility

This post shows how to work out the values stored on the stack using the crash utility.

Let me explain a scenario:
Speculation: the process named Binder_5 is on the runqueue, holding cgroup_mutex and looping there infinitely.

Below is the do-while loop that I suspect,
File: kernel/cgroup.c
Function: cgroup_attach_task
Kernel version: 3.10.17

Figure 1:

The backtrace of the Binder_5 task is as follows,

Figure 2:

So task_cgroup_from_root was called from the do-while loop, and this process (Binder_5) was then preempted, probably because some higher-priority task became runnable.

To confirm that we are in this loop forever, we need to check the if condition (marked with a red box in figure 1). Let's take the objdump of vmlinux and see how to figure out the values of ent.cgrp and cgrp. Both will be stored on the stack.

Figure 3 shows a partial objdump of the function cgroup_attach_task (the corresponding C function is shown in figure 1). It was taken using the following command.
$ arm-none-linux-gnueabi-objdump -D vmlinux > vmlinux_objdump.txt

Figure 3:

Now we need to find the values of r5 and r0. If they are the same, then my speculation of an infinite loop is correct. Tough job ahead :-). Let's have fun.

Finding the value of r5:
If we are lucky, the value of r5 will have been pushed onto the stack on entry to the function task_cgroup_from_root.
Here is the objdump:

Figure 4:

Yes, it is pushed onto the stack.
Figure 5 below shows how to find the value of r5.
The left part of figure 5 was created by printing the stack frames of the task Binder_5 using the crash tool.
The command used,
crash> bt -f 2077
where 2077 is the pid of the process Binder_5

Figure 5:

push    {r4, r5, fp, ip, lr, pc}

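In the above instruction, pc is stored at the highest address on the stack, followed by lr, and so on down to r4 at the lowest address.
This is because the ARM Linux stack is full descending, and a multi-register push always stores lower-numbered registers at lower addresses.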

So value of r5 = 0xed306c00.

Finding the value of r0:
From figure 3, r0 is also stored at location fp - 84 (marked with a red box). So if we look at that location, we get the value r0 had when task_cgroup_from_root last returned.

For this, we first need to find the value of fp. It is saved in the frame of task_cgroup_from_root, the same way r5 was.

The figure below shows how we found the value of fp.
The left side of figure 6 shows the backtrace with frames, the same as in figure 5. This time I used Notepad++.


Figure 6:

So the values of r0 and r5 are the same, and the speculation of an infinite loop is proved.
This was a bug in kernel 3.10 and was fixed later. commit

Thanks,
Arun

Friday, March 28, 2014

RB Tree parsing in crash utility

Requirement: see all the virtual addresses allocated to a process.

This is just a demo of the tree command. If you only want the list, you can get it with the command
crash> vm -p

From the ps output,

we pick the process ".android.chrome", whose task_struct address is 0xe22f6200

crash> struct task_struct 0xe22f6200
----strip-------
 mm = 0xedab1500,
----strip-------

This gives us the value of mm.

 crash> struct mm_struct 0xe22f6200
 struct mm_struct {          
  mmap = 0x1,                
  mm_rb = {                  
    rb_node = 0xc2b00000    
  },                        
  mmap_cache = 0x2,          

Now to list all the vm_area_structs (vm_start and vm_end)
crash> tree -t rbtree -r mm_struct.mm_rb 0xedab1500 -o 16 -s vm_area_struct.vm_start,vm_end

The value passed to -o is the offset of the rb_node inside vm_area_struct, calculated as follows,
 crash> struct vm_area_struct.vm_rb  
struct vm_area_struct {              
  [16] struct rb_node vm_rb;        
}  

Monday, March 17, 2014

Crash Utility: How to view user space back trace


Step 1) Download gcore extension from,

Step 2) To build the module from the top-level crash- directory, enter:
  $ tar xzf crash-gcore-command-1.2.2.tar.gz
  $ mv crash-gcore-command-1.2.2/* extensions
  $ make extensions

Step 3) Load the gcore extension from the crash command prompt as follows,
crash> extend /projects/mobcom_andrwks_ext8/users/arunks/crashutility/src/crash_utility/extensions/gcore.so

Step 4) Use the gcore command to dump the user-space core file
            crash> gcore 1291
            where 1291 is the pid of the process. This creates a file core.1291.init

Step 5) Use gdb to view the backtrace. Pass the ELF file and the core dump file to gdb as follows
            $ arm-none-linux-gnueabi-gdb /projects/mobcom_andrwks_ext8/users/arunks/bringup_eos2b/android/out/target/product/amethyst/obj/EXECUTABLES/init_intermediates/LINKED/init core.1291.init

Step 6) View the backtrace with the bt command.
(gdb) bt
#0  umount2 () at bionic/libc/arch-arm/syscalls/umount2.S:10
#1  0x0000e0a0 in check_fs (blk_device=0x648b0 "/dev/block/platform/e6bd0000.mmcif/by-name/MODEMLOG", fs_type=0x64900 "ext4", target=0x648e8 "/mnt/modemlog") at system/core/fs_mgr/fs_mgr.c:512
#2  0x0000e7ea in fs_mgr_mount_all (fstab=0x645c0) at system/core/fs_mgr/fs_mgr.c:512
#3  0x0000873a in do_mount_all (nargs=Unhandled dwarf expression opcode 0xf3
) at bionic/libc/include/string.h:217
#4  0x0000992a in execute_one_command () at system/core/init/init.c:939
#5  0x00009e56 in main (argc=Unhandled dwarf expression opcode 0xf3
) at system/core/init/init.c:939

(gdb)

Friday, March 14, 2014

Setting up Tmux


For anyone else who wants to install tmux in their home folder (as a non-root user), here is what you need to do.

1) Download libevent and ncurses.
2) Compile them to $HOME/local (i.e. ./configure --prefix=$HOME/local, then make install)
3) Build and install tmux with the following:

cd tmux
./configure
CPPFLAGS="-I$HOME/local/include" LDFLAGS="-static -L$HOME/local/include -L$HOME/local/include/ncurses -L$HOME/local/lib" make
make install

4) Run ./tmux

My .tmux.conf file can be found here,
https://github.com/getarunks/linux_config/blob/master/.tmux.conf


Wednesday, March 12, 2014

Bug in Undefined Instruction Handler ARM

Problem Statement:

The multimedia team reported random user space crashes.
The issue was tough to reproduce: it needed a monkey test running on, say, 7 devices for around 5-10 hours.
They also pointed out a pattern: SP is shifted by 8 bytes in the tombstone.

Most of the time the crash was reproduced while returning from the bionic libc function strtoimax().
Investigating the objdump of strtoimax turned up a few instructions which might cause the corruption.
The most important of them were vpush {d8} and vpop {d8}. These are floating point (VFP) instructions.

To confirm that vpush and vpop were causing the issue, we added a new function called strtoimax_debug to bionic libc. This was a dummy function which does nothing but check for stack corruption. We added the suspected instructions to it,
such as stmdb sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}, vpush and vpop,
and checked for the expected SP; if it did not match, we generated an intentional data abort with a load from address 0 (ldr r0 [0]).

The result of this experiment confirmed that the issue is with the vpop instruction.

Next I did a bit of study on how VFP instructions are executed. By default the VFP engine is turned off during a context switch. When a process then executes a floating point instruction, an undefined instruction exception is generated.
The exception handler enables the VFP engine and jumps back to the same instruction which caused the exception.
The code looks okay from that perspective.

Tracing __und_usr (arch/arm/kernel/entry-armv.S) revealed that control left the exception handler without performing do_vfp, which is necessary to handle any VFP instruction when the engine is off.
This indicated that there is only one possibility for deviating from the path: an exception was triggered while executing it.

Looking at the code further, I saw that __und_usr reads the instruction which caused the undefined exception, and a comment says that this read can fault. Now the question is, why can it fault? One possible reason is that the code page might have been reclaimed by another core. A fixup handler is registered in the exception table for this case. This is not the normal exception table; what I am talking about here is the kernel's fixup exception table.
But the fixup handler was not correct. It was calling ret_from_exception, which returns to the next instruction. The problem is present in the latest kernel as well, but it is very rarely hit.

The fix is to return to the same instruction which caused the fault instead of the next instruction.

Some Notes:

The NEON/VFP register file is managed using lazy preserve (on UP systems) and
lazy restore (on both SMP and UP systems). This means that the register file is
kept 'live', and is only preserved and restored when multiple tasks are
contending for the NEON/VFP unit (or, in the SMP case, when a task migrates to
another core). Lazy restore is implemented by disabling the NEON/VFP unit after
every context switch, resulting in a trap when subsequently a NEON/VFP
instruction is issued, allowing the kernel to step in and perform the restore if
necessary.

Difference between NEON and VFP:
NEON does not support double-precision floating point.
NEON has no complex operations like square root and divide.

Managed to push this fix upstream, [Link to kernel.org]

Friday, February 21, 2014

Editing a Ramdisk


cp ramdisk.img ramdisk.cpio.gz
gzip -d ramdisk.cpio.gz
mkdir ramdisk
cd ramdisk
cp -Rfp ../ramdisk.cpio .
cpio -i -F ramdisk.cpio
rm ramdisk.cpio
ls should now show all the files in the ramdisk. From here on, add files to the ramdisk as you like.
find . | cpio -o -H newc | gzip > ../ramdisk.img

If you want an uncompressed ramdisk, remove gzip from the last step, i.e.
find . | cpio -o -H newc > ../ramdisk.img

Wednesday, February 19, 2014

Stack in ARM

Different types of Stack:

Descending and Ascending: The stack grows either downwards, starting at a higher address and progressing to lower ones (a descending stack), or upwards, starting at a lower address and progressing to higher ones (an ascending stack).

Empty or Full Stack:
The stack pointer can point either to the last item on the stack (a full stack) or to the next free space on the stack (an empty stack).



To make it easier for the programmer, stack-oriented suffixes can be used instead of the
increment or decrement and before or after suffixes.


For example:
STMFD sp!, {r0-r5} ; Push onto a Full Descending Stack
LDMFD sp!, {r0-r5} ; Pop from a Full Descending Stack

! signifies that the final address is written back to sp.

The Procedure Call Standard for the ARM Architecture (AAPCS), and ARM and Thumb C and C++ compilers always use a full descending stack. The PUSH and POP instructions assume a full descending stack.


Friday, February 7, 2014

ARM Linux does not use TTBR1

Russell King - ARM Linux linux@arm.linux.org.uk via vger.kernel.org

6/26/13
to Will, Yalin, linux-arch, linux-kernel, linux-arm-kern.
As I don't have the original mail (because it wasn't copied to the right
list) I can't reply to the original author, so I'll do it like this
instead.

On Wed, Jun 26, 2013 at 06:16:49PM +0100, Will Deacon wrote:
> [adding the ARM list -- please try and remember to do that in future]
>
> On Wed, Jun 26, 2013 at 03:41:40AM +0100, Wang, Yalin wrote:
> > Hi  Will,
>
> Hello,
>
> > I have a question about  arm pagetable setting in Linux .
> >
> > From armV6,  there is TTBR0 and TTBR1  translation base address registers  in mmu .
> > But I  found linux only use TTBR0 for translation base address ,
> > Could we use TTBR0 and TTBR1 to split user task and kernel pagetables (swapper_pg_dir)?
We don't use TTBR1 because the configurable page table splits between
TTBR0 and TTBR1 are not appropriate for Linux kernels.  The common
configuration is to have 3GB of userspace and 1GB of kernel space.

However, the TTBR splits supported are 2GB, 1GB, 512MB etc.  As I had
prior knowledge of ARMv6 before it was released, I raised this point
with ARM Ltd because I knew that it would not be appropriate for Linux.
Unfortunately, the response was basically that they didn't want to know.
So, as the hardware provided support mismatches what we want, we don't
use the feature.

It's as simple as that; had we been listened to and the architecture
altered to do what we required, then we'd be using it...

> > 1. Because we don’t need copy kernel first –level pagetables into every
> > User task’s pagetables and flush tlb (for example fork() a new process).
>
> Well, you still need the TLB maintenance for setting up CoW, so this win is
> probably not very big.
>
> > 2. And don’t need handle kernel page fault because that user task’s kernel
> > Pagetable when it is not set up , need copy again( for example vmalloc()  ioremap()  kmap()  will change
> > Kernel pagetables and need update to every task pagetables ) .
>
> Is that really a fastpath?
No it isn't, because for all of the above cases we're talking about copying
L1 page table entries, not the individual L2 page table entries between
threads.

Every page table above TASK_SIZE gets shared between processes, and once
it's been shared to a process, any new process forked from that gets its
own pointer to that 2nd level page table immediately.

So, during the initial boot there will be a number of the L1 copies, but
the system will stabilize and there will be no further L1 faulted copies
needed.

Wednesday, January 8, 2014

Linked List - C implementation

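#include <stdio.h>
#include <stdlib.h>

struct node {
  int value;
  struct node *next;
};

/* Append a new node holding value to the end of the list. */
void add_to_tail(struct node **head, int value)
{
  struct node *temp, *i;

  temp = malloc(sizeof(struct node));
  if (temp == NULL) {
    perror("malloc");
    exit(EXIT_FAILURE);
  }
  temp->value = value;
  temp->next = NULL;

  /* list empty */
  if (*head == NULL) {
    *head = temp;
    return;
  }

  /* find the tail node */
  for (i = *head; i->next != NULL; i = i->next)
    ;

  i->next = temp;
}

/* Print every value in the list, one per line. */
void print_list(struct node *head)
{
  struct node *i = head;

  while (i != NULL) {
    printf("%d\n", i->value);
    i = i->next;
  }
}

/* Reverse the list in place by flipping each next pointer. */
void reverse_list(struct node **head)
{
  struct node *prev, *curr, *next;

  prev = NULL;
  curr = *head;

  while (curr != NULL) {
    next = curr->next;
    curr->next = prev;
    prev = curr;
    curr = next;
  }

  *head = prev;
}

/* Release every node and leave the list empty. */
void free_list(struct node **head)
{
  struct node *next;

  while (*head != NULL) {
    next = (*head)->next;
    free(*head);
    *head = next;
  }
}

int main(void)
{
  struct node *head = NULL;

  add_to_tail(&head, 1);
  add_to_tail(&head, 2);
  add_to_tail(&head, 3);
  add_to_tail(&head, 4);

  print_list(head);

  reverse_list(&head);

  print_list(head);

  free_list(&head);
  return 0;
}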