Friday, November 15, 2013

Ram Dump - Linux System

Ram Dump
=========
Used for forensic analysis of a crashed or frozen kernel/system.
As the name suggests, it is a dump of the complete RAM contents for further investigation.

Divided into three sections
===========================
The whole design is divided into three sections based on how ramdump is entered, transferred and stored.
1) Support in the kernel to enter ramdump mode when the kernel crashes or freezes.
2) Support in the boot-loader to detect ramdump mode, wait for a host tool to initiate the ramdump, and send the data to the host.
3) A host tool which can receive the data from the device and save it on the host PC file system.

Boot-loader Support
===================
During boot, the boot-loader has to decide whether to enter ramdump mode or to boot Linux normally. It does this by checking for a magic value in a predefined location (typically an SRAM register, or any other location whose value persists across a reboot or reset). There are two magic numbers, for example 0x12345678 (for the crash scenario) and 0x87654321 (for the freeze scenario). If the SRAM register holds either of the two magic numbers, the boot-loader enters ramdump mode and waits for the host-side application to initiate the upload. If the SRAM register holds neither magic number, the boot-loader proceeds with normal booting.
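
The boot-time decision can be sketched in C roughly as below. This is only an illustration, not actual boot-loader code: the enum `boot_mode` and the function `select_boot_mode` are made-up names, the magic values are the example values from the text, and the register value is passed in as a plain argument where a real boot-loader would read it from a platform-specific SRAM address.

```c
#include <stdint.h>

/* Example magic values from the text; real values are platform-specific. */
#define MAGIC_CRASH  0x12345678u
#define MAGIC_FREEZE 0x87654321u

/* Hypothetical names, used only for this illustration. */
enum boot_mode {
    BOOT_NORMAL,
    BOOT_RAMDUMP_CRASH,
    BOOT_RAMDUMP_FREEZE
};

/*
 * Decide the boot path from the value found in the persistent register.
 * A real boot-loader would read this value from a fixed SRAM address.
 */
static enum boot_mode select_boot_mode(uint32_t sram_magic)
{
    switch (sram_magic) {
    case MAGIC_CRASH:
        return BOOT_RAMDUMP_CRASH;
    case MAGIC_FREEZE:
        return BOOT_RAMDUMP_FREEZE;
    default:
        /* Any other value means there is nothing to dump. */
        return BOOT_NORMAL;
    }
}
```

Keeping the two ramdump modes distinct lets the boot-loader (or the host tool) record whether the dump came from a crash or from a freeze.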

Kernel Support
===============
The kernel has to handle the panic and freeze scenarios and manage the magic numbers written to the SRAM register.

Ramdump driver init
-----------------------
Write the magic number for the freeze scenario (MAGIC_FREEZE) to the SRAM register.
Register the reboot and panic notifiers with the kernel.

During a freeze
-----------------
The kernel cannot do much here because it is frozen. The user has to hold the power key for 10 seconds, after which the PMU resets the device. While booting back, the boot-loader sees MAGIC_FREEZE in the SRAM register (written during init) and enters ramdump mode.

During a crash
------------------
The kernel calls the panic notifier registered by the ramdump driver. The ramdump driver has to do a few things:
a) Invoke the cache sync APIs (flush_cache_all(), outer_flush_all()) to make sure the contents of RAM are coherent.
b) Save the core registers to RAM.
c) Save the MMU registers.
d) Write MAGIC_CRASH to the SRAM register.
Then it reboots. The boot-loader now finds MAGIC_CRASH in the SRAM register and enters ramdump mode.
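
The crash-path steps a) to d) can be modelled in plain C as below. This is a user-space sketch under several assumptions, not the driver itself: `ramdump_save_area`, `mmu_snapshot`, `ramdump_handle_panic` and `flush_caches_stub` are hypothetical names, the choice of MMU registers (SCTLR, TTBR0, TTBR1, TTBCR) is only an example for ARMv7, and the cache flush is a stub standing in for flush_cache_all() and outer_flush_all().

```c
#include <stdint.h>
#include <string.h>

#define MAGIC_CRASH   0x12345678u  /* example value from the text */
#define NUM_CORE_REGS 18           /* assumption: r0-r15, cpsr, orig_r0 */

/* Hypothetical MMU snapshot; which registers to keep is platform-specific. */
struct mmu_snapshot {
    uint32_t sctlr;
    uint32_t ttbr0;
    uint32_t ttbr1;
    uint32_t ttbcr;
};

/* Hypothetical save area at a RAM address known to the analysis tools. */
struct ramdump_save_area {
    uint32_t core_regs[NUM_CORE_REGS];
    struct mmu_snapshot mmu;
};

/* Stand-in for flush_cache_all() and outer_flush_all() in the kernel. */
static void flush_caches_stub(void)
{
}

/* Follows the order given in the text: flush, save registers, write magic. */
static void ramdump_handle_panic(struct ramdump_save_area *area,
                                 const uint32_t *core_regs,
                                 const struct mmu_snapshot *mmu,
                                 volatile uint32_t *sram_magic)
{
    flush_caches_stub();                                          /* step a */
    memcpy(area->core_regs, core_regs, sizeof(area->core_regs));  /* step b */
    area->mmu = *mmu;                                             /* step c */
    *sram_magic = MAGIC_CRASH;                                    /* step d */
    /* The real driver would trigger the reboot at this point. */
}
```

Writing the magic number last means the boot-loader only sees MAGIC_CRASH once the register context has been stored.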

During a normal reboot
-------------------------
The device should not enter ramdump mode in this case.
The kernel calls the reboot notifier registered by the ramdump driver. The ramdump driver clears MAGIC_FREEZE (written during init) and returns.
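
Taken together, the three kernel-side events only change what is stored in the SRAM register. A small model of that lifecycle, assuming a hypothetical `ramdump_on_event` function and assuming that "clearing" means writing zero (`MAGIC_NONE` is not a name from the original driver):

```c
#include <stdint.h>

#define MAGIC_CRASH  0x12345678u  /* example value from the text */
#define MAGIC_FREEZE 0x87654321u  /* example value from the text */
#define MAGIC_NONE   0x0u         /* assumption: "cleared" means zero */

/* Hypothetical event names for the three kernel-side hooks. */
enum ramdump_event {
    EV_DRIVER_INIT,  /* ramdump driver init */
    EV_PANIC,        /* panic notifier called */
    EV_REBOOT        /* reboot notifier called */
};

/* Update the persistent register according to the event. */
static void ramdump_on_event(volatile uint32_t *sram_magic,
                             enum ramdump_event ev)
{
    switch (ev) {
    case EV_DRIVER_INIT:
        /* Armed in advance: a frozen kernel cannot write anything later. */
        *sram_magic = MAGIC_FREEZE;
        break;
    case EV_PANIC:
        *sram_magic = MAGIC_CRASH;
        break;
    case EV_REBOOT:
        /* Clean reboot: make sure the boot-loader boots normally. */
        *sram_magic = MAGIC_NONE;
        break;
    }
}
```

If the device is reset without any further event after init, the register still holds MAGIC_FREEZE, which is exactly what the freeze scenario relies on.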

The pt_regs are saved in two places:
File: arch/arm/mm/fault.c calls the ramdump driver with pt_regs as the argument to save the context of the crashed CPU.
File: arch/arm/kernel/smp.c calls the ramdump driver with pt_regs from the IPI_CPU_STOP handler to save the context of the other CPUs.

Host side tool
==============
The host side tool receives the data and saves it to the file system. We used fastboot to upload the data.
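
The "receive and save" part of the host tool boils down to a chunked copy loop. The sketch below deliberately leaves out the fastboot transport and reads from a generic `FILE *` stream instead; `copy_dump` and `CHUNK_SIZE` are hypothetical names, and a real tool would replace the `fread` call with its USB receive routine.

```c
#include <stdio.h>

#define CHUNK_SIZE 4096  /* arbitrary chunk size for this sketch */

/*
 * Copy everything from `in` to `out` in fixed-size chunks.
 * Returns the number of bytes copied, or -1 on an I/O error.
 */
static long long copy_dump(FILE *in, FILE *out)
{
    unsigned char buf[CHUNK_SIZE];
    long long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;          /* short write: disk full or similar */
        total += (long long)n;
    }
    if (ferror(in))
        return -1;              /* read side failed before end of data */
    return total;
}
```

Copying in chunks keeps the memory use of the host tool constant no matter how large the device RAM is.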

Design considerations
=====================
After a crash, the device reboots and the boot-loader is loaded into RAM again, which corrupts part of the RAM contents. This cannot be avoided unless the RAM upload is implemented in the kernel itself. Another approach is to load the boot-loader to a location whose contents are of little importance, for example the framebuffer, memory used by the video subsystem, or some other carveout memory in the system.

U-Boot relocates itself to the end of RAM. This overwrites the memory contents there, so it has to be taken care of.
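
When choosing a load (or relocation) address for the boot-loader, a simple range check can confirm that the image stays entirely inside the carveout that is allowed to be overwritten. A minimal sketch, with `mem_region` and `region_contains` as hypothetical names; actual addresses and sizes come from the platform memory map.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a physical memory range. */
struct mem_region {
    uint64_t base;
    uint64_t size;
};

/* True if `inner` lies entirely within `outer`. */
static bool region_contains(struct mem_region outer, struct mem_region inner)
{
    if (inner.size > outer.size)
        return false;
    if (inner.base < outer.base)
        return false;
    /* Compared as offsets so that base + size can never overflow. */
    return inner.base - outer.base <= outer.size - inner.size;
}
```

For example, with a made-up 8 MiB framebuffer carveout at 0x9F000000, a 512 KiB boot-loader image loaded at the start of the carveout passes the check, while the same image placed 256 KiB before the end of the carveout does not.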

Analyzing ramdump
=================
A ramdump can be analysed using the T32 simulator or the amazing crash utility.
You need the corresponding vmlinux.
