Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II
Robert Schlabbach
2017-07-30 19:53:04 UTC
Package: linux-image-4.11.0-0.bpo.1-marvell
Version: 4.11.6-1~bpo9+1: armel

I had Debian Stretch installed on my QNAP TS-219P II and had already upgraded to the backports kernel linux-image-4.9.0-0.bpo.3-marvell (4.9.30-2+deb9u2~bpo8+1: armel). Upgrading from that to the package specified above left the device unable to boot. The full output of the serial console:

__ __ _ _
| \/ | __ _ _ ____ _____| | |
| |\/| |/ _` | '__\ \ / / _ \ | |
| | | | (_| | | \ V / __/ | |
|_| |_|\__,_|_| \_/ \___|_|_|
_ _ ____ _
| | | | | __ ) ___ ___ | |_
| | | |___| _ \ / _ \ / _ \| __|
| |_| |___| |_) | (_) | (_) | |_
\___/ |____/ \___/ \___/ \__| ** LOADER **
** MARVELL BOARD: DB-88F6282A-BP LE TS-219P2+ ,PHY=1.8v

U-Boot 1.1.4 (Jan 3 2012 - 14:49:37) Marvell version: 3.5.3

U-Boot code: 00600000 -> 0067FFF0 BSS: -> 006CD5C0

Soc: MV88F6282 Rev 1CPU running @ 2000Mhz L2 running @ 500Mhz
SysClock = 500Mhz , TClock = 200Mhz

DRAM (DDR3) CAS Latency = 7 tRP = 7 tRAS = 20 tRCD=7
DRAM CS[0] base 0x00000000 size 256MB
DRAM CS[1] base 0x10000000 size 256MB
DRAM Total size 512MB 16bit width
Addresses 8M - 0M are saved for the U-Boot usage.
Mem malloc Initialization (8M - 7M): Done
[***@f8000000] Flash: 16 MB

CPU : Marvell Feroceon (Rev 1)
USB 0: host mode
PEX 0: PCI Express Root Complex Interface
PEX interface detected Link X1
PEX 1: PCI Express Root Complex Interface
PEX interface detected Link X1

Reset IDE:
Marvell Serial ATA Adapter
Integrated Sata device found

Net: egiga0 [PRIME]
Hit any key to stop autoboot: 0
QNAP: Recovery Button pressed: 0
Marvell>> boot
Send Cmd : 0x68 to UART1
## Booting image at 00800000 ...
Image Name: kernel 4.11.0-0.bpo.1-marvell
Created: 2017-07-30 18:55:56 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2076472 Bytes = 2 MB
Load Address: 00008000
Entry Point: 00008000
Verifying Checksum ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.



That's it, no further output, no boot. The kernel seems to die early on.

I've observed such early failures on another platform when a "bad" device tree binary was flashed. Is it possible that kernel 4.11 requires a changed dtb, and that was not correctly upgraded?
Robert Schlabbach
2017-07-31 15:44:37 UTC
I just noticed the lines

[    0.267305] Unpacking initramfs...
[    0.270704] Initramfs unpacking failed: junk in compressed archive
[    0.304492] Freeing initrd memory: 9216K

in the output above, so I tried using the same initrd.img from flash with kernel 4.9 and kernel 4.11:

Marvell>> tftpboot 0x800000 C0A80802.img-4.9
[...]
Marvell>> cp.l 0xf8400000 0xa00000 0x240000
Marvell>> setenv bootargs earlycon console=ttyS0,115200 root=/dev/ram initrd=0xa00000,0x900000 ramdisk=34816 coherent_pool=1M
Marvell>> bootm 0x800000
## Booting image at 00800000 ...
Image Name: kernel 4.9.0-3-marvell
Created: 2017-07-30 22:50:09 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2056688 Bytes = 2 MB
Load Address: 00008000
Entry Point: 00008000
Verifying Checksum ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.9.0-3-marvell (debian-***@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 Debian 4.9.30-2+deb9u2 (2017-06-26)
[...]
[ 0.264506] Unpacking initramfs...
[ 0.596114] Freeing initrd memory: 9216K (c0a00000 - c1300000)
[...]
(initramfs) ls -al /lib/modules/
total 0
drwxr-xr-x 3 0 0 0 Jul 30 18:55 .
drwxr-xr-x 7 0 0 0 Jul 30 18:55 ..
drwxr-xr-x 3 0 0 0 Jul 30 18:55 4.11.0-0.bpo.1-marvell
(initramfs)

------------------------------------------------------------------------

Marvell>> tftpboot 0x800000 C0A80802.img-4.11-bpo
[...]
Marvell>> cp.l 0xf8400000 0xa00000 0x240000
Marvell>> setenv bootargs earlycon console=ttyS0,115200 root=/dev/ram initrd=0xa00000,0x900000 ramdisk=34816 coherent_pool=1M
Marvell>> bootm 0x800000
## Booting image at 00800000 ...
Image Name: kernel 4.11.0-0.bpo.1-marvell
Created: 2017-07-30 23:17:11 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2076472 Bytes = 2 MB
Load Address: 00008000
Entry Point: 00008000
Verifying Checksum ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.11.0-0.bpo.1-marvell (debian-***@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 Debian 4.11.6-1~bpo9+1 (2017-07-09)
[...]
[ 0.267305] Unpacking initramfs...
[ 0.270704] Initramfs unpacking failed: junk in compressed archive
[ 0.304492] Freeing initrd memory: 9216K

------------------------------------------------------------------------

So it turns out that kernel 4.9 is able to unpack the initrd.img for kernel 4.11, while kernel 4.11 can NOT unpack the same initrd.img.

Has the compression/format of initrd.img changed between kernels 4.9 and 4.11? Do the initramfs-tools need to be updated to create valid initrd.imgs for kernel 4.11? Maybe there was "only" a dependency missing when upgrading to kernel 4.11?

How would I convert the "bad" initrd.img to the proper format that kernel 4.11 can decompress...?
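
One way to approach the format question is to look at the first bytes of the initrd.img, since the usual compressed containers each start with a distinctive magic number. The following is only a minimal sketch and not part of the original report: the function names are hypothetical, and the table lists just a few well-known signatures.

```python
# Magic numbers found at the start of common initramfs containers.
# The list is illustrative, not exhaustive.
MAGICS = [
    (b"\x1f\x8b", "gzip"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"BZh", "bzip2"),
    (b"\x89LZO\x00\r\n\x1a\n", "lzop"),
    (b"\x02\x21\x4c\x18", "lz4 (legacy frame)"),
    (b"070701", "uncompressed cpio (newc)"),
]


def identify(data: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for magic, name in MAGICS:
        if data.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first bytes of a file (hypothetical path) and identify them."""
    with open(path, "rb") as f:
        return identify(f.read(16))
```

If the same file is reported as, say, gzip and an older kernel unpacks it without complaint, a format change becomes a less likely explanation than the data being damaged in memory before it is unpacked.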
Robert Schlabbach
2017-07-31 10:00:43 UTC
I was able to get debug output from the failed kernel boot by adding the "earlycon" bootarg:

Marvell>> setenv bootargs earlycon console=ttyS0,115200 root=/dev/ram initrd=0xa00000,0x900000 ramdisk=34816 coherent_pool=1M
Marvell>> boot
Send Cmd : 0x68 to UART1
## Booting image at 00800000 ...
Image Name: kernel 4.11.0-0.bpo.1-marvell
Created: 2017-07-30 18:55:56 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2076472 Bytes = 2 MB
Load Address: 00008000
Entry Point: 00008000
Verifying Checksum ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.11.0-0.bpo.1-marvell (debian-***@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 Debian 4.11.6-1~bpo9+1 (2017-07-09)
[ 0.000000] CPU: Feroceon 88FR131 [56251311] revision 1 (ARMv5TE), cr=0005397f
[ 0.000000] CPU: VIVT data cache, VIVT instruction cache
[ 0.000000] OF: fdt: Machine model: QNAP TS219 family
[ 0.000000] earlycon: ns16550a0 at MMIO 0xf1012000 (options '')
[ 0.000000] bootconsole [ns16550a0] enabled
[ 0.000000] Memory policy: Data cache writeback
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 130048
[ 0.000000] Kernel command line: earlycon console=ttyS0,115200 root=/dev/ram initrd=0xa00000,0x900000 ramdisk=34816 coherent_pool=1M
[ 0.000000] PID hash table entries: 2048 (order: 1, 8192 bytes)
[ 0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[ 0.000000] Memory: 502648K/524288K available (4096K kernel code, 398K rwdata, 1132K rodata, 1024K init, 248K bss, 21640K reserved, 0K cma-reserved, 0K highmem)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] vector : 0xffff0000 - 0xffff1000 ( 4 kB)
[ 0.000000] fixmap : 0xffc00000 - 0xfff00000 (3072 kB)
[ 0.000000] vmalloc : 0xe0800000 - 0xff800000 ( 496 MB)
[ 0.000000] lowmem : 0xc0000000 - 0xe0000000 ( 512 MB)
[ 0.000000] pkmap : 0xbfe00000 - 0xc0000000 ( 2 MB)
[ 0.000000] modules : 0xbf000000 - 0xbfe00000 ( 14 MB)
[ 0.000000] .text : 0xc0008000 - 0xc0500000 (5088 kB)
[ 0.000000] .init : 0xc0700000 - 0xc0800000 (1024 kB)
[ 0.000000] .data : 0xc0800000 - 0xc0863940 ( 399 kB)
[ 0.000000] .bss : 0xc0863940 - 0xc08a1a20 ( 249 kB)
[ 0.000000] NR_IRQS:16 nr_irqs:16 16
[ 0.000000] clocksource: orion_clocksource: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 9556302233 ns
[ 0.000006] sched_clock: 32 bits at 200MHz, resolution 5ns, wraps every 10737418237ns
[ 0.008120] Console: colour dummy device 80x30
[ 0.012582] Calibrating delay loop... 1974.27 BogoMIPS (lpj=3948544)
[ 0.034147] pid_max: default: 32768 minimum: 301
[ 0.038878] Security Framework initialized
[ 0.042982] Yama: disabled by default; enable with sysctl kernel.yama.*
[ 0.049673] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.056280] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.063979] CPU: Testing write buffer coherency: ok
[ 0.068919] ftrace: allocating 17111 entries in 34 pages
[ 0.096090] Setting up static identity map for 0x100000 - 0x10003c
[ 0.102442] mvebu-soc-id: MVEBU SoC ID=0x6282, Rev=0x1
[ 0.109175] devtmpfs: initialized
[ 0.114763] VFP support v0.3: not present
[ 0.118905] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.128630] futex hash table entries: 256 (order: -1, 3072 bytes)
[ 0.134879] pinctrl core: initialized pinctrl subsystem
[ 0.140845] NET: Registered protocol family 16
[ 0.146052] DMA: preallocated 1024 KiB pool for atomic coherent allocations
[ 0.153770] cpuidle: using governor ladder
[ 0.157881] cpuidle: using governor menu
[ 0.162020] Feroceon L2: Enabling L2
[ 0.165627] Feroceon L2: Cache support initialised.
[ 0.170662] [Firmware Info]: /***@f1000000/ethernet-***@72000/ethernet0-***@0: local-mac-address is not set
[ 0.183533] No ATAGs?
[ 0.186478] clocksource: Switched to clocksource orion_clocksource
[ 0.209676] VFS: Disk quotas dquot_6.6.0
[ 0.213668] VFS: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
[ 0.225645] NET: Registered protocol family 2
[ 0.230581] TCP established hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.237651] TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.244118] TCP: Hash tables configured (established 4096 bind 4096)
[ 0.250559] UDP hash table entries: 256 (order: 0, 4096 bytes)
[ 0.256390] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[ 0.262750] NET: Registered protocol family 1
[ 0.267280] Unpacking initramfs...
[ 0.270680] Initramfs unpacking failed: junk in compressed archive
[ 0.304469] Freeing initrd memory: 9216K
[ 0.308791] audit: initializing netlink subsys (disabled)
[ 0.314507] audit: type=2000 audit(0.264:1): state=initialized audit_enabled=0 res=1
[ 0.322241] workingset: timestamp_bits=30 max_order=17 bucket_order=0
[ 0.328728] zbud: loaded
[ 0.332709] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[ 0.340185] io scheduler noop registered
[ 0.344152] io scheduler cfq registered (default)
[ 0.349547] Serial: 8250/16550 driver, 2 ports, IRQ sharing disabled
[ 0.356757] libphy: Fixed MDIO Bus: probed
[ 0.360995] i2c /dev entries driver
[ 0.364729] kirkwood-cpufreq kirkwood-cpufreq: Unable to get cpuclk
[ 0.370996] kirkwood-cpufreq: probe of kirkwood-cpufreq failed with error -22
[ 0.378254] ledtrig-cpu: registered to indicate activity on CPUs
[ 0.384848] registered taskstats version 1
[ 0.388973] zswap: loaded using pool lzo/zbud
[ 0.393986] hctosys: unable to open rtc device (rtc0)
[ 1.106495] random: fast init done
[ 3.087516] Unable to handle kernel paging request at virtual address e0000004
[ 3.094706] pgd = c0004000
[ 3.097391] [e0000004] *pgd=00000000
[ 3.100948] Internal error: Oops: 5 [#1] ARM
[ 3.105194] Modules linked in:
[ 3.108230] CPU: 0 PID: 1 Comm: swapper Not tainted 4.11.0-0.bpo.1-marvell #1 Debian 4.11.6-1~bpo9+1
[ 3.117307] Hardware name: Marvell Kirkwood (Flattened Device Tree)
[ 3.123537] task: df43f6a0 task.stack: df4a0000
[ 3.128043] PC is at crc32_be+0xac/0x160
[ 3.131943] LR is at 0xe0000000
[ 3.135064] pc : [<c03149e0>] lr : [<e0000000>] psr: 80000053
[ 3.135064] sp : df4a1ee0 ip : 00000007 fp : 00000000
[ 3.146476] r10: c0730838 r9 : 00000000 r8 : c0863940
[ 3.151669] r7 : 00000000 r6 : bfafa168 r5 : b2d5e10f r4 : c0a5c3b0
[ 3.158158] r3 : c052b470 r2 : b184c190 r1 : f0defde0 r0 : 0827764a
[ 3.164647] Flags: Nzcv IRQs on FIQs off Mode SVC_32 ISA ARM Segment none
[ 3.171826] Control: 0005397f Table: 00004000 DAC: 00000053
[ 3.177537] Process swapper (pid: 1, stack limit = 0xdf4a0190)
[ 3.183334] Stack: (0xdf4a1ee0 to 0xdf4a2000)
[ 3.187667] 1ee0: c0a5c3b0 f0defde7 c089db70 c0730834 c0863940 c0729144 c0729114 c080302c
[ 3.195806] 1f00: ffffe000 c0101868 c0501d14 c080302c 00000011 c0863900 00000000 c0132ae0
[ 3.203936] 1f20: dfffee53 dfffee61 0000003a c0132bec c088a0d0 c01424f0 c061a0a4 0000003a
[ 3.212065] 1f40: 00000007 00000007 c061a52c c061a52c 00000000 6692af0e 0000003a 00000008
[ 3.220196] 1f60: 0000003a c0863940 c0730834 c0863940 c0749f88 c0700e84 00000007 00000007
[ 3.228326] 1f80: 00000000 c07005b8 00000000 c04bb468 00000000 00000000 00000000 00000000
[ 3.236456] 1fa0: 00000000 c04bb478 00000000 c0107610 00000000 00000000 00000000 00000000
[ 3.244586] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 3.252716] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000002
[ 3.260860] [<c03149e0>] (crc32_be) from [<c0729144>] (of_fdt_raw_init+0x30/0x78)
[ 3.268308] [<c0729144>] (of_fdt_raw_init) from [<c0101868>] (do_one_initcall+0x14c/0x180)
[ 3.276534] [<c0101868>] (do_one_initcall) from [<c0700e84>] (kernel_init_freeable+0x1c8/0x20c)
[ 3.285188] [<c0700e84>] (kernel_init_freeable) from [<c04bb478>] (kernel_init+0x10/0x10c)
[ 3.293414] [<c04bb478>] (kernel_init) from [<c0107610>] (ret_from_fork+0x14/0x24)
[ 3.300945] Code: e0230420 1afffff7 ebff8806 e8bd81f0 (e59e6004)
[ 3.307002] ---[ end trace 6cbbaa91fcb8bad3 ]---
[ 3.311688] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 3.311688]
[ 3.320775] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 3.320775]
[ 53.658504] random: crng init done

The error appears to be:

[ 3.087516] Unable to handle kernel paging request at virtual address e0000004
[ 3.094706] pgd = c0004000
[ 3.097391] [e0000004] *pgd=00000000
[ 3.100948] Internal error: Oops: 5 [#1] ARM
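
As a rough cross-check, the faulting address can be compared with the "Virtual kernel memory layout" printed earlier in the same log. The sketch below uses only numbers copied from that log; the variable and function names are hypothetical. It suggests, without proving, that crc32_be (called from of_fdt_raw_init according to the backtrace) read just past the end of the mapped lowmem region.

```python
# Values copied from the 4.11 boot log above.
LOWMEM_START = 0xC0000000
LOWMEM_END = 0xE0000000   # exclusive upper bound of "lowmem"
FAULT_ADDR = 0xE0000004   # "Unable to handle kernel paging request"


def in_lowmem(addr: int) -> bool:
    """True if addr lies inside the linearly mapped lowmem region."""
    return LOWMEM_START <= addr < LOWMEM_END


lowmem_mib = (LOWMEM_END - LOWMEM_START) >> 20
offset_past_end = FAULT_ADDR - LOWMEM_END

print(f"lowmem size: {lowmem_mib} MiB")
print(f"fault inside lowmem: {in_lowmem(FAULT_ADDR)}")
print(f"bytes past end of lowmem: {offset_past_end}")
```

A read that runs off the end of lowmem like this would fit a checksum loop working with a bogus length, but the log alone does not say where such a length would have come from.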
Robert Schlabbach
2017-07-31 00:43:41 UTC
I made a little mistake in the description above: The kernel I had on my QNAP TS-219P II before was linux-image-4.9.0-3-marvell_4.9.30-2+deb9u2 - and this appears to be the last version that boots.

I've tried loading mkimage'd kernels+dtbs via tftpd into my broken QNAP TS-219P II from these packages and _none_ of them boot:

linux-image-4.11.0-0.bpo.1-marvell_4.11.6-1~bpo9+1_armel.deb
linux-image-4.11.0-1-marvell_4.11.6-1_armel.deb
linux-image-4.11.0-2-marvell_4.11.11-1+b1_armel.deb

When I follow the same procedure using the kernel+dtb from linux-image-4.9.0-3-marvell_4.9.30-2+deb9u2_armel.deb, it does boot. Not fully, since the flashed initramfs does not match the kernel version, but at least it doesn't die immediately like all the 4.11 kernel builds do.
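
To double-check an image produced by mkimage before sending it over TFTP, the 64-byte legacy uImage header can be decoded on the host. It holds the same fields U-Boot prints at boot (image name, creation time, data size, load address, entry point). This is a minimal sketch under the assumption that the image uses the legacy header format; the function name is hypothetical and the header CRC is not verified.

```python
import struct
from datetime import datetime, timezone

UIMAGE_MAGIC = 0x27051956
# Big-endian: 7 x uint32, 4 x uint8, 32-byte name -> 64 bytes total.
HEADER = struct.Struct(">7I4B32s")


def parse_uimage_header(blob: bytes) -> dict:
    """Decode a legacy uImage header from the first 64 bytes of blob."""
    if len(blob) < HEADER.size:
        raise ValueError("need at least 64 bytes")
    (magic, _hcrc, timestamp, size, load, entry, _dcrc,
     _os, _arch, _type, _comp, name) = HEADER.unpack_from(blob)
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage header")
    return {
        "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        "created": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "size": size,
        "load": load,
        "entry": entry,
    }
```

Feeding it the first 64 bytes of an image file should give values that match the "## Booting image at ..." lines U-Boot prints for that image.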
Robert Schlabbach
2017-07-31 16:10:08 UTC
Ok, I figured it out. I noticed that the 4.11 kernel has a more "generous" memory layout than the 4.9 one:

kernel 4.9:

[ 0.000000] Memory: 504492K/524288K available (3777K kernel code, 371K rwdata, 1128K rodata, 296K init, 247K bss, 19796K reserved, 0K cma-reserved, 0K highmem)

kernel 4.11:

[ 0.000000] Memory: 502648K/524288K available (4096K kernel code, 398K rwdata, 1132K rodata, 1024K init, 248K bss, 21640K reserved, 0K cma-reserved, 0K highmem)

So I suspected that the 4.11 kernel might be overwriting/corrupting the initrd.img provided in memory before it gets to unpack it, and changed the memory location from 0xa00000 to 0xc00000:

Marvell>> tftpboot 0x800000 C0A80802.img-4.11-bpo
[...]
Marvell>> cp.l 0xf8400000 0xc00000 0x240000
Marvell>> setenv bootargs earlycon console=ttyS0,115200 root=/dev/ram initrd=0xc00000,0x900000 ramdisk=34816 coherent_pool=1M
Marvell>> bootm 0x800000
## Booting image at 00800000 ...
Image Name: kernel 4.11.0-0.bpo.1-marvell
Created: 2017-07-30 23:17:11 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2076472 Bytes = 2 MB
Load Address: 00008000
Entry Point: 00008000
Verifying Checksum ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.11.0-0.bpo.1-marvell (debian-***@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 Debian 4.11.6-1~bpo9+1 (2017-07-09)
[...]
[ 0.267272] Unpacking initramfs...
[ 0.597766] Freeing initrd memory: 9216K
[...]
Welcome to Debian GNU/Linux 9 (stretch)!

Voila! It's finally booting!

So, was the 4.11 kernel compiled/linked with a wrong alignment padding setting? Or should the bootloader environment be changed to permanently use the higher address for passing initrd.img to the kernel?
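
The addresses involved can be laid out side by side with a little arithmetic. The sketch below uses only values from the logs above, plus two labeled assumptions: that the image loaded at 0x800000 consists of a 64-byte uImage header followed by the data, and that physical address = virtual address - 0xc0000000 (RAM starting at physical 0, as the "DRAM CS[0] base 0x00000000" line suggests). The helper names are hypothetical.

```python
def span(start: int, size: int) -> tuple:
    """Half-open address range [start, start + size)."""
    return (start, start + size)


def overlaps(a: tuple, b: tuple) -> bool:
    """True if two half-open ranges share at least one byte."""
    return a[0] < b[1] and b[0] < a[1]


UIMAGE_HEADER = 64          # assumption: legacy uImage header precedes the data
PAGE_OFFSET = 0xC0000000    # assumption: virt = phys + PAGE_OFFSET

image_411 = span(0x800000, UIMAGE_HEADER + 2076472)   # tftpboot target + Data Size
initrd_old = span(0xA00000, 0x900000)                 # initrd=0xa00000,0x900000
initrd_new = span(0xC00000, 0x900000)                 # initrd=0xc00000,0x900000
kernel_411 = (0x8000, 0xC08A1A20 - PAGE_OFFSET)       # load address .. end of .bss

# cp.l counts 32-bit words, so 0x240000 words cover the whole 0x900000 bytes.
assert 0x240000 * 4 == 0x900000

for name, rng in [("loaded image", image_411), ("running kernel", kernel_411)]:
    print(f"{name}: {rng[0]:#x}-{rng[1]:#x}, "
          f"overlaps old initrd: {overlaps(rng, initrd_old)}")
print(f"gap between loaded image and old initrd: {initrd_old[0] - image_411[1]} bytes")
```

With these numbers neither the loaded image nor the final kernel layout overlaps the initrd at 0xa00000, although the loaded image ends only about 20 KB below it. If the initrd is indeed being overwritten, that would have to happen while the kernel decompresses or relocates itself, which this sketch does not model.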