My workload is a 16-disk RAID5 array, written to through our filesystem
in direct-IO mode. Using blktrace I captured the following events:
8,16 0 6647 2.453665504 2579 M W 7493152 + 8 [md0_raid5]
8,16 0 6648 2.453672411 2579 Q W 7493160 + 8 [md0_raid5]
8,16 0 6649 2.453672606 2579 M W 7493160 + 8 [md0_raid5]
8,16 0 6650 2.453679255 2579 Q W 7493168 + 8 [md0_raid5]
8,16 0 6651 2.453679441 2579 M W 7493168 + 8 [md0_raid5]
8,16 0 6652 2.453685948 2579 Q W 7493176 + 8 [md0_raid5]
8,16 0 6653 2.453686149 2579 M W 7493176 + 8 [md0_raid5]
8,16 0 6654 2.453693074 2579 Q W 7493184 + 8 [md0_raid5]
8,16 0 6655 2.453693254 2579 M W 7493184 + 8 [md0_raid5]
8,16 0 6656 2.453704290 2579 Q W 7493192 + 8 [md0_raid5]
8,16 0 6657 2.453704482 2579 M W 7493192 + 8 [md0_raid5]
8,16 0 6658 2.453715016 2579 Q W 7493200 + 8 [md0_raid5]
8,16 0 6659 2.453715247 2579 M W 7493200 + 8 [md0_raid5]
8,16 0 6660 2.453721730 2579 Q W 7493208 + 8 [md0_raid5]
8,16 0 6661 2.453721974 2579 M W 7493208 + 8 [md0_raid5]
8,16 0 6662 2.453728202 2579 Q W 7493216 + 8 [md0_raid5]
8,16 0 6663 2.453728436 2579 M W 7493216 + 8 [md0_raid5]
8,16 0 6664 2.453734782 2579 Q W 7493224 + 8 [md0_raid5]
8,16 0 6665 2.453735019 2579 M W 7493224 + 8 [md0_raid5]
8,16 0 6666 2.453741401 2579 Q W 7493232 + 8 [md0_raid5]
8,16 0 6667 2.453741632 2579 M W 7493232 + 8 [md0_raid5]
8,16 0 6668 2.453748148 2579 Q W 7493240 + 8 [md0_raid5]
8,16 0 6669 2.453748386 2579 M W 7493240 + 8 [md0_raid5]
8,16 0 6670 2.453851843 2579 I W 7493144 + 104 [md0_raid5]
8,16 0    0 2.453853661    0 m N cfq2579 insert_request
8,16 0 6671 2.453854064 2579 I W 7493120 + 24 [md0_raid5]
8,16 0    0 2.453854439    0 m N cfq2579 insert_request
8,16 0 6672 2.453854793 2579 U N [md0_raid5] 2
8,16 0    0 2.453855513    0 m N cfq2579 Not idling.st->count:1
8,16 0    0 2.453855927    0 m N cfq2579 dispatch_insert
8,16 0    0 2.453861771    0 m N cfq2579 dispatched a request
8,16 0    0 2.453862248    0 m N cfq2579 activate rq,drv=1
8,16 0 6673 2.453862332 2579 D W 7493120 + 24 [md0_raid5]
8,16 0    0 2.453865957    0 m N cfq2579 Not idling.st->count:1
8,16 0    0 2.453866269    0 m N cfq2579 dispatch_insert
8,16 0    0 2.453866707    0 m N cfq2579 dispatched a request
8,16 0    0 2.453867061    0 m N cfq2579 activate rq,drv=2
8,16 0 6674 2.453867145 2579 D W 7493144 + 104 [md0_raid5]
8,16 0 6675 2.454147608    0 C W 7493120 + 24 [0]
8,16 0    0 2.454149357    0 m N cfq2579 complete rqnoidle 0
8,16 0 6676 2.454791505    0 C W 7493144 + 104 [0]
8,16 0    0 2.454794803    0 m N cfq2579 complete rqnoidle 0
8,16 0    0 2.454795160    0 m N cfq schedule dispatch
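(A trace like this can be captured with the standard blktrace/blkparse
pipeline, e.g.

	blktrace -d /dev/sdb -o - | blkparse -i -

where /dev/sdb stands for the member disk being traced; device 8,16
above is sdb.)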
From the above events we can see that rq[W 7493144 + 104] and
rq[W 7493120 + 24] are never merged (Q = bio queued, M = back merge,
G = get request, I = request inserted, D = dispatched, C = completed).
That is because the bios arrived in this order:
8,16 0 6638 2.453619407 2579 Q W 7493144 + 8 [md0_raid5]
8,16 0 6639 2.453620460 2579 G W 7493144 + 8 [md0_raid5]
8,16 0 6640 2.453639311 2579 Q W 7493120 + 8 [md0_raid5]
8,16 0 6641 2.453639842 2579 G W 7493120 + 8 [md0_raid5]
bio(7493144) was queued first and bio(7493120) later, so the subsequent
bios were split between two requests. When the plug list is flushed,
elv_attempt_insert_merge() only supports back merges, not front merges,
so rq[7493120 + 24] cannot be merged with rq[7493144 + 104], even though
the two are contiguous (7493120 + 24 = 7493144).
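For reference, elv_attempt_insert_merge() at this point looks roughly
like the sketch below (simplified from block/elevator.c; illustrative
rather than the exact source). The request hash is keyed on each
request's end sector, so the lookup can only discover a request that the
new one extends at its tail, i.e. a back merge:

	bool elv_attempt_insert_merge(struct request_queue *q,
				      struct request *rq)
	{
		struct request *__rq;

		if (blk_queue_nomerges(q))
			return false;

		/* One-hit cache: try the last request we merged into. */
		if (q->last_merge &&
		    blk_attempt_req_merge(q, q->last_merge, rq))
			return true;

		if (blk_queue_noxmerges(q))
			return false;

		/*
		 * The hash is keyed on end sectors, so looking up rq's
		 * start sector can only find a request that ends where rq
		 * begins: a back-merge candidate.  Nothing ever searches
		 * for a request that begins where rq ends, so
		 * rq[7493120 + 24] never finds rq[7493144 + 104] ahead
		 * of it.
		 */
		__rq = elv_rqhash_find(q, blk_rq_pos(rq));
		if (__rq && blk_attempt_req_merge(q, __rq, rq))
			return true;

		return false;
	}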
In my tests this pattern accounts for about 25% of requests on our
system. With this patch it no longer occurs.
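The direction of the fix can be sketched as follows: since
blk_flush_plug_list() already sorts the plugged requests with
list_sort(), breaking ties within the same queue by start sector
ensures the lower-sector request is inserted first, after which the
existing back-merge path suffices. A minimal sketch of such a
comparator (illustrative, not necessarily the exact diff):

	static int plug_rq_cmp(void *priv, struct list_head *a,
			       struct list_head *b)
	{
		struct request *rqa = container_of(a, struct request,
						   queuelist);
		struct request *rqb = container_of(b, struct request,
						   queuelist);

		/*
		 * Sort by queue first, then by start sector within a
		 * queue, so that rq[7493120 + 24] is inserted before
		 * rq[7493144 + 104].
		 */
		return !(rqa->q < rqb->q ||
			(rqa->q == rqb->q &&
			 blk_rq_pos(rqa) < blk_rq_pos(rqb)));
	}

With the list ordered this way, the would-be front merge is turned into
a back merge that elv_attempt_insert_merge() already handles.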
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>