Recently I've been working on improving my program's file-read performance, and these two functions looked worth exploiting. Since I read data through buffered IO APIs, I hoped the two gentlemen in the title could speed up my file access. Both Linux System Programming and the kernel source in mm/readahead.c confirm that, given a readahead hint, the kernel will pull the target file into the page cache to speed up subsequent reads. But is that really what happens in practice?
I ran a small experiment comparing three variants: buffered IO + posix_fadvise, buffered IO + readahead, and plain buffered IO.
[buffered IO+posix_fadvise]
real 0m0.240s
user 0m0.192s
sys 0m0.038s
[buffered IO+readahead]
real 0m0.238s
user 0m0.193s
sys 0m0.030s
[plain buffered IO]
real 0m0.261s
user 0m0.200s
sys 0m0.044s
Strange, they all look about the same? So I used strace to compare the execution flow of the readahead version against plain buffered IO.
[buffered IO]
13:54:36 brk(0) = 0x8c66000
13:54:36 brk(0x8c87000) = 0x8c87000
13:54:36 open("sample2.dat", O_RDONLY) = 3
13:54:36 fstat64(3, {st_mode=S_IFREG|0664, st_size=16888890, …}) = 0
13:54:36 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fe4000
13:54:36 read(3, "sample0=136\nsample1=136\nsample2="…, 4096) = 4096
13:54:36 close(3) = 0
13:54:36 munmap(0xb7fe4000, 4096) = 0
13:54:36 exit_group(0) = ?
[readahead]
14:02:38 open("sample3.dat", O_RDONLY) = 3
14:02:38 readahead(3, 16888890, 0) = 0
14:02:38 fcntl64(3, F_GETFL) = 0 (flags O_RDONLY)
14:02:38 brk(0) = 0x8a34000
14:02:38 brk(0x8a55000) = 0x8a55000
14:02:38 fstat64(3, {st_mode=S_IFREG|0664, st_size=16888890, …}) = 0
14:02:38 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f07000
14:02:38 _llseek(3, 0, [0], SEEK_CUR) = 0
14:02:38 read(3, "sample0=136\nsample1=136\nsample2="…, 4096) = 4096
14:02:38 close(3) = 0
14:02:38 munmap(0xb7f07000, 4096) = 0
14:02:38 exit_group(0) = ?
The readahead version even makes one extra _llseek call compared with plain buffered IO. At this point I was puzzled: what difference does any of this make to performance?
Digging into the kernel source, I found that the read-ahead is not performed "right away". Instead, the I/O scheduler decides, based on current cache usage, when to bring part of the file data into the cache ahead of time; and when pages need to be evicted, the kernel uses the information the user supplied (posix_fadvise, readahead, and how frequently the file is accessed) to decide whether to replace what is cached. So readahead and posix_fadvise are not take-one-dose, instant-cure APIs. Given these results, though, the read-ahead hint is still best issued as early as possible, since you cannot be sure when the kernel will actually get around to putting the data you want into the cache.