
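My experience with Valgrind memcheck, a memory leak detection tool for C/C++ (asking the experts to clear up my confusion and find the real culprit behind the leaks)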


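1. The most frequent and most basic mistake: mismatched use of malloc/new/new[] and free/delete/delete[]

This mistake came mainly from my unfamiliarity with how new/new[] and delete/delete[] work in C++. I released everything allocated with new or new[] using plain delete, and some variables allocated with malloc ended up being released with delete. Allocation and release were mismatched in many places, and a lot of memory was never freed. To make maintenance easier, I later switched to new/new[] and delete/delete[] everywhere and dropped C's malloc and free.

If we divide the types a user can new into built-in types and user-defined types, the following operations should be familiar to everyone and cause no problems.

(1) Built-in types

One-level pointer:

    // allocate
    int *d = new int[5];

    // release
    delete[] d;

Two-level pointer:

    // allocate
    int **d = new int*[5];
    for (int i = 0; i < 5; i++)
        d[i] = new int[10];

    // release
    for (int i = 0; i < 5; i++)
        delete[] d[i];
    delete[] d;

(2) User-defined types

Take a type like this:

    class DFA {
        bool is_mark;
        char *s;

    public:
        ~DFA() { printf("delete it.\n"); }
    };

One-level pointer:

    DFA *d = new DFA();
    delete d;

Two-level pointer:

    // allocate
    DFA **d = new DFA*[5];
    for (int i = 0; i < 5; i++)
        d[i] = new DFA();

    // release
    for (int i = 0; i < 5; i++)
        delete d[i];
    delete[] d;

None of this causes problems, because new/delete and new[]/delete[] are used in matching pairs, and it all passes cleanly under Valgrind. But why do they have to be paired? What is the mechanism behind it? Digging into this may seem to have little practical value, but for anyone who wants to understand C++ internals, or for a novice like me who keeps releasing memory incorrectly and leaking heavily, it is worth studying. Once you know why, you stop making this kind of basic mistake.

Reference (3) describes it as follows (the names come from the MSVC debug runtime; other implementations differ in detail):

Normally, new returns a block of the size the user requested, but the runtime actually allocates more, so that delete can release the block accurately. The extra space sits before the pointer the user receives and after the end of the user's space. In practice:

    blockSize = sizeof(_CrtMemBlockHeader) + nSize + nNoMansLandSize;

Here blockSize is the size actually allocated by the system, and _CrtMemBlockHeader is the header of the new'd block, which holds the requested size and some other information. nNoMansLandSize is the size of the overrun guard at the tail, usually 4 bytes of "FEFEFEFE"; if the user writes past the end into this area, the check asserts. nSize is the usable memory actually allocated for us.

There are two cases when the user calls new:

A. the type is a built-in type or a struct without a user-defined destructor
B. the type is a struct or class with a user-defined destructor

The difference is that when there is a user-defined destructor, delete has to call it. So how does the compiler know at delete time how many objects need their destructors called? The answer is that in case B, new allocates 4 extra bytes after the header and before the pointer the user receives, and records the array length there. delete can then read the count and call the right number of destructors.

That description may be hard to follow; reference (4) gives a more detailed explanation that makes it click.

The explanation also implies a corollary. If the type is a built-in type or a struct without a user-defined destructor, the compiler does not allocate the 4 extra bytes before the user's pointer, because delete has no destructors to call and therefore does not need the element count (the count is needed only to know how many destructors to run). It simply passes the start address of the array to the deallocator and the block is released. On such an implementation delete and delete[] end up behaving the same for these types, so the following release appears to work as well:

    // allocate
    int *d = new int[5];

    // release
    delete d;

Running it under Valgrind gives:

    ==2955== Memcheck, a memory error detector
    ==2955== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
    ==2955== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
    ==2955== Command: ./test_int
    ==2955==
    ==2955== Mismatched free() / delete / delete []
    ==2955==    at 0x402ACFC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==2955==    by 0x8048530: main (in /home/hadoop/test/test_int)
    ==2955==  Address 0x434a028 is 0 bytes inside a block of size 20 alloc'd
    ==2955==    at 0x402B774: operator new[](unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==2955==    by 0x8048520: main (in /home/hadoop/test/test_int)
    ==2955==
    ==2955==
    ==2955== HEAP SUMMARY:
    ==2955==     in use at exit: 0 bytes in 0 blocks
    ==2955==   total heap usage: 1 allocs, 1 frees, 20 bytes allocated
    ==2955==
    ==2955== All heap blocks were freed -- no leaks are possible
    ==2955==
    ==2955== For counts of detected and suppressed errors, rerun with: -v
    ==2955== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

First, "All heap blocks were freed -- no leaks are possible" shows that on this system the block really was released in full, contrary to the belief some people hold that delete d; frees only the first element of d[] and leaves the rest allocated. But "Mismatched free() / delete / delete []" shows that Valgrind does not accept this. There is no leak, but new[] does not match delete, and this style makes it easy to slip into basic mistakes, so Valgrind reports an error. My guess is that Valgrind's implementation does not go into such detail: it simply checks that new is paired with delete and new[] with delete[], regardless of the cases where pairing new[] with delete happens to cause no problem. Valgrind is right to be strict, though: the C++ standard makes releasing a new[] allocation with delete undefined behavior, whatever a particular implementation happens to do.

In summary, my takeaway is this: in some cases releasing new[] memory with delete does not fail, but in most cases it causes serious memory problems, so make a habit of always pairing new with delete and new[] with delete[].

2. The most incomprehensible errors: a pile of Invalid read/write errors I could not make sense of

Take the following program:

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    struct accept_pair {
        bool is_accept_state;
        bool is_strict_end;
        char app_name[0];
    };

    int main() {
        char *s = "Alexia";
        accept_pair *ap = (accept_pair*)malloc(sizeof(accept_pair) + sizeof(s));
        strcpy(ap->app_name, s);

        printf("app name: %s\n", ap->app_name);

        free(ap);

        return 0;
    }

A brief explanation first. The struct uses a zero-length array because of my requirements: elsewhere I need a very large array of accept_pair, and only a few elements have a meaningful app_name (it is assigned only when a certain condition is true; otherwise it is empty and meaningless). With char app_name[20], most elements would waste those 20 bytes. So I allocate nothing up front and give space only to the elements that need it, following the old idea of allocating on demand. Some may point out that char *app_name would also allow on-demand allocation. It would, at the cost of 4 extra bytes per element, so I treat it as a fallback.

Compiled with g++, the program runs correctly with no apparent problem. Under Valgrind, however, it reports several errors, which are not leaks but invalid memory reads and writes:

    ==3511== Memcheck, a memory error detector
    ==3511== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.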
    ==3511== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
    ==3511== Command: ./zero
    ==3511==
    ==3511== Invalid write of size 1
    ==3511==    at 0x402CD8B: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3511==    by 0x80484E3: main (in /home/hadoop/test/zero)
    ==3511==  Address 0x420002e is 0 bytes after a block of size 6 alloc'd
    ==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
    ==3511==
    ==3511== Invalid write of size 1
    ==3511==    at 0x402CDA5: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3511==    by 0x80484E3: main (in /home/hadoop/test/zero)
    ==3511==  Address 0x4200030 is 2 bytes after a block of size 6 alloc'd
    ==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
    ==3511==
    ==3511== Invalid read of size 1
    ==3511==    at 0x40936A5: vfprintf (vfprintf.c:1655)
    ==3511==    by 0x409881E: printf (printf.c:34)
    ==3511==    by 0x4063934: (below main) (libc-start.c:260)
    ==3511==  Address 0x420002e is 0 bytes after a block of size 6 alloc'd
    ==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
    ==3511==
    ==3511== Invalid read of size 1
    ==3511==    at 0x40BC3C0: _IO_file_xsputn@@GLIBC_2.1 (fileops.c:1311)
    ==3511==    by 0x4092184: vfprintf (vfprintf.c:1655)
    ==3511==    by 0x409881E: printf (printf.c:34)
    ==3511==    by 0x4063934: (below main) (libc-start.c:260)
    ==3511==  Address 0x420002f is 1 bytes after a block of size 6 alloc'd
    ==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
    ==3511==
    ==3511== Invalid read of size 1
    ==3511==    at 0x40BC3D7: _IO_file_xsputn@@GLIBC_2.1 (fileops.c:1311)
    ==3511==    by 0x4092184: vfprintf (vfprintf.c:1655)
    ==3511==    by 0x409881E: printf (printf.c:34)
    ==3511==    by 0x4063934: (below main) (libc-start.c:260)
    ==3511==  Address 0x420002e is 0 bytes after a block of size 6 alloc'd
    ==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
    ==3511==
    ==3511== Invalid read of size 4
    ==3511==    at 0x40C999C: __GI_mempcpy (mempcpy.S:59)
    ==3511==    by 0x40BC310: _IO_file_xsputn@@GLIBC_2.1 (fileops.c:1329)
    ==3511==    by 0x4092184: vfprintf (vfprintf.c:1655)
    ==3511==    by 0x409881E: printf (printf.c:34)
    ==3511==    by 0x4063934: (below main) (libc-start.c:260)
    ==3511==  Address 0x420002c is 4 bytes inside a block of size 6 alloc'd
    ==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
    ==3511==
    app name: Alexia
    ==3511==
    ==3511== HEAP SUMMARY:
    ==3511==     in use at exit: 0 bytes in 0 blocks
    ==3511==   total heap usage: 1 allocs, 1 frees, 6 bytes allocated
    ==3511==
    ==3511== All heap blocks were freed -- no leaks are possible
    ==3511==
    ==3511== For counts of detected and suppressed errors, rerun with: -v
    ==3511== ERROR SUMMARY: 9 errors from 6 contexts (suppressed: 0 from 0)

The report shows that strcpy(ap->app_name, s); is an invalid write and printf("app name: %s\n", ap->app_name); is an invalid read. Both indicate that Valgrind considers part of the memory behind ap->app_name invalid. Yet I clearly allocated memory for it; I just did not state explicitly that the space was meant for it. Can a zero-length array char app_name[0] in a struct really not be written to? Or am I using zero-length arrays incorrectly? I still have not figured it out and would welcome an answer from the experts……

3. The most puzzling memory leak: definitely lost/indirectly lost

Look at the following program:

    #include <stdio.h>
    #include <string.h>

    class accept_pair {
    public:
        bool is_accept_state;
        bool is_strict_end;
        char *app_name;

    public:
        accept_pair(bool is_accept = false, bool is_end = false);
        ~accept_pair();
    };

    class DFA {
    public:
        unsigned int _size;
        accept_pair **accept_states;

    public:
        DFA(int size);
        ~DFA();
        void add_state(int index, char *s);
        void add_size(int size);
    };

    int main() {
        char *s = "Alexia";

        DFA *dfa = new DFA(3);
        dfa->add_state(0, s);
        dfa->add_state(1, s);
        dfa->add_state(2, s);

        dfa->add_size(2);
        dfa->add_state(3, s);
        dfa->add_state(4, s);

        printf("\napp_name: %s\n", dfa->accept_states[4]->app_name);
        printf("size: %d\n\n", dfa->_size);

        delete dfa;

        return 0;
    }

    accept_pair::accept_pair(bool is_accept, bool is_end) {
        is_accept_state = is_accept;
        is_strict_end = is_end;
        app_name = NULL;
    }

    accept_pair::~accept_pair() {
        if (app_name) {
            printf("delete accept_pair.\n");
            delete[] app_name;
        }
    }

    DFA::DFA(int size) {
        _size = size;

        accept_states = new accept_pair*[_size];
        for (int s = 0; s < _size; s++) {
            accept_states[s] = NULL;
        }
    }

    DFA::~DFA() {
        for (int i = 0; i < _size; i++) {
            if (accept_states[i]) {
                printf("delete dfa.\n");
                delete accept_states[i];
                accept_states[i] = NULL;
            }
        }
        delete[] accept_states;
    }

    void DFA::add_state(int index, char *s) {
        accept_states[index] = new accept_pair(true, true);
        accept_states[index]->app_name = new char[strlen(s) + 1];
        memcpy(accept_states[index]->app_name, s, strlen(s) + 1);
    }

    void DFA::add_size(int size) {
        // reallocate memory for accept_states.
        accept_pair **tmp_states = new accept_pair*[size + _size];
        for (int s = 0; s < size + _size; s++)
            tmp_states[s] = new accept_pair(false, false);

        for (int s = 0; s < _size; s++) {
            tmp_states[s]->is_accept_state = accept_states[s]->is_accept_state;
            tmp_states[s]->is_strict_end = accept_states[s]->is_strict_end;
            if (accept_states[s]->app_name != NULL) {
                tmp_states[s]->app_name = new char[strlen(accept_states[s]->app_name) + 1];
                memcpy(tmp_states[s]->app_name, accept_states[s]->app_name, strlen(accept_states[s]->app_name) + 1);
            }
        }

        // free old memory.
        for (int s = 0; s < _size; s++) {
            if (accept_states[s] != NULL) {
                delete accept_states[s];
                accept_states[s] = NULL;
            }
        }
        _size += size;
        delete []accept_states;
        accept_states = tmp_states;
    }

It is a little long, but the logic is simple. add_size() first allocates a larger accept_pair array, copies all existing data into it, releases the space held by the original array, and finally points the old array pointer at the newly allocated memory. This is a demo program. As far as I can see it has no memory leak at all, because every allocation is eventually released in the DFA destructor. Yet Valgrind's report shows one leak (lines without the ==3093== prefix are program output):

    ==3093== Memcheck, a memory error detector
    ==3093== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
    ==3093== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
    ==3093== Command: ./test
    ==3093==
    delete accept_pair.
    delete accept_pair.
    delete accept_pair.

    app_name: Alexia
    size: 5

    delete dfa.
    delete accept_pair.
    delete dfa.
    delete accept_pair.
    delete dfa.
    delete accept_pair.
    delete dfa.
    delete accept_pair.
    delete dfa.
    delete accept_pair.
    ==3093==
    ==3093== HEAP SUMMARY:
    ==3093==     in use at exit: 16 bytes in 2 blocks
    ==3093==   total heap usage: 21 allocs, 19 frees, 176 bytes allocated
    ==3093==
    ==3093== 16 bytes in 2 blocks are definitely lost in loss record 1 of 1
    ==3093==    at 0x402BE94: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
    ==3093==    by 0x8048A71: DFA::add_size(int) (in /home/hadoop/test/test)
    ==3093==    by 0x8048798: main (in /home/hadoop/test/test)
    ==3093==
    ==3093== LEAK SUMMARY:
    ==3093==    definitely lost: 16 bytes in 2 blocks
    ==3093==    indirectly lost: 0 bytes in 0 blocks
    ==3093==      possibly lost: 0 bytes in 0 blocks
    ==3093==    still reachable: 0 bytes in 0 blocks
    ==3093==         suppressed: 0 bytes in 0 blocks
    ==3093==
    ==3093== For counts of detected and suppressed errors, rerun with: -v
    ==3093== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
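One plausible reading of the section 3 report, offered as a sketch rather than a confirmed diagnosis: add_size() fills every slot of the new array with a fresh accept_pair, including the two newly added slots, and the later add_state(3, s) and add_state(4, s) calls overwrite those two pointers without deleting the objects they held. That would orphan exactly two objects allocated inside add_size(), which matches "2 blocks ... by DFA::add_size(int)" (and presumably 8 bytes per object on this 32-bit build). The reduction below reproduces the pattern without Valgrind by counting live objects. The names Pair, Table, set_leaky, set_safe, grow_prefilled, and leaked_after are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for accept_pair that counts live instances.
struct Pair {
    static int live;
    Pair() { ++live; }
    ~Pair() { --live; }
};
int Pair::live = 0;

// Hypothetical reduction of DFA: only the slot bookkeeping is kept.
class Table {
    std::size_t size_;
    Pair **slots_;

public:
    // new Pair*[n]() value-initializes the array, so every slot starts as null.
    explicit Table(std::size_t n) : size_(n), slots_(new Pair*[n]()) {}

    ~Table() {
        for (std::size_t i = 0; i < size_; ++i)
            delete slots_[i];  // deleting a null pointer is a no-op
        delete[] slots_;
    }

    // Mirrors the original add_state: overwrites the slot without releasing it.
    void set_leaky(std::size_t i) {
        assert(i < size_);
        slots_[i] = new Pair();
    }

    // Releases whatever the slot held before storing the new object.
    void set_safe(std::size_t i) {
        assert(i < size_);
        delete slots_[i];
        slots_[i] = new Pair();
    }

    // Mirrors the original add_size: every slot of the new array,
    // including the freshly added ones, receives a new object.
    void grow_prefilled(std::size_t extra) {
        Pair **tmp = new Pair*[size_ + extra];
        for (std::size_t i = 0; i < size_ + extra; ++i)
            tmp[i] = new Pair();
        for (std::size_t i = 0; i < size_; ++i)
            delete slots_[i];
        delete[] slots_;
        slots_ = tmp;
        size_ += extra;
    }
};

// Replays the call sequence from main() in section 3 and returns how many
// Pair objects are still alive after the table has been destroyed.
int leaked_after(bool safe) {
    Pair::live = 0;
    {
        Table t(3);
        for (std::size_t i = 0; i < 3; ++i) {
            if (safe) t.set_safe(i); else t.set_leaky(i);
        }
        t.grow_prefilled(2);
        for (std::size_t i = 3; i < 5; ++i) {
            if (safe) t.set_safe(i); else t.set_leaky(i);
        }
    }
    return Pair::live;
}
```

Under these assumptions, leaked_after(false) reports two surviving objects and leaked_after(true) reports none. In the original program the equivalent fix would be either to delete accept_states[index] at the top of add_state(), or to leave the newly added slots as NULL in add_size() instead of pre-filling them.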
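Returning to the zero-length-array program in section 2, a likely explanation is that the array itself is fine and the size computation is the problem. sizeof(s) is the size of the pointer s (4 bytes on that 32-bit build), not the length of the string, so malloc receives 2 + 4 = 6 bytes, which agrees with "a block of size 6" in the report. "Alexia" needs 7 bytes including the terminator, so the last three bytes land past the end of the block. On a build with 8-byte pointers the same line would request 10 bytes and the overrun would be masked for this particular string. The sketch below keeps the struct from the article (a zero-length array is a GNU extension) and sizes the block with strlen(s) + 1. The helper names requested_original, required, and make_pair_with_name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct accept_pair {
    bool is_accept_state;
    bool is_strict_end;
    char app_name[0];  // zero-length array, a GNU extension, as in the article
};

// Bytes requested by the expression in the article:
// sizeof(s) measures the pointer, not the string it points to.
std::size_t requested_original(const char *s) {
    return sizeof(accept_pair) + sizeof(s);
}

// Bytes actually needed: the header plus the string and its terminator.
std::size_t required(const char *s) {
    return sizeof(accept_pair) + std::strlen(s) + 1;
}

// Hypothetical helper: allocates an accept_pair with room for a copy of s.
// The caller releases the result with std::free().
accept_pair *make_pair_with_name(const char *s) {
    const std::size_t len = std::strlen(s) + 1;
    accept_pair *ap =
        static_cast<accept_pair *>(std::malloc(sizeof(accept_pair) + len));
    if (ap == nullptr)
        return nullptr;
    ap->is_accept_state = false;
    ap->is_strict_end = false;
    std::memcpy(ap->app_name, s, len);  // copies the terminator as well
    return ap;
}
```

A caller would write accept_pair *ap = make_pair_with_name("Alexia"); and later std::free(ap);. With the block sized this way, the strcpy-style write and the printf read both stay inside the allocation, so the Invalid read/write reports should no longer appear.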