Why does C/C++ split files into headers and sources?

2024-07-31

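When you split a C/C++ program across several files yourself, you separate each piece into a header and a source file, yet other languages (Python/Go/C#/Java) need no such split. The reason lies in how C/C++ is compiled, so let me walk through it.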

Splitting files and compiling them: basic arithmetic as an example

Turning a .cpp source file into a binary executable takes the following steps:

1. Preprocessing
2. Compilation
3. Assembly
4. Linking

When we move C++ classes and functions into separate files, each unit is written as a .h file and a .cpp file.

Take a unit named functionset, split into functionset.cpp and functionset.h. The .h file contains only the declarations, with no implementation details:

#ifndef FUNCTIONSET_H
#define FUNCTIONSET_H
int add(int a, int b);
int sub(int a, int b);
int multi(int a, int b);
int divi(int a, int b);
#endif // FUNCTIONSET_H

The .cpp file includes functionset.h and provides the actual implementations:

#include "functionset.h"
int add(int a, int b) 
{
    return a + b;
}
// sub, multi, divi omitted

Finally, main.cpp can use the add function we defined:

#include <iostream>
#include "functionset.h"

using namespace std;

int main()
{
    int c = add(3, 5);
    cout << c; // Output: 8
}

Why split the files

1. Preprocessing
2. Compilation
3. Assembly
4. Linking

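So why split into .h and .cpp files? Because in stages 1 to 3 (preprocessing, compilation, assembly), before linking happens, every .cpp file is preprocessed, compiled, and assembled independently of the others. Only at link time are the results merged into the whole program.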

An #include "xxx.x" simply tells the preprocessor to copy the contents of xxx.x and paste them into the file being compiled, in place of the #include line.
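
To make the copy-and-paste picture concrete, here is a hand-expanded sketch of roughly what the compiler sees for main.cpp once functionset.h has been pasted in. The <iostream> part is left out, use_add is a hypothetical stand-in for main, and the definition of add is appended only so that the sketch builds as one file; in the real project it stays in functionset.cpp.

```cpp
// Hand-expanded sketch of main.cpp after preprocessing.

// --- pasted in from functionset.h ---
int add(int a, int b);
int sub(int a, int b);
int multi(int a, int b);
int divi(int a, int b);
// --- end of functionset.h ---

// Hypothetical caller standing in for main(): only the declaration of add
// is visible at this point, and that is all the compiler needs for the call.
int use_add() {
    return add(3, 5);
}

// Assumption for this sketch: the definition is placed in the same file.
// In the real project it lives in functionset.cpp, a separate translation
// unit, and the linker connects the call above to it.
int add(int a, int b) {
    return a + b;
}
```

The other three functions are declared but never called here, so their missing definitions cause no error.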

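So if we declared and defined the functions directly in functionset.cpp, and then wrote #include "functionset.cpp" in every file that uses add, we would get a multiple definition error at link time as soon as two or more files did so.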

As just mentioned, an include is equivalent to copy and paste, so every file containing #include "functionset.cpp" would unintentionally define the same functions all over again.
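
A single-file sketch of the same rule may help: a declaration can be repeated as often as you like, but a definition must appear exactly once. The function name triple is made up for this illustration.

```cpp
// Declarations only announce the signature, so repeating them is harmless.
int triple(int x);
int triple(int x);

// The definition provides the body and may appear only once in the program.
int triple(int x) {
    return 3 * x;
}

// Uncommenting the next line pastes in a second definition, much like a
// second #include "functionset.cpp" would, and the build fails.
// int triple(int x) { return 3 * x; }
```

Inside one file the compiler reports the duplicate as a redefinition. When the copies end up in different .cpp files, each file compiles cleanly on its own and the clash only shows up at link time, which is the multiple definition error described above.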

Separate declarations and definitions in C/C++

In C/C++, the declaration and the definition of a function are separate things, which is why you sometimes see code like this:

#include <stdio.h>

int add(int a, int b); // declare
int main() 
{
    int c = add(3, 2); // use
    printf("%d", c);
    return 0;
}

int add(int a, int b) // define
{
    return a + b;   
}

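Note that a C/C++ compiler reads the source from top to bottom. If add were declared and defined only below main, the call inside main would refer to a name the compiler has not seen yet, and compilation would fail.

In the example above, however, add is declared before main (without an implementation yet). The compiler does not know what add actually does, but thanks to the declaration it at least knows that add is a function that takes two ints and returns an int. That is enough to type-check the call and emit a call to the symbol add, leaving the address of the function body to be filled in later, so compilation succeeds.

The call is then connected to the add function body defined further down. When the definition lives in a different .cpp file, it is the linker that makes this connection at link time.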

Avoiding repeated inclusion: ifndef, define, endif

We now know that putting the declarations in functionset.h, apart from functionset.cpp, ensures that each function is defined only once in the whole application. What remains to be explained is why functionset.h contains:

#ifndef FUNCTIONSET_H
#define FUNCTIONSET_H
// function declarations go here
// ...
#endif // FUNCTIONSET_H

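Once a program is split across multiple files, it becomes unavoidable that the same file gets included more than once. As an example, let's write a repeat unit: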

repeat.h:

#include <vector>
void my_func(std::vector<int> v);

repeat.cpp:

#include <vector>
#include "repeat.h"
void my_func(std::vector<int> v)
{
    // do something...
}

Then in main.cpp we use vector as well as my_func.

main.cpp:

#include <vector>
#include "repeat.h"
int main()
{
    // do something...
    // do something...
    // do something...
    std::vector<int> v;
    my_func(v);
    return 0;
}

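Now imagine that we are the preprocessor.

When we preprocess main.cpp, its first line, #include <vector>, brings in the declarations of vector.

When we reach the second line of main.cpp, we need to bring in repeat.h.

But look closely at the source: repeat.h also includes vector. So compiling main.cpp alone already pulls in vector more than once! Bringing in the same class and template definitions repeatedly would normally be an error, and indeed, when we look at the source of vector (here from GCC's libstdc++; the macro name differs between headers and implementations), we find lines like these: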

// ... omitted
#ifndef _GLIBCXX_DEBUG_VECTOR
#define _GLIBCXX_DEBUG_VECTOR 1
// ... omitted
#endif

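ifndef is short for "if not defined": it checks whether the identifier that follows (here _GLIBCXX_DEBUG_VECTOR) has already been defined with #define.

If it has not, the preprocessor processes the enclosed code.
If it has, the preprocessor skips ahead to the matching endif.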

So when a file (say main.cpp) reaches #include <vector> for the first time during preprocessing, _GLIBCXX_DEBUG_VECTOR has not been defined yet, so the preprocessor processes:

#define _GLIBCXX_DEBUG_VECTOR 1
the vector declarations below it, enclosed between the ifndef and the endif

When main.cpp includes vector a second time, directly or indirectly, _GLIBCXX_DEBUG_VECTOR has already been defined, so the preprocessor jumps straight to the endif, and the same declarations are not brought in twice.
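
The effect of the guard can be reproduced in a single file by writing out by hand what the preprocessor would see if a guarded header were included twice. POINT_H, Point, and sum_xy are hypothetical names invented for this sketch; they are not part of vector or of the files above.

```cpp
// First inclusion: POINT_H is not defined yet, so the body is kept.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif // POINT_H

// Second inclusion: POINT_H is now defined, so everything up to #endif is
// skipped. Without the guard, Point would be defined twice in this file
// and the compiler would reject it.
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif // POINT_H

// Point is usable as usual, because exactly one definition survived.
int sum_xy(Point p) {
    return p.x + p.y;
}
```

Repeated function declarations alone would be tolerated, but headers such as vector also contain class and template definitions, which is why the guard matters.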

Updated on 2024-07-31
