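Finding where relocations originate
Using Ulrich Drepper's relinfo.pl script, it is easy to count the number of relocations in a DSO, but the script does not work on .o files.
Say I have a large shared library and I am unhappy with its number of relocations. Is there a way to find out where they come from (the symbol, or at least the .o file), so that I can check whether they are of the easily fixable kind (for example, const char * str = "Hello World"; -> const char str[] = "Hello World";)?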
Answer: Let us look at a practical example case, example.c:
#include <stdio.h>
static const char global1[] = "static const char []";
static const char *global2 = "static const char *";
static const char *const global3 = "static const char *const";
const char global4[] = "const char []";
const char *global5 = "const char *";
const char *const global6 = "const char *const";
char global7[] = "char []";
char *global8 = "char *";
char *const global9 = "char *const";
int main(void)
{
    static const char local1[] = "static const char []";
    static const char *local2 = "static const char *";
    static const char *const local3 = "static const char *const";
    const char local4[] = "const char []";
    const char *local5 = "const char *";
    const char *const local6 = "const char *const";
    char local7[] = "char []";
    char *local8 = "char *";
    char *const local9 = "char *const";
    printf("Global:\n");
    printf("\t%s\n", global1);
    printf("\t%s\n", global2);
    printf("\t%s\n", global3);
    printf("\t%s\n", global4);
    printf("\t%s\n", global5);
    printf("\t%s\n", global6);
    printf("\t%s\n", global7);
    printf("\t%s\n", global8);
    printf("\t%s\n", global9);
    printf("\n");
    printf("Local:\n");
    printf("\t%s\n", local1);
    printf("\t%s\n", local2);
    printf("\t%s\n", local3);
    printf("\t%s\n", local4);
    printf("\t%s\n", local5);
    printf("\t%s\n", local6);
    printf("\t%s\n", local7);
    printf("\t%s\n", local8);
    printf("\t%s\n", local9);
    return 0;
}
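You can compile it to an object file using, for example,
gcc -W -Wall -c example.c
and to an executable using
gcc -W -Wall example.c -o example
You can use objdump -tr example.o to dump the symbol and relocation information of the (non-dynamic) object file, or objdump -TtRr example to dump the same for the executable (and for dynamic object files). Running objdump -t example.o on x86-64, I get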
example.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 example.c
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000000 l O .rodata 0000000000000015 global1
0000000000000000 l O .data 0000000000000008 global2
0000000000000048 l O .rodata 0000000000000008 global3
00000000000000c0 l O .rodata 0000000000000015 local1.2053
0000000000000020 l O .data 0000000000000008 local2.2054
00000000000000d8 l O .rodata 0000000000000008 local3.2055
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000050 g O .rodata 000000000000000e global4
0000000000000008 g O .data 0000000000000008 global5
0000000000000080 g O .rodata 0000000000000008 global6
0000000000000010 g O .data 0000000000000008 global7
0000000000000018 g O .data 0000000000000008 global8
00000000000000a0 g O .rodata 0000000000000008 global9
0000000000000000 g F .text 000000000000027a main
0000000000000000 *UND* 0000000000000000 puts
0000000000000000 *UND* 0000000000000000 printf
0000000000000000 *UND* 0000000000000000 putchar
0000000000000000 *UND* 0000000000000000 __stack_chk_fail
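This output is described in man 1 objdump, under the -t heading. Note that the second "column" is actually fixed width: seven characters wide, describing the type of the object. The third column is the section name: *UND* for undefined, .text for code, .rodata for read-only (immutable) data, .data for initialized mutable data, .bss for uninitialized mutable data, and so on.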
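Because the flag column is fixed width, it can be sliced by position instead of being split on whitespace. The following is a minimal sketch, not part of the original answer: a hypothetical shell function, objects_per_section, that reads objdump -t output on standard input and counts the data objects (an O in the seventh flag position) in each section. It assumes the GNU objdump line layout described above: address, one space, seven flag characters, one space, section name.

```shell
# Hypothetical helper (an assumption for illustration, not from the answer):
# count data objects per section from `objdump -t` output on standard input.
objects_per_section() {
    LC_ALL=C awk '
        # Symbol lines start with a hexadecimal address.
        $1 ~ /^[0-9A-Fa-f]+$/ {
            # Seven flag characters follow the address and one space.
            flags = substr($0, length($1) + 2, 7)
            # The section name is the first field after the flags.
            rest = substr($0, length($1) + 10)
            split(rest, f, /[ \t]+/)
            # "O" in the last flag position marks a data object.
            if (substr(flags, 7, 1) == "O")
                count[f[1]]++
        }
        END {
            for (s in count)
                printf "%d %s\n", count[s], s
        }
    ' | LC_ALL=C sort -k1,1n -k2,2
}
```

It would be used as objdump -t example.o | objects_per_section. Slicing by position avoids miscounting fields when the flag group contains one, two, or no visible characters.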
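We can see from the symbol table above that the local4, local5, local6, local7, local8, and local9 variables did not get entries in the symbol table at all. This is because they are local to main(). The contents of the strings they refer to are stored in .data or .rodata (or constructed on the fly), depending on what the compiler considers best.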
Next, let us look at the relocation records. Using
objdump -r example.o
I get
example.o: file format elf64-x86-64
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000037 R_X86_64_32S .rodata+0x000000000000005e
0000000000000040 R_X86_64_32S .rodata+0x000000000000006b
0000000000000059 R_X86_64_32S .rodata+0x0000000000000088
0000000000000062 R_X86_64_32S .rodata+0x000000000000008f
0000000000000067 R_X86_64_32 .rodata+0x00000000000000a8
000000000000006c R_X86_64_PC32 puts-0x0000000000000004
0000000000000071 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000076 R_X86_64_32 .rodata
0000000000000083 R_X86_64_PC32 printf-0x0000000000000004
000000000000008a R_X86_64_PC32 .data-0x0000000000000004
000000000000008f R_X86_64_32 .rodata+0x00000000000000b0
000000000000009f R_X86_64_PC32 printf-0x0000000000000004
00000000000000a6 R_X86_64_PC32 .rodata+0x0000000000000044
00000000000000ab R_X86_64_32 .rodata+0x00000000000000b0
00000000000000bb R_X86_64_PC32 printf-0x0000000000000004
00000000000000c0 R_X86_64_32 .rodata+0x00000000000000b0
00000000000000c5 R_X86_64_32 global4
00000000000000d2 R_X86_64_PC32 printf-0x0000000000000004
00000000000000d9 R_X86_64_PC32 global5-0x0000000000000004
00000000000000de R_X86_64_32 .rodata+0x00000000000000b0
00000000000000ee R_X86_64_PC32 printf-0x0000000000000004
00000000000000f5 R_X86_64_PC32 global6-0x0000000000000004
00000000000000fa R_X86_64_32 .rodata+0x00000000000000b0
000000000000010a R_X86_64_PC32 printf-0x0000000000000004
000000000000010f R_X86_64_32 .rodata+0x00000000000000b0
0000000000000114 R_X86_64_32 global7
0000000000000121 R_X86_64_PC32 printf-0x0000000000000004
0000000000000128 R_X86_64_PC32 global8-0x0000000000000004
000000000000012d R_X86_64_32 .rodata+0x00000000000000b0
000000000000013d R_X86_64_PC32 printf-0x0000000000000004
0000000000000144 R_X86_64_PC32 global9-0x0000000000000004
0000000000000149 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000159 R_X86_64_PC32 printf-0x0000000000000004
0000000000000163 R_X86_64_PC32 putchar-0x0000000000000004
0000000000000168 R_X86_64_32 .rodata+0x00000000000000b5
000000000000016d R_X86_64_PC32 puts-0x0000000000000004
0000000000000172 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000177 R_X86_64_32 .rodata+0x00000000000000c0
0000000000000184 R_X86_64_PC32 printf-0x0000000000000004
000000000000018b R_X86_64_PC32 .data+0x000000000000001c
0000000000000190 R_X86_64_32 .rodata+0x00000000000000b0
00000000000001a0 R_X86_64_PC32 printf-0x0000000000000004
00000000000001a7 R_X86_64_PC32 .rodata+0x00000000000000d4
00000000000001ac R_X86_64_32 .rodata+0x00000000000000b0
00000000000001bc R_X86_64_PC32 printf-0x0000000000000004
00000000000001c1 R_X86_64_32 .rodata+0x00000000000000b0
00000000000001d6 R_X86_64_PC32 printf-0x0000000000000004
00000000000001db R_X86_64_32 .rodata+0x00000000000000b0
00000000000001ef R_X86_64_PC32 printf-0x0000000000000004
00000000000001f4 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000209 R_X86_64_PC32 printf-0x0000000000000004
000000000000020e R_X86_64_32 .rodata+0x00000000000000b0
0000000000000223 R_X86_64_PC32 printf-0x0000000000000004
0000000000000228 R_X86_64_32 .rodata+0x00000000000000b0
000000000000023d R_X86_64_PC32 printf-0x0000000000000004
0000000000000242 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000257 R_X86_64_PC32 printf-0x0000000000000004
0000000000000271 R_X86_64_PC32 __stack_chk_fail-0x0000000000000004
RELOCATION RECORDS FOR [.data]:
OFFSET TYPE VALUE
0000000000000000 R_X86_64_64 .rodata+0x0000000000000015
0000000000000008 R_X86_64_64 .rodata+0x000000000000005e
0000000000000018 R_X86_64_64 .rodata+0x0000000000000088
0000000000000020 R_X86_64_64 .rodata+0x0000000000000015
RELOCATION RECORDS FOR [.rodata]:
OFFSET TYPE VALUE
0000000000000048 R_X86_64_64 .rodata+0x0000000000000029
0000000000000080 R_X86_64_64 .rodata+0x000000000000006b
00000000000000a0 R_X86_64_64 .rodata+0x000000000000008f
00000000000000d8 R_X86_64_64 .rodata+0x0000000000000029
RELOCATION RECORDS FOR [.eh_frame]:
OFFSET TYPE VALUE
0000000000000020 R_X86_64_PC32 .text
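The relocation records are grouped by the section in which the relocation resides. Because the string contents are in the .data or .rodata sections, we can restrict ourselves to the relocations whose VALUE starts with .data or .rodata. (Mutable strings, such as char global7[] = "char []";, are stored in .data, and immutable strings and string literals in .rodata.)
If we compiled the code with debug symbols enabled, it would be easier to determine which variable refers to which string, but I would probably just look at the actual contents at each relocation value (target) to see which references to immutable strings need fixing.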
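Since the records are grouped by the section they patch, a per-section tally gives a quick first impression of an object file. Here is a minimal sketch, not part of the original answer: a hypothetical shell function, relocs_per_section, that reads the text output of objdump -r on standard input. It assumes every group starts with a "RELOCATION RECORDS FOR [section]:" header and that every record has exactly three whitespace-separated fields, as in the listing above.

```shell
# Hypothetical helper (an assumption for illustration, not from the answer):
# tally relocation records per section from `objdump -r` output on stdin.
relocs_per_section() {
    LC_ALL=C awk '
        # Header line, e.g. "RELOCATION RECORDS FOR [.text]:"
        /^RELOCATION RECORDS FOR \[/ {
            sec = $0
            sub(/^[^[]*\[/, "", sec)   # drop everything up to "["
            sub(/\].*$/, "", sec)      # drop "]:" and anything after it
            next
        }
        # Record line: hexadecimal offset, relocation type, target.
        sec != "" && NF == 3 && $1 ~ /^[0-9A-Fa-f]+$/ {
            count[sec]++
        }
        END {
            for (s in count)
                printf "%d %s\n", count[s], s
        }
    ' | LC_ALL=C sort -k1,1n -k2,2
}
```

It would be used as objdump -r example.o | relocs_per_section. For the unoptimized listing above, this reports 4 records each for .data and .rodata and 1 for .eh_frame, with all the rest in .text. Running it before and after a source change shows whether the change reduced the number of relocations.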
The command combination
objdump -r example.o | awk '($3 ~ /^\..*\+/) { t = $3; sub(/\+/, " ", t); n[t]++ } END { for (r in n) printf "%d %s\n", n[r], r }' | sort -g
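outputs the number of relocations per target, followed by the target section, followed by the target offset within that section, sorted so that the target occurring in the most relocations comes last. That is, the last lines of the output are the ones you need to concentrate on. For me, I get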
1 .rodata
1 .rodata 0x0000000000000044
1 .rodata 0x00000000000000a8
1 .rodata 0x00000000000000b5
1 .rodata 0x00000000000000c0
1 .rodata 0x00000000000000d4
2 .rodata 0x0000000000000015
2 .rodata 0x0000000000000029
2 .rodata 0x000000000000005e
2 .rodata 0x000000000000006b
2 .rodata 0x0000000000000088
2 .rodata 0x000000000000008f
18 .rodata 0x00000000000000b0
If I add optimization (gcc -W -Wall -O3 -fomit-frame-pointer -c example.c), the result is
1 .rodata 0x0000000000000020
1 .rodata 0x0000000000000040
1 .rodata.str1.1
1 .rodata.str1.1 0x0000000000000058
2 .rodata.str1.1 0x000000000000000d
2 .rodata.str1.1 0x0000000000000021
2 .rodata.str1.1 0x000000000000005f
2 .rodata.str1.1 0x000000000000006c
3 .rodata.str1.1 0x000000000000003a
3 .rodata.str1.1 0x000000000000004c
18 .rodata.str1.1 0x0000000000000008
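This shows that the compiler options do have a large effect, but there is one target that is used 18 times either way: section .rodata offset 0xb0 (.rodata.str1.1 offset 0x8 if optimization is enabled at compile time). That is the "\t%s\n" string literal.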
Modifying the original program into
char *local8 = "char *";
char *const local9 = "char *const";
const char *const fmt = "\t%s\n";
printf("Global:\n");
printf(fmt, global1);
printf(fmt, global2);
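and so on, replacing the format string literal with an immutable string pointer fmt, eliminates those 18 relocations altogether. (You could also use the equivalent const char fmt[] = "\t%s\n";, of course.)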
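The above analysis indicates that, at least with GCC-4.6.3, most of the avoidable relocations are caused by (repeated use of) string literals. Replacing them with an array of const chars (const char fmt[] = "\t%s\n";) or a const pointer to const chars (const char *const fmt = "\t%s\n";) seems an effective and safe strategy to me. Both forms put the contents in the read-only .rodata section, and the pointer or array reference itself is immutable too.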
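Furthermore, converting string literals to immutable string pointers or char arrays is entirely a source-level task. That is, if you convert all string literals using the above method, you can eliminate at least one relocation per string literal.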
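In fact, I do not see how object-level analysis will help you much here. It will tell you whether your modifications reduce the number of relocations needed, of course.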
The above awk stanza can be expanded into a script that also outputs the string constants for references with positive offsets:
#!/bin/bash
if [ $# -ne 1 ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
    exec >&2
    echo ""
    echo "Usage: $0 [ -h | --help ]"
    echo "       $0 object.o"
    echo ""
    exit 1
fi
export LANG=C LC_ALL=C
# First stage: count relocations per target (third column of objdump -wr).
# Second stage: look up the string stored at each section+offset target.
objdump -wr "$1" | awk '
    BEGIN {
        RS = "[\t\v\f ]*[\r\n][\t\n\v\f\r ]*"
        FS = "[\t\v\f ]+"
    }
    $1 ~ /^[0-9A-Fa-f]+/ {
        n[$3]++
    }
    END {
        for (s in n)
            printf "%d %s\n", n[s], s
    }
' | sort -g | gawk -v filename="$1" '
    BEGIN {
        RS = "[\t\v\f ]*[\r\n][\t\n\v\f\r ]*"
        FS = "[\t\v\f ]+"
        # Record the file offset at which each section starts.
        cmd = "objdump --file-offsets -ws " filename
        while ((cmd | getline) > 0)
            if ($3 == "section") {
                s = $4
                sub(/:$/, "", s)
                o = $NF
                sub(/\)$/, "", o)
                start[s] = strtonum(o)
            }
        close(cmd)
    }
    {
        if ($2 ~ /\..*\+/) {
            # Split "section+offset" and convert it to a file offset.
            s = $2
            o = $2
            sub(/\+.*$/, "", s)
            sub(/^[^\+]*\+/, "", o)
            o = strtonum(o) + start[s]
            # Read up to 256 bytes there, stopping at the first NUL.
            cmd = "dd if=\"" filename "\" of=/dev/stdout bs=1 skip=" o " count=256"
            OLDRS = RS
            RS = "\0"
            cmd | getline hex
            close(cmd)
            RS = OLDRS
            # Escape the string for display.
            gsub(/\\/, "\\\\", hex)
            gsub(/\t/, "\\t", hex)
            gsub(/\n/, "\\n", hex)
            gsub(/\r/, "\\r", hex)
            gsub(/\"/, "\\\"", hex)
            # Print the string only if it looks like printable text.
            if (hex ~ /[\x00-\x1F\x7F-\x9F\xFE\xFF]/ || length(hex) < 1)
                printf "%s\n", $0
            else
                printf "%s = \"%s\"\n", $0, hex
        } else
            print $0
    }
'
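This is a bit crude, just slapped together, so I do not know how portable it is. On my machine, it does seem to find the string literals for the few test cases I tried it on; you should probably rewrite it to match your own needs. You could even use an actual programming language with ELF support to examine the object files directly.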
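For the example program shown above (before the modification I suggest to reduce the number of relocations), compiled without optimization, the above script yields the output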
1 .data+0x000000000000001c = ""
1 .data-0x0000000000000004
1 .rodata
1 .rodata+0x0000000000000044 = ""
1 .rodata+0x00000000000000a8 = "Global:"
1 .rodata+0x00000000000000b5 = "Local:"
1 .rodata+0x00000000000000c0 = "static const char []"
1 .rodata+0x00000000000000d4 = ""
1 .text
1 __stack_chk_fail-0x0000000000000004
1 format
1 global4
1 global5-0x0000000000000004
1 global6-0x0000000000000004
1 global7
1 global8-0x0000000000000004
1 global9-0x0000000000000004
1 putchar-0x0000000000000004
2 .rodata+0x0000000000000015 = "static const char *"
2 .rodata+0x0000000000000029 = "static const char *const"
2 .rodata+0x000000000000005e = "const char *"
2 .rodata+0x000000000000006b = "const char *const"
2 .rodata+0x0000000000000088 = "char *"
2 .rodata+0x000000000000008f = "char *const"
2 puts-0x0000000000000004
18 .rodata+0x00000000000000b0 = "\t%s\n"
18 printf-0x0000000000000004
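Finally, you might notice that using a function pointer to printf() instead of calling printf() directly would remove another 18 relocations from the example code, but I would consider that a mistake.
For code, you want the relocations, because indirect function calls (calls via a function pointer) are much slower than direct calls. Put simply, those relocations make function and subroutine calls much faster, so you definitely want to keep them.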
Apologies for the long answer; I hope you find it useful. Questions?
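Based on Nominal Animal's answer, which I still have to digest fully, I came up with the following simple shell script, which seems to work for finding what I called the "easily fixable" variety: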
for i in path/to/*.o ; do
    REL="$(objdump -TtRr "$i" 2>/dev/null | grep '.data.rel.ro.local[^]+-]')"
    if [ -n "$REL" ]; then
        echo "$(basename "$i"):"
        echo "$REL" | c++filt
        echo
    fi
done
Sample output (for the QtGui library):
qimagereader.o:
0000000000000000 l O .data.rel.ro.local 00000000000000c0 _qt_BuiltInFormats
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
qopenglengineshadermanager.o:
0000000000000000 l O .data.rel.ro.local 0000000000000090 QOpenGLEngineShaderManager::getUniformLocation(QOpenGLEngineShaderManager::Uniform)::uniformNames
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
qopenglpaintengine.o:
0000000000000000 l O .data.rel.ro.local 0000000000000020 vtable for (anonymous namespace)::QOpenGLStaticTextUserData
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
qtexthtmlparser.o:
0000000000000000 l O .data.rel.ro.local 00000000000003b0 elements
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
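Looking up these symbols in the source files usually leads quickly either to a fix, or to the discovery that they are not easily fixable.
But I guess I will have to revisit Nominal Animal's answer once I run out of .data.rel.ro.locals to fix...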
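The grep above also matches the section symbol line of .data.rel.ro.local, which carries no information. As a possible refinement (a minimal sketch, not part of the script above), the hypothetical function relro_local_objects below reads objdump -t output on standard input, keeps only the data objects placed in .data.rel.ro.local, and lists them largest first. It assumes the GNU objdump symbol-line layout (address, one space, seven flag characters, one space, section name) and mangled names without spaces, so it should run before c++filt.

```shell
# Hypothetical helper (an assumption for illustration, not from the thread):
# list data objects in .data.rel.ro.local as "size name", largest first.
relro_local_objects() {
    LC_ALL=C awk '
        # Symbol lines start with a hexadecimal address.
        $1 ~ /^[0-9A-Fa-f]+$/ {
            flags = substr($0, length($1) + 2, 7)   # fixed-width flag column
            rest = substr($0, length($1) + 10)      # section, size, name
            n = split(rest, f, /[ \t]+/)
            # Keep data objects only; this skips the section symbol line.
            if (f[1] == ".data.rel.ro.local" && substr(flags, 7, 1) == "O")
                printf "%s %s\n", f[2], f[n]
        }
    ' | LC_ALL=C sort -r
}
```

It would be used as objdump -t file.o | relro_local_objects | c++filt. The sizes are fixed-width hexadecimal, so a plain reverse sort puts the largest objects, which are often the most rewarding to fix, at the top.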
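What do you want to find out: the object files that need a symbol? Or the object files or shared libraries that contain a symbol? Do you have many object files, or do you have one shared library?
@MartinRosenau: The object files containing the symbols. And I have both the shared library and the '.o' files that get linked into it (I also have the source, but grepping for 'static const char * *.* []' only gets me so far...).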