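Find all paths of a binary tree from an xgboost.dump
I have xgboost.dump text files containing many trees. I want to find every root-to-leaf path and get the value for each path. Here is one tree: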
tree[0]:
0:[a<0.966398] yes=1,no=2,missing=1
1:[b<0.323071] yes=3,no=4,missing=3
3:[c<0.461248] yes=7,no=8,missing=7
7:leaf=0.00972768
8:leaf=-0.0179376
4:[a<0.379082] yes=9,no=10,missing=9
9:leaf=0.0146003
10:leaf=0.0454369
2:[b<0.322352] yes=5,no=6,missing=5
5:[c<0.674868] yes=11,no=12,missing=11
11:leaf=0.0497964
12:leaf=0.00953781
6:[f<0.598267] yes=13,no=14,missing=13
13:leaf=0.0504545
14:leaf=0.0867654
I want to convert all the paths into this form:
path1, a<0.966398, b<0.323071, c<0.461248, leaf = 0.00972768
path2, a<0.966398, b<0.323071, c>0.461248, leaf = -0.0179376
path3, a<0.966398, b>0.323071, a<0.379082, leaf = 0.0146003
path4, a<0.966398, b>0.323071, a>0.379082, leaf = 0.0454369
path5, a>0.966398, b<0.322352, c<0.674868, leaf = 0.0497964
path6, a>0.966398, b<0.322352, c>0.674868, leaf = 0.00953781
path7, a>0.966398, b>0.322352, f<0.598267, leaf = 0.0504545
path8, a>0.966398, b>0.322352, f>0.598267, leaf = 0.0867654
I have already tried enumerating all possible paths as an array of node ids, like this:
array([[ 0, 1, 3, 7],
[ 0, 1, 3, 8],
[ 0, 1, 4, 9],
[ 0, 1, 4, 10],
[ 0, 2, 5, 11],
[ 0, 2, 5, 12],
[ 0, 2, 6, 13],
[ 0, 2, 6, 14]])
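However, this breaks once MAX_DEPTH is larger: some branches stop growing early, so the tree is no longer complete and the enumerated paths are wrong. So I need to parse the yes and no fields in the text file to generate the real, correct paths. Any suggestions? Thanks!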
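
One possible approach, as a minimal sketch rather than a definitive solution: parse each line into a node table keyed by node id, then walk from the root following the yes and no ids until a leaf is reached. This works for branches of uneven depth because it never assumes a complete tree. The names `parse_tree`, `all_paths`, and `DUMP` are hypothetical, and the regular expressions assume the dump format shown above (numeric splits written as `[feature<threshold]`). The sketch writes the no branch as `>=`, since that is the negation of `<`; it handles a single tree, so a file with many trees would first need to be split on the `tree[...]` header lines.

```python
import re

# Assumed line formats, taken from the dump shown in the question.
SPLIT_RE = re.compile(r"^\s*(\d+):\[([^<\]]+)<([^\]]+)\]\s+yes=(\d+),no=(\d+)")
LEAF_RE = re.compile(r"^\s*(\d+):leaf=([^,\s]+)")


def parse_tree(lines):
    """Map node id -> ("split", feature, threshold, yes_id, no_id) or ("leaf", value)."""
    nodes = {}
    for line in lines:
        m = SPLIT_RE.match(line)
        if m:
            nid, feat, thr, yes, no = m.groups()
            nodes[int(nid)] = ("split", feat, thr, int(yes), int(no))
            continue
        m = LEAF_RE.match(line)
        if m:
            nodes[int(m.group(1))] = ("leaf", m.group(2))
        # Any other line (for example the "tree[0]:" header) is ignored.
    return nodes


def all_paths(nodes, root=0):
    """Yield (conditions, leaf_value) for every root-to-leaf path."""
    stack = [(root, [])]
    while stack:
        nid, conds = stack.pop()
        node = nodes[nid]
        if node[0] == "leaf":
            yield conds, node[1]
        else:
            _, feat, thr, yes, no = node
            # Push the "no" child first so the "yes" child is visited first.
            stack.append((no, conds + [f"{feat}>={thr}"]))
            stack.append((yes, conds + [f"{feat}<{thr}"]))


DUMP = """tree[0]:
0:[a<0.966398] yes=1,no=2,missing=1
1:[b<0.323071] yes=3,no=4,missing=3
3:[c<0.461248] yes=7,no=8,missing=7
7:leaf=0.00972768
8:leaf=-0.0179376
4:[a<0.379082] yes=9,no=10,missing=9
9:leaf=0.0146003
10:leaf=0.0454369
2:[b<0.322352] yes=5,no=6,missing=5
5:[c<0.674868] yes=11,no=12,missing=11
11:leaf=0.0497964
12:leaf=0.00953781
6:[f<0.598267] yes=13,no=14,missing=13
13:leaf=0.0504545
14:leaf=0.0867654
"""

for i, (conds, leaf) in enumerate(all_paths(parse_tree(DUMP.splitlines())), 1):
    print(f"path{i}, {', '.join(conds)}, leaf = {leaf}")
```

Thresholds and leaf values are kept as strings so they print exactly as they appear in the dump; convert them with `float` if numeric values are needed.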