2017-09-25 47 views
0

Intercepting the Linux pthread_create function causes a JVM/SSH crash

I am trying to intercept pthread_create on Ubuntu 14.04. The code looks like this:

struct thread_param{ 
    void * args; 
    void *(*start_routine) (void *); 
}; 

typedef int(*P_CREATE)(pthread_t *thread, const pthread_attr_t *attr,void * 
    (*start_routine) (void *), void *arg); 

void *intermedia(void * arg){ 

struct thread_param *temp; 
temp=(struct thread_param *)arg; 
//do some other things 
return temp->start_routine(temp->args); 
} 

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void * 
(*start_routine)(void *), void *arg){ 
    static void *handle = NULL; 
    static P_CREATE old_create=NULL; 
    if(!handle) 
    { 
     handle = dlopen("libpthread.so.0", RTLD_LAZY); 
     old_create = (P_CREATE)dlsym(handle, "pthread_create"); 
    } 
    struct thread_param temp; 
    temp.args=arg; 
    temp.start_routine=start_routine; 

    int result=old_create(thread,attr,intermedia,(void *)&temp); 
//  int result=old_create(thread,attr,start_routine,arg); 
    return result; 
} 

It works fine with my own pthread_create test case (written in C). But when I run Hadoop on the JVM, it gives me an error report like this:

Starting namenodes on [ubuntu] 
ubuntu: starting namenode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-namenode-ubuntu.out 
ubuntu: starting datanode, logging to /home/yangyong/work/hadooptrace/hadoop-2.6.5/logs/hadoop-yangyong-datanode-ubuntu.out 
ubuntu: /home/yangyong/work/hadooptrace/hadoop-2.6.5/sbin/hadoop-daemon.sh: line 131: 7545 Aborted     (core dumped) nohup nice -n 
$HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null 
Starting secondary namenodes [0.0.0.0 
# 
# A fatal error has been detected by the Java Runtime Environment: 
# 
# SIGSEGV (0xb) at pc=0x0000000000000000, pid=7585, tid=140445258151680 
# 
# JRE version: OpenJDK Runtime Environment (7.0_121) (build 1.7.0_121-b00) 
# Java VM: OpenJDK 64-Bit Server VM (24.121-b00 mixed mode linux-amd64 compressed oops) 
# Derivative: IcedTea 2.6.8 
# Distribution: Ubuntu 14.04 LTS, package 7u121-2.6.8-1ubuntu0.14.04.1 
# Problematic frame: 
# C 0x0000000000000000 
# 
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again 
# 
# An error report file with more information is saved as: 
# /home/yangyong/work/hadooptrace/hadoop-2.6.5/hs_err_pid7585.log 
# 
# If you would like to submit a bug report, please include 
# instructions on how to reproduce the bug and visit: 
# http://icedtea.classpath.org/bugzilla 
#] 
A: ssh: Could not resolve hostname a: Name or service not known 
#: ssh: Could not resolve hostname #: Name or service not known 
fatal: ssh: Could not resolve hostname fatal: Name or service not known 
been: ssh: Could not resolve hostname been: Name or service not known 
#: ssh: Could not resolve hostname #: Name or service not known 
#: ssh: Could not resolve hostname #: Name or service not known 
#: ssh: Could not resolve hostname #: Name or service not known 
^COpenJDK: ssh: Could not resolve hostname openjdk: Name or service not known 
detected: ssh: Could not resolve hostname detected: Name or service not known 
version:: ssh: Could not resolve hostname version:: Name or service not known 
JRE: ssh: Could not resolve hostname jre: Name or service not known 

Is there something wrong with my code? Or is this caused by something else, such as a protection mechanism in the JVM or SSH? Thanks.

+0

Here is another example of a similar error: [link](https://sourceware.org/ml/glibc-linux/2001-q1/msg00048.html) – Chalex

Answers

0

This code causes the child thread to receive an invalid arg value:

struct thread_param temp; 
    temp.args=arg; 
    temp.start_routine=start_routine; 

    int result=old_create(thread,attr,intermedia,(void *)&temp); 
//  int result=old_create(thread,attr,start_routine,arg); 
    return result; // <-- temp and its contents are now invalid 

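temp is a local variable on the stack of your pthread_create(), so it is not guaranteed to still exist when the new thread runs: the parent's call to your pthread_create() may already have returned, which invalidates the values temp contains.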
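A minimal sketch of one way to avoid the dangling pointer, assuming the wrapper is allowed to call malloc: allocate the thread_param on the heap and let the new thread free it. The names create_with_heap_param and increment are hypothetical and used only for illustration; old_create stands for the real pthread_create, however it was obtained.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Same layout as the struct in the question. */
struct thread_param {
    void *args;
    void *(*start_routine)(void *);
};

typedef int (*P_CREATE)(pthread_t *, const pthread_attr_t *,
                        void *(*)(void *), void *);

/* Runs in the new thread. Copy the fields out and free the heap block
 * before calling the caller's routine, so nothing leaks even if that
 * routine ends the thread with pthread_exit(). */
static void *intermedia(void *arg)
{
    struct thread_param *p = arg;
    void *(*start_routine)(void *) = p->start_routine;
    void *args = p->args;
    free(p);
    /* do some other things */
    return start_routine(args);
}

/* Hypothetical helper: the part of the interposed pthread_create that
 * packages the arguments. The block lives on the heap, so it stays
 * valid after this function returns. */
static int create_with_heap_param(P_CREATE old_create, pthread_t *thread,
                                  const pthread_attr_t *attr,
                                  void *(*start_routine)(void *), void *arg)
{
    struct thread_param *p = malloc(sizeof *p);
    if (p == NULL)
        return EAGAIN;  /* pthread_create's error code for lack of resources */
    p->args = arg;
    p->start_routine = start_routine;

    int result = old_create(thread, attr, intermedia, p);
    if (result != 0)
        free(p);        /* no thread was started, so nobody else frees it */
    return result;
}

/* Hypothetical start routine, only here to exercise the helper. */
static void *increment(void *arg)
{
    int *n = arg;
    ++*n;
    return arg;
}
```

Inside the interposed pthread_create, the final lines would then reduce to `return create_with_heap_param(old_create, thread, attr, start_routine, arg);`.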

+0

Thanks! That solved the problem! – Chalex

0

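There are a number of problems in your code. I don't know which of them (if any) cause the problem you are seeing, but you should definitely fix them.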

First, you can enable core dumps (usually with ulimit -c unlimited) and load the core into GDB. See what the backtrace points to.

Don't dlopen pthreads. Instead, you should be able to simply use dlsym(RTLD_NEXT, "pthread_create").
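A minimal sketch of that lookup, assuming glibc: RTLD_NEXT is only declared by <dlfcn.h> when _GNU_SOURCE is defined before the first include. The helper names real_pthread_create and resolve_old_create are hypothetical, and using pthread_once to make the one-time lookup thread-safe is an addition of this sketch, not something the answer prescribes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     /* must come before any include to expose RTLD_NEXT */
#endif
#include <dlfcn.h>
#include <pthread.h>
#include <stddef.h>

typedef int (*P_CREATE)(pthread_t *, const pthread_attr_t *,
                        void *(*)(void *), void *);

static P_CREATE old_create;
static pthread_once_t old_create_once = PTHREAD_ONCE_INIT;

/* Runs exactly once, even if several threads race to create threads. */
static void resolve_old_create(void)
{
    /* RTLD_NEXT searches the objects loaded after the one containing
     * this call, so an interposing library finds the real definition
     * instead of its own. */
    old_create = (P_CREATE)dlsym(RTLD_NEXT, "pthread_create");
}

/* Hypothetical helper: returns the next pthread_create in load order,
 * or NULL if dlsym could not find one. */
static P_CREATE real_pthread_create(void)
{
    pthread_once(&old_create_once, resolve_old_create);
    return old_create;
}
```

The interposed pthread_create would call real_pthread_create() and check the result for NULL before forwarding to it. Older glibc versions need the library linked with -ldl for dlsym.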

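However, the most likely source of trouble is storing the original arguments in a global variable. This means that if someone (say, the Java runtime) starts many threads at the same time, you will mix up which thread was meant to do what.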

+0

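Thank you for your answer. On the first point, I am not very familiar with debugging in gdb; I enabled core dumps afterwards, but I still could not figure out the problem. On the second point, if I just use dlsym(RTLD_NEXT, "pthread_create"), it produces a warning and the JVM still crashes. On the third point, I am not quite sure which variable is global. In any case, thank you for your prompt response. – Chalex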