2014-03-12 91 views
2

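Too many open files exception on an "unlimited" system

I am getting "too many open files" exceptions while running my program. A typical one looks like this: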

org.jboss.netty.channel.ChannelException: Failed to create a selector. 

... 
Caused by: java.io.IOException: Too many open files 

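However, those are not the only exceptions. I see similar ones (also caused by "Too many open files"), but they are much less frequent.

Strangely, I have already set the open-files limit of the screen session (from which I launch my program) to 1M: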

[email protected]:~/fabiim-cbench# ulimit -a 
core file size   (blocks, -c) 0 
data seg size   (kbytes, -d) unlimited 
scheduling priority    (-e) 20 
file size    (blocks, -f) unlimited 
pending signals     (-i) 16382 
max locked memory  (kbytes, -l) 64 
max memory size   (kbytes, -m) unlimited 
**open files      (-n) 1000000** 
pipe size   (512 bytes, -p) 8 
POSIX message queues  (bytes, -q) 819200 
real-time priority    (-r) 0 
stack size    (kbytes, -s) 8192 
cpu time    (seconds, -t) unlimited 
max user processes    (-u) unlimited 
virtual memory   (kbytes, -v) unlimited 
file locks      (-x) unlimited 

In addition, judging from the output of lsof -p, I never see more than 1111 open files (sockets, pipes, files) before the exceptions are thrown.
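As a cross-check on the lsof count, the same breakdown can be taken from inside the JVM. The following is a minimal sketch, assuming a Linux system where /proc/self/fd is available; the class name `FdKinds` and its methods are hypothetical helpers, not part of Netty or Floodlight.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

class FdKinds {
    /** Classifies a /proc/<pid>/fd link target such as "socket:[12345]" or "/tmp/x.log". */
    static String kindOf(String target) {
        if (target.startsWith("socket:")) return "socket";
        if (target.startsWith("pipe:")) return "pipe";
        if (target.startsWith("anon_inode:")) return "anon_inode"; // e.g. epoll or eventfd descriptors
        return "file";
    }

    /** Counts this process's open descriptors by kind (Linux only). */
    static Map<String, Integer> countOpenFds() throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(Paths.get("/proc/self/fd"))) {
            for (Path fd : fds) {
                String kind;
                try {
                    kind = kindOf(Files.readSymbolicLink(fd).toString());
                } catch (IOException e) {
                    kind = "unreadable"; // the descriptor was closed while we were scanning
                }
                counts.merge(kind, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countOpenFds());
    }
}
```

Calling `countOpenFds()` periodically from the program under test (or from an exception handler) would show which kind of descriptor is growing, if any.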

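Question: what is wrong, and/or how can I dig deeper into this problem?

Extra: I am currently integrating Floodlight with bft-smart. In short, the Floodlight process is the one that fails with too many open files exceptions while running a stress test launched by a benchmark program. This benchmark program maintains 64 tcp connections to the Floodlight process, which in turn should maintain at least 64 * tcp connections to the bft-smart replicas. Both programs use netty to manage these connections.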

+0

Are you running jboss as root? –

+0

I don't know that I'm running jboss (I think netty and jboss are two separate things). But I do run every process as root. – fabiim

Answers

4

The first thing to check: can you run ulimit from inside the Java process to make sure the file limit is the same there? Code like this should work:

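// Requires: import java.io.InputStream;
// Run "ulimit -a" in a child shell, which inherits the resource limits of this JVM.
InputStream is = Runtime.getRuntime().exec(new String[] {"bash", "-c", "ulimit -a"}).getInputStream();
int c;
while ((c = is.read()) != -1) {
    System.out.write(c);
}
System.out.flush();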

If the limit still shows 1 million, well, you're in for some hard debugging.
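The limit can also be read without spawning a shell. This is a minimal sketch, assuming a HotSpot/OpenJDK-style JVM on a Unix system, where the operating system MXBean implements `com.sun.management.UnixOperatingSystemMXBean`; the class name `FdLimitCheck` is a hypothetical helper.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class FdLimitCheck {
    /** Returns {open, max} descriptor counts, or null if this JVM does not expose them. */
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {unix.getOpenFileDescriptorCount(), unix.getMaxFileDescriptorCount()};
        }
        return null; // not a Unix-style JVM, or the bean is not available
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this JVM");
        } else {
            System.out.println("open fds: " + counts[0] + ", max fds: " + counts[1]);
        }
    }
}
```

If the reported maximum is far below 1000000, the limit set in the shell is not the one the Java process is actually running under.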

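Here are a few things I would look into if I had to debug this:

  1. Are you running out of tcp port numbers? What does netstat -an show when you hit this error?

  2. Use strace to find out exactly which system call, with which arguments, causes this error to be thrown. EMFILE is errno value 24.

  3. The "Too many open files" EMFILE error can actually be thrown by many different system calls for many different reasons: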

    $ cd /usr/share/man/man2 
    $ zgrep -A 2 EMFILE * 
    accept.2.gz:.B EMFILE 
    accept.2.gz:The per-process limit of open file descriptors has been reached. 
    accept.2.gz:.TP 
    accept.2.gz:-- 
    accept.2.gz:.\" EAGAIN, EBADF, ECONNABORTED, EINTR, EINVAL, EMFILE, 
    accept.2.gz:.\" ENFILE, ENOBUFS, ENOMEM, ENOTSOCK, EOPNOTSUPP, EPROTO, EWOULDBLOCK. 
    accept.2.gz:.\" In addition, SUSv2 documents EFAULT and ENOSR. 
    dup.2.gz:.B EMFILE 
    dup.2.gz:The process already has the maximum number of file 
    dup.2.gz:descriptors open and tried to open a new one. 
    epoll_create.2.gz:.B EMFILE 
    epoll_create.2.gz:The per-user limit on the number of epoll instances imposed by 
    epoll_create.2.gz:.I /proc/sys/fs/epoll/max_user_instances 
    eventfd.2.gz:.B EMFILE 
    eventfd.2.gz:The per-process limit on open file descriptors has been reached. 
    eventfd.2.gz:.TP 
    execve.2.gz:.B EMFILE 
    execve.2.gz:The process has the maximum number of files open. 
    execve.2.gz:.TP 
    execve.2.gz:-- 
    execve.2.gz:.\" document ETXTBSY, EPERM, EFAULT, ELOOP, EIO, ENFILE, EMFILE, EINVAL, 
    execve.2.gz:.\" EISDIR or ELIBBAD error conditions. 
    execve.2.gz:.SH NOTES 
    fcntl.2.gz:.B EMFILE 
    fcntl.2.gz:For 
    fcntl.2.gz:.BR F_DUPFD , 
    getrlimit.2.gz:.BR EMFILE . 
    getrlimit.2.gz:(Historically, this limit was named 
    getrlimit.2.gz:.B RLIMIT_OFILE 
    inotify_init.2.gz:.B EMFILE 
    inotify_init.2.gz:The user limit on the total number of inotify instances has been reached. 
    inotify_init.2.gz:.TP 
    mmap.2.gz:.\" SUSv2 documents additional error codes EMFILE and EOVERFLOW. 
    mmap.2.gz:.SH AVAILABILITY 
    mmap.2.gz:On POSIX systems on which 
    mount.2.gz:.B EMFILE 
    mount.2.gz:(In case no block device is required:) 
    mount.2.gz:Table of dummy devices is full. 
    open.2.gz:.B EMFILE 
    open.2.gz:The process already has the maximum number of files open. 
    open.2.gz:.TP 
    pipe.2.gz:.B EMFILE 
    pipe.2.gz:Too many file descriptors are in use by the process. 
    pipe.2.gz:.TP 
    shmop.2.gz:.\" SVr4 documents an additional error condition EMFILE. 
    shmop.2.gz: 
    shmop.2.gz:In SVID 3 (or perhaps earlier) 
    signalfd.2.gz:.B EMFILE 
    signalfd.2.gz:The per-process limit of open file descriptors has been reached. 
    signalfd.2.gz:.TP 
    socket.2.gz:.B EMFILE 
    socket.2.gz:Process file table overflow. 
    socket.2.gz:.TP 
    socketpair.2.gz:.B EMFILE 
    socketpair.2.gz:Too many descriptors are in use by this process. 
    socketpair.2.gz:.TP 
    spu_create.2.gz:.B EMFILE 
    spu_create.2.gz:The process has reached its maximum open files limit. 
    spu_create.2.gz:.TP 
    timerfd_create.2.gz:.B EMFILE 
    timerfd_create.2.gz:The per-process limit of open file descriptors has been reached. 
    timerfd_create.2.gz:.TP 
    truncate.2.gz:.\" error conditions EMFILE, EMULTIHP, ENFILE, ENOLINK. SVr4 documents for 
    truncate.2.gz:.\" .BR ftruncate() 
    truncate.2.gz:.\" an additional EAGAIN error condition. 
    

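    If you check all of these man pages by hand, you may find something interesting. For example, I find it interesting that epoll_create, the underlying system call used by NIO channels, will return EMFILE "Too many open files" if

    The per-user limit on the number of epoll instances imposed by /proc/sys/fs/epoll/max_user_instances was encountered. See epoll(7) for further details.

    Now, that filename doesn't actually exist on my system, but there are some limits defined in files under /proc/sys/fs/epoll and /proc/sys/fs/inotify that you might be hitting, especially if you are running multiple instances of the same test on the same machine. Figuring out whether that is the case is a chore in itself; you could start by checking syslog for any messages...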
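To see which limits are actually defined on a given machine, the files in those two directories can simply be listed and printed. The following is a minimal sketch, assuming a Linux system; the class name `KernelFsLimits` is a hypothetical helper, and no particular file name inside the directories is assumed, since the set of files may differ between kernels.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

class KernelFsLimits {
    /** Reads every regular file in dir into a name -> trimmed-content map; empty if dir is missing. */
    static Map<String, String> readLimits(Path dir) throws IOException {
        Map<String, String> limits = new TreeMap<>();
        if (!Files.isDirectory(dir)) {
            return limits;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (!Files.isRegularFile(entry)) {
                    continue;
                }
                String name = entry.getFileName().toString();
                try {
                    byte[] raw = Files.readAllBytes(entry);
                    limits.put(name, new String(raw, StandardCharsets.UTF_8).trim());
                } catch (IOException e) {
                    limits.put(name, "<unreadable>"); // e.g. insufficient permissions
                }
            }
        }
        return limits;
    }

    public static void main(String[] args) throws IOException {
        for (String dir : new String[] {"/proc/sys/fs/epoll", "/proc/sys/fs/inotify"}) {
            System.out.println(dir + " -> " + readLimits(Paths.get(dir)));
        }
    }
}
```

Comparing these values with the number of epoll or inotify instances the test creates per user would indicate whether one of these per-user limits, rather than the per-process descriptor limit, is the one being hit.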

Good luck!
