2017-04-25 74 views

I am running this TensorFlow model on an AWS p2.xlarge instance, and it fails with a MemoryError:

Exception in thread Thread-16: 
Traceback (most recent call last): 
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner 
self.run() 
File "/usr/lib/python2.7/threading.py", line 754, in run 
self.__target(*self.__args, **self.__kwargs) 
File "/home/ubuntu/tensorflow/models/summarization/textsum/batch_reader.py", line 136, in _FillInputQueue 
(article, abstract) = input_gen.next() 
File "/home/ubuntu/tensorflow/models/summarization/textsum/batch_reader.py", line 245, in _TextGenerator 
e = example_gen.next() 
File "/home/ubuntu/tensorflow/models/summarization/textsum/data.py", line 109, in ExampleGen 
example_str = struct.unpack('%ds' % str_len, reader.read(str_len))[0] 
MemoryError 

The system storage information is:

Filesystem Size Used Avail Use% Mounted on 
udev 30G 0 30G 0% /dev 
tmpfs 6.0G 8.9M 6.0G 1% /run 
/dev/xvda1 30G 12G 18G 39% /
tmpfs 30G 0 30G 0% /dev/shm 
tmpfs 5.0M 0 5.0M 0% /run/lock 
tmpfs 30G 0 30G 0% /sys/fs/cgroup 
tmpfs 6.0G 0 6.0G 0% /run/user/1000 

NVIDIA status:

$ lspci | grep -i nvidia 

00:1e.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)

What is the solution for this?

If I replace `str_len = struct.unpack('q', len_bytes)[0]` with `str_len = struct.unpack('Bi', len_bytes)[0]`, this error disappears and a new one appears:

Exception in thread Thread-15: 
Traceback (most recent call last): 
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner 
self.run() 
File "/usr/lib/python2.7/threading.py", line 754, in run 
self.__target(*self.__args, **self.__kwargs) 
File "/home/mindstix/bazel/models/Summarizer/textsum/batch_reader.py", line 136, in _FillInputQueue 
(article, abstract) = input_gen.next() 
File "/home/mindstix/bazel/models/Summarizer/textsum/batch_reader.py", line 248, in _TextGenerator 
article_text = self._GetExFeatureText(e, self._article_key) 
File "/home/mindstix/bazel/models/Summarizer/textsum/batch_reader.py", line 265, in _GetExFeatureText 
return ex.features.feature[key].bytes_list.value[0] 
IndexError: list index (0) out of range 

If I print `example_str` to the screen, its value is shown. But when I try to print `ex.features.feature[key].bytes_list.value`, it comes back empty.

What should I do to resolve all of this?

These are the steps I followed:

>>> import tensorflow as tf 
>>> import struct 
>>> from tensorflow.core.example import example_pb2 
>>> reader = open('data/training-1', 'rb') 
>>> len_bytes = reader.read(8) 
>>> str_len = struct.unpack('q', len_bytes)[0] 
>>> str_len 
2335523720558635124 
>>> example_str = struct.unpack('%ds' % str_len, reader.read(str_len))[0] 
Traceback (most recent call last): 
File "<stdin>", line 1, in <module> 
MemoryError 

>>> str_len = struct.unpack('Bi', len_bytes)[0] 
>>> str_len 
116 

>>> example_str = struct.unpack('%ds' % str_len, reader.read(str_len))[0] 
>>> e = example_pb2.Example.FromString(example_str) 
>>> e.features.feature['article'].bytes_list.value 
<google.protobuf.pyext._message.RepeatedScalarContainer object at 0x7fc25c9325a8> 

>>> e.features.feature['article'].bytes_list.value[0] 
Traceback (most recent call last): 
File "<stdin>", line 1, in <module> 
IndexError: list index (0) out of range 
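
A side note on the two length values in the session above, offered as a sketch rather than a definitive diagnosis. Assuming the machine is little-endian (true for x86-64, but an assumption here) and Python 3 is used for the demonstration, re-packing the huge `str_len` value recovers the 8 bytes that were read from the file. They turn out to be printable ASCII, which suggests the file begins with plain text instead of a binary length prefix. The `'Bi'` format only seems to help because its first field is a single unsigned byte, so `[0]` is just the code of the first character.

```python
import struct

# Value printed for str_len in the session above.
bogus_len = 2335523720558635124

# Re-pack it as a little-endian signed 64-bit integer to recover the
# 8 bytes that were read from the file.
# Assumption: the original machine was little-endian.
raw = struct.pack('<q', bogus_len)
print(raw)  # b'the sri '

# Native 'Bi' is one unsigned byte, padding, then an int (8 bytes in total
# on common platforms), so element [0] is only the first byte of the header.
first_byte = struct.unpack('Bi', raw)[0]
print(first_byte, chr(first_byte))  # 116 t
```

So neither value is a real record length: both are pieces of the text at the start of the file.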

Without more code as context, it is hard to say anything. Could you condense this into a minimal but runnable example?


@AllenLavoie I have updated the question with the sample code that I am trying to run with tensorflow.


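So the article feature is empty? Is there a reason to think it shouldn't be? Printing the whole example (`print(e)`) to see what was parsed might be useful. I'm also not sure what is going on with the `struct` usage: maybe the [TFRecord](https://www.tensorflow.org/api_guides/python/python_io) format would be a more stable storage format?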

Answers


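I got stuck on the same problem. In my case the cause was that I was testing with the raw text file; the converted binary file should be used instead. I am not sure whether your situation is the same as mine.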


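My error is resolved. It was caused by the input file not being in the binary format that tensorflow/textsum expects. The `read()` call needs to be given the number of data bytes to read, and in `example_str = struct.unpack('%ds' % str_len, reader.read(str_len))[0]` I was passing it an invalid size. As a result, `e.features.feature['article'].bytes_list.value` contained nothing; the object was empty at that point. I converted the text file to the format tensorflow accepts with this script: https://github.com/surmenok/TextSum/blob/master/textsum_data_convert.py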
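
For readers hitting the same error, here is a minimal sketch of the length-prefixed framing implied by the reader lines in the traceback (an 8-byte `'q'` length followed by that many payload bytes). The helper names `write_record` and `read_records` are hypothetical and are not part of textsum. The payload is treated as opaque bytes; in textsum it would be a serialized `tf.Example`. The length check against the remaining file size is an added safeguard, not something the original code does, and the example assumes Python 3.

```python
import io
import struct

HEADER = struct.Struct('q')  # 8-byte signed length, native byte order


def write_record(stream, payload):
    """Write one record: the payload length, then the payload bytes."""
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)


def read_records(stream, total_size):
    """Yield payloads, rejecting lengths that cannot fit in the input."""
    while True:
        len_bytes = stream.read(HEADER.size)
        if not len_bytes:
            return  # clean end of input
        if len(len_bytes) < HEADER.size:
            raise ValueError('truncated length header')
        str_len = HEADER.unpack(len_bytes)[0]
        remaining = total_size - stream.tell()
        if str_len < 0 or str_len > remaining:
            # Typical symptom of feeding a raw text file to the reader.
            raise ValueError('implausible record length %d' % str_len)
        yield stream.read(str_len)


# Round trip through an in-memory buffer standing in for the data file.
good = io.BytesIO()
write_record(good, b'first payload')
write_record(good, b'second payload')
size = good.tell()
good.seek(0)
print(list(read_records(good, size)))

# Plain text is rejected with a clear error instead of a MemoryError.
text = b'the sri lankan government'
try:
    list(read_records(io.BytesIO(text), len(text)))
except ValueError as err:
    print('rejected:', err)
```

With a real file, `total_size` could come from `os.path.getsize(path)`. Failing early on an impossible length makes a wrongly formatted input obvious before any large allocation is attempted.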