Why does this RuntimeError occur?
Environment:
RTX 3090
CUDA:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
Error:
06/19/2021 16:37:19 - INFO - __main__ - device: cuda, n_gpu: 2, 16-bits training: False
06/19/2021 16:51:17 - INFO - __main__ - Start epoch #0 (lr = 4e-05)...
Traceback (most recent call last):
File "code/run_trigger_qa.py", line 629, in <module>
main(args)
File "code/run_trigger_qa.py", line 480, in main
loss = model(input_ids, token_type_ids = segment_ids, attention_mask = input_mask, labels = labels)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
output.reraise()
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/_utils.py", line 369, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
output = module(*input, **kwargs)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/zf1/xqs/eeqa-master/code/pytorch_pretrained_bert/modeling.py", line 1198, in forward
sequence_output, _ = self.bert(input_ids, token_type_ids, attention_mask, output_all_encoded_layers=False)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/zf1/xqs/eeqa-master/code/pytorch_pretrained_bert/modeling.py", line 734, in forward
output_all_encoded_layers=output_all_encoded_layers)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/zf1/xqs/eeqa-master/code/pytorch_pretrained_bert/modeling.py", line 411, in forward
hidden_states = layer_module(hidden_states, attention_mask)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/zf1/xqs/eeqa-master/code/pytorch_pretrained_bert/modeling.py", line 396, in forward
attention_output = self.attention(hidden_states, attention_mask)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/zf1/xqs/eeqa-master/code/pytorch_pretrained_bert/modeling.py", line 354, in forward
self_output = self.self(input_tensor, attention_mask)
File "/home/hdu/anaconda3/envs/ace-event-qa/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/zf1/xqs/eeqa-master/code/pytorch_pretrained_bert/modeling.py", line 311, in forward
attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:331
Has this problem been solved? I get the same error, and I have checked that my environment is consistent with requirements.txt. Could anyone give me a hand? Thank you very much!
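For anyone comparing setups, here is a minimal sketch of an environment check in Python. It reuses the nvcc output pasted in the report and treats PyTorch as optional; `parse_nvcc_release` and `describe_environment` are hypothetical helper names, not part of the repository. Note that `nvcc` reports the system toolkit, while `torch.version.cuda` reports the CUDA version the installed PyTorch binary was built with, so the two can differ. One possible cause worth ruling out (not confirmed in this thread) is that the RTX 3090 is an Ampere card with compute capability 8.6, which is generally expected to need a CUDA 11 build of PyTorch.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


# Output copied from the report; replace with your own `nvcc --version` text.
NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85"""


def describe_environment():
    """Collect version info; the torch import is optional so this runs anywhere."""
    info = {"nvcc_release": parse_nvcc_release(NVCC_OUTPUT)}
    try:
        import torch
    except ImportError:
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    # CUDA version the PyTorch binary was built with (None for CPU-only builds).
    info["torch_cuda"] = torch.version.cuda
    if torch.cuda.is_available():
        info["devices"] = [
            (torch.cuda.get_device_name(i), torch.cuda.get_device_capability(i))
            for i in range(torch.cuda.device_count())
        ]
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If a device reports capability (8, 6) while `torch_cuda` shows 9.x or 10.x, that mismatch would be the first thing I would look at before digging into the model code.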