Windows 10: Error! Illegal memory address
Discussion in ‘Windows 10 Ask Insider’ started by /u/HappyViking_, May 2, 2020.
-
Error! Illegal memory address
Hello,
I am trying to run an old pc game on my laptop. I’ve changed the graphics to suit the old game, I’ve downloaded DirectX but an error keeps popping up “Illegal memory address”.
Does anyone know what this means and how I can fix it please?
submitted by /u/HappyViking_
-
cuda error — an illegal memory access encountered
I have a P106-100 Graphics card which crashed halfway through mining.
When I now run the miner software I get «cuda error — an illegal memory access encountered»
I have rebooted my PC and I still cannot resolve the issue.
Is my graphics card knackered? Is there a way to reset the memory? (I am assuming the memory is somehow not being refreshed after a device power-off.)
-
BSOD Memory Addresses
A lot of the time the issue is a bad device driver that mishandles memory, so Windows shuts down to prevent the driver from overwriting the active pages of other processes, as there is no alternative.
If you post the whole error or the dumps, sometimes you can just look at the memory address the fault occurred at, and then look through the process list to see what was using it and what tried to access it.
If it is a hardware failure, then you can use memtest (x86 or x64) to diagnose that part, and you can run certain tests on your CPU. However, some of the advanced functions may not be covered by any test, such as the decode logic used for specific instruction sets, where a fault can corrupt the items being processed.
However, applying Occam’s razor, the likeliest cause is a simple driver, software or misconfiguration error.
-
Error! Illegal memory address
Overclocking / Undervolting guide for Vega 56 or 64?
Here’s a quick laundry list:
List of software to use for overclocking and testing
Examples:
Wattman (and how to find and use it, like an overview, including profiles)
Unigine Valley or Heaven (use this for quick testing while changing settings in Wattman and checking for stability / artifacts) …just suggesting this
How to monitor core / memory speeds and temps during testing (I’ve seen screen overlays, and others using GPU-Z)
Step-by-step overclocking in Wattman
Fan speeds
Power limit
Temp limit
Voltages
Core speeds
Memory speeds

Invalid Memory Address or Nil Pointer Dereference
Overview
Example error:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x497783]
This issue happens when you reference an unassigned (nil) pointer.
Go uses an asterisk (*) to specify a pointer to a variable, and an ampersand (&) to generate a pointer to a given variable. For example:
var p *int // p is a pointer to an integer
p := &i // p is a pointer to the variable i
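The two operators can be seen together in a minimal sketch. The helper name increment is hypothetical and only serves the illustration; it shows a pointer that starts out nil, is then given an address with &, and is read and written through *.

```go
package main

import "fmt"

// increment adds one to the int that p points to.
// It returns false without touching memory when p is nil.
func increment(p *int) bool {
	if p == nil {
		return false
	}
	*p++ // dereference p and modify the value it points to
	return true
}

func main() {
	var p *int // declared but unassigned, so p is nil
	fmt.Println(p == nil, increment(p)) // true false

	i := 41
	p = &i // & yields the address of i
	increment(p)
	fmt.Println(i, *p) // 42 42
}
```

Because p holds the address of i, the change made through *p is visible when reading i directly.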
Initial Steps Overview
1) Check if the pointer is being set
2) Check for a nil assignment to the pointer
Detailed Steps
1) Check if the pointer is being set
Before a pointer can be referenced, it needs to have something assigned to it.
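package main

import "fmt"

type person struct {
	Name string
	Age  int
}

func main() {
	var myPointer *person       // declared but never assigned, so it is nil
	fmt.Println(myPointer.Name) // dereferencing the nil pointer triggers the panic
}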
$ go run main.go
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x497783]
The above code causes Go to panic because, while myPointer has been declared as a pointer to a person object, it does not yet point to a particular instance of such an object. This can be solved by assigning a variable to the pointer (Solution A) or by creating and referencing a new object (Solution C).
2) Check for a nil assignment to the pointer
Another possibility is that the pointer is being set to nil somewhere in your code. This could be, for example, a function returning nil after failing to accomplish a task.
package main

import "fmt"

type NumberObject struct {
	number int
}

// createNumberObj returns a pointer to a NumberObject if the number is allowed, otherwise nil.
func createNumberObj(num int) (result *NumberObject) {
	if num < 100 {
		numberObj := NumberObject{num}
		return &numberObj // Returns a reference to the new object
	}
	return nil
}

func main() {
	myNumberObject := createNumberObj(101)
	fmt.Println(myNumberObject.number) // This will cause Go to panic!
}
Here a NumberObject is created only if the input is less than 100. A pointer is returned by the function either way, but the main() function is not prepared to handle the nil pointer that is returned when the condition isn’t met, causing Go to panic. It would be best to either handle the nil pointer as demonstrated in Solution B, or create and reference an empty object as per Solution C.
Solutions List
A) Assign a variable to the pointer
B) Handle the nil pointer
C) Create and reference a new variable
Solutions Detail
A) Assign a variable to the pointer
package main

import "fmt"

type Person struct {
	Name string
	Age  int
}

func main() {
	var myPointer *Person // This is currently a nil pointer
	aPerson := Person{
		Name: "Joey",
		Age:  29,
	} // This is a Person object
	myPointer = &aPerson        // Now the pointer references the same Person as 'aPerson'
	fmt.Println(myPointer.Name) // Prints "Joey"
}
Here the variable aPerson is created, after which we can use the Go & syntax to get this variable’s reference (location in memory) and assign it to myPointer. Importantly, myPointer now points at aPerson’s location in memory, so a modification made through either name is visible through both.
func main() {
	anotherPerson := Person{}
	var myPointer *Person
	myPointer = &anotherPerson
	anotherPerson.Name = "Pete"
	fmt.Println(myPointer.Name) // This will print "Pete"!
}
B) Handle the nil pointer
Going back to the code used in Step 2, we can expand this to check for and ‘handle’ a nil value before the code continues.
package main

import (
	"fmt"
	"log"
)

type NumberObject struct {
	number int
}

// createNumberObj returns a pointer to a NumberObject if the number is allowed, otherwise nil.
func createNumberObj(num int) (result *NumberObject) {
	if num < 100 {
		numberObj := NumberObject{num}
		return &numberObj // Returns a reference to the new object
	}
	return nil
}

func main() {
	myNumberObject := createNumberObj(101)
	if myNumberObject == nil {
		log.Fatal("Failed to create number object!")
	}
	fmt.Println(myNumberObject.number) // This line is not reached if num >= 100!
}
There are several ways this situation could be handled; in this particular instance the log.Fatal function will terminate the program if myNumberObject is a nil pointer. Another option would be to return an error from the createNumberObj function informing the caller that the input number was too high. A further option, and one which would allow the program to continue executing, would be to create and reference a new variable, as seen in Solution C.
C) Create and reference a new variable
package main

import "fmt"

type NumberObject struct {
	number int
}

func createNumberObj(num int) (result *NumberObject) {
	numberObj := NumberObject{} // Creates a new, empty NumberObject
	if num < 100 {
		numberObj.number = num
	}
	return &numberObj // Returns a reference to the new object
}

func main() {
	myNumberObject := createNumberObj(101)
	fmt.Println(myNumberObject.number) // Will print 0 instead of causing a panic!
}
Here we have restructured the program to first create a new variable, and then edit the properties of this object as the program executes, removing the risk of the function returning a nil pointer. If it is not explicitly set, the value of NumberObject.number defaults to 0 rather than nil, as you might expect. This is because Go has default ‘zero values’ for all its types (see ‘Golang default zero values’ under Further Information).
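As a small illustration of zero values, the sketch below declares a few variables without initialising them. The helper newEmpty is a hypothetical name used only here. Note that the zero value of a pointer or a map is nil, which is why an unassigned pointer panics on dereference while an unassigned int field simply reads as 0.

```go
package main

import "fmt"

type NumberObject struct {
	number int
}

// newEmpty returns a pointer to a NumberObject whose fields hold their zero values.
func newEmpty() *NumberObject {
	return &NumberObject{}
}

func main() {
	var n int
	var s string
	var b bool
	var p *NumberObject
	var m map[string]int

	// int -> 0, string -> "", bool -> false, pointer -> nil, map -> nil
	fmt.Println(n, s == "", b, p == nil, m == nil) // 0 true false true true
	fmt.Println(newEmpty().number)                 // 0
}
```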
Further Information
Gotcha Nil Pointer Dereference
Golang default zero values
Go Pointers
Owner
Joey
I am trying to set up a deep learning machine with dual RTX 3070 GPUs on Ubuntu 20.04. I have installed NVIDIA driver 460, CUDA 11.2 and cuDNN 8.1. When I try to test the GPUs with sample TensorFlow code, I get CUDA_ERROR_ILLEGAL_ADDRESS on both GPUs. Can someone let me know what the issue is?
I hit this issue with Python 3.8 and 3.9, and with TensorFlow 2.5.0 and 2.9.0.
Mon Jun 27 14:10:22 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03 Driver Version: 460.91.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce RTX 3070 Off | 00000000:09:00.0 Off | N/A |
| 57% 46C P8 28W / 270W | 15MiB / 7979MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 3070 Off | 00000000:0A:00.0 Off | N/A |
| 0% 48C P8 23W / 270W | 5MiB / 7982MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1264 G /usr/lib/xorg/Xorg 9MiB |
| 0 N/A N/A 1463 G /usr/bin/gnome-shell 3MiB |
| 1 N/A N/A 1264 G /usr/lib/xorg/Xorg 4MiB |
+-----------------------------------------------------------------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
2022-06-27 13:59:11.843491: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-06-27 13:59:12.901104: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2022-06-27 13:59:12.943243: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:12.943685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:09:00.0 name: GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.815GHz coreCount: 46 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2022-06-27 13:59:12.943725: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:12.944127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:0a:00.0 name: GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.815GHz coreCount: 46 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2022-06-27 13:59:12.944141: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-06-27 13:59:12.945421: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-06-27 13:59:12.945447: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-06-27 13:59:12.945900: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2022-06-27 13:59:12.946021: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2022-06-27 13:59:12.946360: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2022-06-27 13:59:12.946647: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2022-06-27 13:59:12.946717: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2022-06-27 13:59:12.946758: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:12.947192: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:12.947610: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:12.948028: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:12.948421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0, 1
2022-06-27 13:59:12.948662: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-27 13:59:13.250592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.250986: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:09:00.0 name: GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.815GHz coreCount: 46 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2022-06-27 13:59:13.251025: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.251384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:0a:00.0 name: GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.815GHz coreCount: 46 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2022-06-27 13:59:13.251420: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.251794: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.252260: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.252633: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.252984: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0, 1
2022-06-27 13:59:13.253013: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-06-27 13:59:13.721007: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-06-27 13:59:13.721033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0 1
2022-06-27 13:59:13.721041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N N
2022-06-27 13:59:13.721045: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 1: N N
2022-06-27 13:59:13.721180: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.721614: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.722002: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.722382: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.722757: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.723125: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6114 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:09:00.0, compute capability: 8.6)
2022-06-27 13:59:13.723332: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-27 13:59:13.723699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 6126 MB memory) -> physical GPU (device: 1, name: GeForce RTX 3070, pci bus id: 0000:0a:00.0, compute capability: 8.6)
2022-06-27 13:59:14.182153: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2022-06-27 13:59:14.201622: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3700310000 Hz
Epoch 1/10
2022-06-27 13:59:14.370438: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-06-27 13:59:14.763996: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-06-27 13:59:14.764035: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
1563/1563 [==============================] - 3s 1ms/step - loss: 1.8131 - accuracy: 0.3549
Epoch 2/10
500/1563 [========>.....................] - ETA: 1s - loss: 1.6540 - accuracy: 0.4167
Traceback (most recent call last):
File "/home/vicky/testtf/testf.py", line 24, in <module>
model_gpu.fit(X_train_scaled, y_train_encoded, epochs = 10)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/engine/training.py", line 1188, in fit
callbacks.on_train_batch_end(end_step, logs)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/callbacks.py", line 457, in on_train_batch_end
self._call_batch_hook(ModeKeys.TRAIN, 'end', batch, logs=logs)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/callbacks.py", line 317, in _call_batch_hook
self._call_batch_end_hook(mode, batch, logs)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/callbacks.py", line 337, in _call_batch_end_hook
self._call_batch_hook_helper(hook_name, batch, logs)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/callbacks.py", line 375, in _call_batch_hook_helper
hook(batch, logs)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/callbacks.py", line 1029, in on_train_batch_end
self._batch_update_progbar(batch, logs)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/callbacks.py", line 1101, in _batch_update_progbar
logs = tf_utils.sync_to_numpy_or_python_type(logs)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/utils/tf_utils.py", line 519, in sync_to_numpy_or_python_type
return nest.map_structure(_to_single_numpy_or_python_type, tensors)
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/util/nest.py", line 867, in map_structure
structure[0], [func(*x) for x in entries],
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/util/nest.py", line 867, in <listcomp>
structure[0], [func(*x) for x in entries],
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/keras/utils/tf_utils.py", line 515, in _to_single_numpy_or_python_type
x = t.numpy()
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 1094, in numpy
maybe_arr = self._numpy() # pylint: disable=protected-access
File "/home/vicky/testtf/tf/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 1062, in _numpy
six.raise_from(core._status_to_exception(e.code, e.message), None) # pylint: disable=protected-access
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError: Could not synchronize CUDA stream: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
This is the sample code I am running:
import tensorflow as tf
from tensorflow import keras
import numpy as np

(X_train, y_train), (X_test, y_test) = keras.datasets.cifar10.load_data()

# scaling image values between 0-1
X_train_scaled = X_train/255
X_test_scaled = X_test/255

# one hot encoding labels
y_train_encoded = keras.utils.to_categorical(y_train, num_classes = 10, dtype = 'float32')
y_test_encoded = keras.utils.to_categorical(y_test, num_classes = 10, dtype = 'float32')

def get_model():
    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(32,32,3)),
        keras.layers.Dense(3000, activation='relu'),
        keras.layers.Dense(1000, activation='relu'),
        keras.layers.Dense(10, activation='sigmoid')
    ])
    model.compile(optimizer='SGD',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

with tf.device('/GPU:0'):
    model_gpu = get_model()
    model_gpu.fit(X_train_scaled, y_train_encoded, epochs = 10)
I get the same error on AWS instances running in my EKS cluster.
My environment:
- instance type: p2.xlarge (having K80 GPUs)
- architecture: linux (amd64)
- AMI: 0dec13e4e8336258b (Amazon EKS optimized accelerated Amazon Linux AMI, it is based on Amazon Linux 2)
- Kernel version: 5.4.188-104.359.amzn2.x86_64
- TensorFlow 2.8.0 (my image is based on tensorflow/tensorflow:2.8.0-gpu)
- nvidia-smi output: NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4
- cuDNN version 8400
- container runtime: docker 20.10.13 (this is actually weird because the AMI’s documentation says it uses nvidia-container-runtime by default)
- kubelet version: v1.22.6-eks
EDIT: The Kubernetes cluster is running with NVIDIA’s device plugin, namely nvcr.io/nvidia/k8s-device-plugin:v0.9.0. The version v0.9.0 was selected automatically during cluster setup via eksctl.
EDIT 2: When I add CUDA_LAUNCH_BLOCKING=1 to the environment variables of my container, nothing changes. By the way, I don’t know if this is related, but before the «illegal memory» error I get multiple occurrences of this message: "tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero".
This is my full log output:
2022-05-05 15:21:23.492591: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.500187: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.501210: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.502565: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-05-05 15:21:23.503011: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.503983: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.505057: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.990100: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.991095: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.992002: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-05-05 15:21:23.992862: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10794 MB memory: -> device: 0, name: Tesla K80, pci bus id: 0000:00:1e.0, compute capability: 3.7
2022-05-05 15:21:27,865 | train_model.py | INFO | Start of training for 10 epochs
2022-05-05 15:21:41.989628: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8400
2022-05-05 15:21:43.244979: E tensorflow/stream_executor/gpu/gpu_timer.cc:87] INTERNAL: Error recording CUDA event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-05-05 15:21:43.245036: E tensorflow/stream_executor/gpu/gpu_timer.cc:55] INTERNAL: Error destroying CUDA event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-05-05 15:21:43.245053: E tensorflow/stream_executor/gpu/gpu_timer.cc:60] INTERNAL: Error destroying CUDA event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-05-05 15:21:43.245110: E tensorflow/stream_executor/stream.cc:4476] INTERNAL: Failed to enqueue async memset operation: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-05-05 15:21:43.245131: E tensorflow/stream_executor/stream.cc:4476] INTERNAL: Failed to enqueue async memset operation: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-05-05 15:21:43.245169: F tensorflow/stream_executor/cuda/cuda_dnn.cc:215] Check failed: status == CUDNN_STATUS_SUCCESS (7 vs. 0)Failed to set cuDNN stream.
EDIT 4: When the error occurs, I get the following messages in the instance’s dmesg output:
[ 449.073780] NVRM: GPU at PCI:0000:00:1e: GPU-42f4594a-a444-69c0-a059-cf233ef43e4b
[ 449.080560] NVRM: Xid (PCI:0000:00:1e): 31, pid=16615, Ch 00000017, intr 10000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_0 faulted @ 0x0_00000000. Fault is of type FAULT_PDE ACCESS_TYPE_READ
package main

import (
	"fmt"
	"go_blog/models"
	"net/http"
	"text/template"
)

// posts stores the submitted form data, keyed by post ID
var posts map[string]*models.Post

// indexHandler renders the main page
func indexHandler(w http.ResponseWriter, r *http.Request) {
	t, err := template.ParseFiles("templates/index.html", "templates/header.html", "templates/footer.html")
	if err != nil {
		fmt.Fprint(w, err.Error())
		return
	}
	fmt.Println(posts)
	t.ExecuteTemplate(w, "index", posts)
}

// writeHandler renders the empty form
func writeHandler(w http.ResponseWriter, r *http.Request) {
	t, err := template.ParseFiles("templates/write.html", "templates/header.html", "templates/footer.html")
	if err != nil {
		fmt.Fprint(w, err.Error())
		return
	}
	t.ExecuteTemplate(w, "write", nil)
}

// editHandler renders the form for editing an existing post
func editHandler(w http.ResponseWriter, r *http.Request) {
	t, err := template.ParseFiles("templates/write.html", "templates/header.html", "templates/footer.html")
	if err != nil {
		fmt.Fprint(w, err.Error())
		return
	}
	id := r.FormValue("id")
	post, found := posts[id]
	if !found {
		http.NotFound(w, r)
		return // stop here so a nil post is never passed to the template
	}
	t.ExecuteTemplate(w, "write", post)
}

// savePostHandler saves the submitted form data
func savePostHandler(w http.ResponseWriter, r *http.Request) {
	id := r.FormValue("id")
	title := r.FormValue("title")
	content := r.FormValue("content")
	if id != "" {
		// A missing key yields a nil *models.Post, so check before dereferencing
		post, found := posts[id]
		if !found {
			http.NotFound(w, r)
			return
		}
		post.Title = title
		post.Content = content
	} else {
		id = GenerateID() // GenerateID is defined elsewhere in this package (not shown)
		post := models.NewPost(id, title, content)
		posts[post.Id] = post
	}
	http.Redirect(w, r, "/", 302)
}

func main() {
	fmt.Println("Listening port: 3000")
	posts = make(map[string]*models.Post, 0)
	// /css/app.css
	http.Handle("/assets/", http.StripPrefix("/assets/", http.FileServer(http.Dir("./assets"))))
	http.HandleFunc("/", indexHandler)
	http.HandleFunc("/write", writeHandler)
	http.HandleFunc("/edit", editHandler)
	http.HandleFunc("/SavePost", savePostHandler)
	http.ListenAndServe(":3000", nil)
}
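A common source of the nil pointer panic in handlers like these is a map of pointers: looking up a key that is not present returns nil rather than failing. The following is a minimal sketch of a guarded lookup, assuming a hypothetical Post type standing in for models.Post and a hypothetical helper updateTitle; neither is part of the listing above.

```go
package main

import "fmt"

// Post is a hypothetical stand-in for models.Post.
type Post struct {
	Title string
}

// updateTitle changes the title of the post stored under id.
// It reports false instead of panicking when id is unknown.
func updateTitle(posts map[string]*Post, id, title string) bool {
	post, found := posts[id] // found is false and post is nil for a missing key
	if !found || post == nil {
		return false
	}
	post.Title = title
	return true
}

func main() {
	posts := map[string]*Post{"1": {Title: "old"}}
	fmt.Println(updateTitle(posts, "1", "new"), posts["1"].Title) // true new
	fmt.Println(updateTitle(posts, "missing", "new"))             // false
}
```

The two-value form of the map lookup is what makes the missing-key case visible before any field of the pointer is touched.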