forked from NVIDIA/apex
Labels: bug
Hi team,

I tried to benchmark the MLP implementation with the following:

Env setup:
GPU: MI308
ROCm: 6.1.2.60102-119~20.04
PyTorch: 2.4.0.dev20240501+rocm6.1

How to reproduce:
cd apex/
pip install -r requirements.txt
python3 setup.py install
cd tests/L0/run_mlp
python3 test_mlp.py

Accuracy errors:
FAIL: test_no_bias (__main__.TestMLP)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/apex-1.3.0-py3.9-linux-x86_64.egg/apex/testing/common_utils.py", line 32, in wrapper
fn(*args, **kwargs)
File "/workspace/apex/tests/L0/run_mlp/test_mlp.py", line 77, in test_no_bias
np.testing.assert_allclose(
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-05, atol=1e-07
Mismatched elements: 2 / 1024 (0.195%)
Max absolute difference: 1.8835999e-07
Max relative difference: 0.00286722
x: array([[ 0.027259],
[-0.054091],
[-0.003985],...
y: array([[ 0.027259],
[-0.054091],
[-0.003985],...
FAIL: test_no_grad (__main__.TestMLP)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/apex-1.3.0-py3.9-linux-x86_64.egg/apex/testing/common_utils.py", line 32, in wrapper
fn(*args, **kwargs)
File "/workspace/apex/tests/L0/run_mlp/test_mlp.py", line 163, in test_no_grad
np.testing.assert_allclose(
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-05, atol=1e-07
Mismatched elements: 140028 / 491520 (28.5%)
Max absolute difference: 7.2151306e-07
Max relative difference: 951.6179
x: array([[-2.597046e-05, 4.191594e-06, -6.009603e-05, ..., 2.606537e-04,
6.171300e-05, -6.382344e-05],
[ 1.673573e-05, -5.885254e-05, -8.349993e-05, ..., -7.531334e-05,...
y: array([[-2.654276e-05, 3.729273e-06, -6.010690e-05, ..., 2.608544e-04,
6.124897e-05, -6.427429e-05],
[ 1.673585e-05, -5.885251e-05, -8.349984e-05, ..., -7.531326e-05,...
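The 28.5% mismatch with a huge max relative difference in test_no_grad looks like accumulation-order noise on near-zero reference values rather than an outright wrong result: float32 addition is not associative, so a different GPU reduction order (e.g. a different matmul kernel on ROCm than on CUDA) can legitimately perturb the low-order bits of the gradients. A minimal illustration in plain NumPy (not the apex code):

```python
import numpy as np

# float32 addition is not associative: reordering a reduction changes
# the low-order bits, which is exactly what a different GPU
# matmul/reduction kernel can do to gradient values.
a = np.float32(1e8)
b = np.float32(1.0)
print((a + b) - a)  # 0.0 -- b is absorbed when added to the large term first
print(b + (a - a))  # 1.0 -- same operands, different order
```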
FAIL: test_with_bias (__main__.TestMLP)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/apex-1.3.0-py3.9-linux-x86_64.egg/apex/testing/common_utils.py", line 32, in wrapper
fn(*args, **kwargs)
File "/workspace/apex/tests/L0/run_mlp/test_mlp.py", line 116, in test_with_bias
np.testing.assert_allclose(
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numpy/testing/_private/utils.py", line 1530, in assert_allclose
assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numpy/testing/_private/utils.py", line 844, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-05, atol=1e-07
Mismatched elements: 3 / 1024 (0.293%)
Max absolute difference: 1.899898e-07
Max relative difference: 0.00063155
x: array([[-0.128916],
[-0.052111],
[ 0.001069],...
y: array([[-0.128916],
[-0.052111],
[ 0.001069],...
----------------------------------------------------------------------
Ran 6 tests in 16.497s
FAILED (failures=3)

Is a special torch/ROCm version required for this benchmark?

Many thanks,
David
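For reference, np.testing.assert_allclose passes element-wise when |actual - desired| <= atol + rtol * |desired|, so an absolute difference of ~1.9e-7 fails at atol=1e-7 whenever the reference value is small. A sketch with illustrative values (not taken from the test):

```python
import numpy as np

# assert_allclose passes element-wise when
#   |actual - desired| <= atol + rtol * |desired|
# so with rtol=1e-5, atol=1e-7 a ~1.9e-7 absolute difference fails
# whenever |desired| is small. Illustrative values, not from the test:
desired = np.array([5e-3])
actual = desired + 1.9e-7            # abs diff 1.9e-7, rel diff ~3.8e-5

tol = 1e-7 + 1e-5 * np.abs(desired)  # -> 1.5e-7, smaller than the diff
try:
    np.testing.assert_allclose(actual, desired, rtol=1e-5, atol=1e-7)
except AssertionError:
    print("fails at the test's tolerances")

# The same values pass once the tolerances allow for fp32 reduction noise:
np.testing.assert_allclose(actual, desired, rtol=1e-4, atol=1e-6)
print("passes with rtol=1e-4, atol=1e-6")
```

If the ROCm results are otherwise sane, loosening the test tolerances (or comparing against a float64 CPU reference) may be a more appropriate fix than hunting for a specific torch/ROCm version.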