mirror of
https://github.com/vale981/ray
synced 2025-03-08 19:41:38 -05:00
This includes most of the TF code used for the OSDI experiment.

Perf sanity check on p3.16xl instances: overall scaling looks OK, with the best multi-node result (1712 images/s) about 6% below the OSDI final number (1817 images/s). This seems reasonable given that hugepages are not enabled here and the param server shards are placed randomly.

    $ RAY_USE_XRAY=1 ./test_sgd.py --gpu --batch-size=64 --num-workers=N \
        --devices-per-worker=M --strategy=<simple|ps> \
        --warmup --object-store-memory=10000000000

    Images per second total

    gpus total               | simple | ps
    ========================================
    1                        | 218
    2 (1 worker)             | 388
    4 (1 worker)             | 759
    4 (2 workers)            | 176    | 623
    8 (1 worker)             | 985
    8 (2 workers)            | 349    | 1031
    16 (2 nodes, 2 workers)  | 600    | 1661
    16 (2 nodes, 4 workers)  | 468    | 1712  <--- OSDI perf was 1817

||
---|---|---|
jenkins_tests | ||
travis-ci | ||
actor_test.py | ||
array_test.py | ||
autoscaler_test.py | ||
component_failures_test.py | ||
credis_test.py | ||
cython_test.py | ||
failure_test.py | ||
microbenchmarks.py | ||
monitor_test.py | ||
multi_node_test.py | ||
multi_node_test_2.py | ||
recursion_test.py | ||
runtest.py | ||
stress_tests.py | ||
tempfile_test.py | ||
tensorflow_test.py | ||
xray_test.py |
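
The throughput figures quoted in the commit message above can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration: the numbers are copied from the commit message table, while the helper names `relative_gap` and `scaling_efficiency` are hypothetical and are not part of `test_sgd.py` or Ray. It assumes that ideal scaling means N GPUs delivering N times the single-GPU rate of 218 images/s.

```python
# Throughput figures (images per second) copied from the commit message table.
SINGLE_GPU = 218    # 1 GPU result
OSDI_16_GPU = 1817  # OSDI figure quoted in the commit message

# (total GPUs, configuration) -> images per second with --strategy=ps
ps_results = {
    (4, "2 workers"): 623,
    (8, "2 workers"): 1031,
    (16, "2 nodes, 2 workers"): 1661,
    (16, "2 nodes, 4 workers"): 1712,
}


def relative_gap(measured, reference):
    """Fraction by which `measured` falls short of `reference` (hypothetical helper)."""
    return (reference - measured) / reference


def scaling_efficiency(throughput, num_gpus, single_gpu=SINGLE_GPU):
    """Throughput relative to ideal linear scaling from one GPU (hypothetical helper)."""
    return throughput / (num_gpus * single_gpu)


for (gpus, config), ips in ps_results.items():
    eff = scaling_efficiency(ips, gpus)
    print(f"{gpus:>2} GPUs ({config}): {ips} img/s, efficiency {eff:.0%}")

best = ps_results[(16, "2 nodes, 4 workers")]
print(f"Gap to OSDI figure: {relative_gap(best, OSDI_16_GPU):.1%}")
```

With these inputs, the best 16-GPU `ps` run comes out a little under 6% below the OSDI figure and at roughly half of ideal linear scaling from a single GPU.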