ray/test
Eric Liang af0c1174cd
[sgd] Merge sharded param server based SGD implementation (#3033)
This includes most of the TF code used for the OSDI experiment. Perf sanity check on p3.16xl instances: overall scaling looks OK, with the 16-GPU, 4-worker result within about 6% of the final OSDI number (1712 vs. 1817 images per second). This seems reasonable given that hugepages are not enabled here and the param server shards are placed randomly.

$ RAY_USE_XRAY=1 ./test_sgd.py --gpu --batch-size=64 --num-workers=N \
  --devices-per-worker=M --strategy=<simple|ps> \
  --warmup --object-store-memory=10000000000

Images per second total
gpus total              | simple | ps
========================================
1                       | 218
2 (1 worker)            | 388
4 (1 worker)            | 759
4 (2 workers)           | 176    | 623
8 (1 worker)            | 985
8 (2 workers)           | 349    | 1031
16 (2 nodes, 2 workers) | 600    | 1661
16 (2 nodes, 4 workers) | 468    | 1712   <--- OSDI perf was 1817
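
The comparison against the OSDI figure and the scaling trend can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of `test_sgd.py`: the dictionary layout and the helper names `scaling_efficiency` and `relative_gap` are illustrative assumptions, and the numbers are copied from the `ps` column and the 1-GPU row of the table above.

```python
# Throughput figures (images/sec) copied from the "ps" column of the table.
# Keys are (total GPUs, number of workers); this layout is illustrative only.
PS_THROUGHPUT = {
    (4, 2): 623,
    (8, 2): 1031,
    (16, 2): 1661,
    (16, 4): 1712,
}
SINGLE_GPU = 218        # 1-GPU row, used as the baseline
OSDI_REFERENCE = 1817   # figure quoted next to the 16-GPU, 4-worker row


def scaling_efficiency(images_per_sec, num_gpus, baseline=SINGLE_GPU):
    """Fraction of ideal linear speedup relative to the 1-GPU baseline."""
    return images_per_sec / (baseline * num_gpus)


def relative_gap(measured, reference):
    """How far `measured` falls below `reference`, as a fraction of `reference`."""
    return (reference - measured) / reference


if __name__ == "__main__":
    for (gpus, workers), ips in sorted(PS_THROUGHPUT.items()):
        eff = scaling_efficiency(ips, gpus)
        print(f"{gpus:>2} GPUs, {workers} workers: {ips:>5} img/s, {eff:.0%} of linear")
    gap = relative_gap(PS_THROUGHPUT[(16, 4)], OSDI_REFERENCE)
    print(f"Gap to OSDI figure: {gap:.1%}")
```

Using the 1-GPU row as the baseline is a simplifying choice; a per-node baseline (for example the 8-GPU, 1-worker row) would give a different efficiency figure for the multi-node runs.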
2018-10-27 21:25:02 -07:00
jenkins_tests [sgd] Merge sharded param server based SGD implementation (#3033) 2018-10-27 21:25:02 -07:00
travis-ci Migrate repositories to ray-project. (#438) 2016-09-17 00:52:05 -07:00
actor_test.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00
array_test.py Convert some unittests to pytest. (#2779) 2018-08-31 11:24:15 -07:00
autoscaler_test.py [autoscaler] Cleanup Logging (#2709) 2018-08-25 17:08:45 -07:00
component_failures_test.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00
credis_test.py Convert some unittests to pytest. (#2779) 2018-08-31 11:24:15 -07:00
cython_test.py Convert asserts in unittest to pytest (#2529) 2018-08-01 22:32:10 -07:00
failure_test.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00
microbenchmarks.py Convert some unittests to pytest. (#2779) 2018-08-31 11:24:15 -07:00
monitor_test.py Re-enable sharded monitor test for xray, convert to pytest. (#2804) 2018-09-01 19:53:40 -07:00
multi_node_test.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00
multi_node_test_2.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00
recursion_test.py Fix text verbosity in python 2.7 by running tests with pytest (#2470) 2018-07-30 11:04:06 -07:00
runtest.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00
stress_tests.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00
tempfile_test.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00
tensorflow_test.py Convert some unittests to pytest. (#2779) 2018-08-31 11:24:15 -07:00
xray_test.py Remove legacy Ray code. (#3121) 2018-10-26 13:36:58 -07:00