20 commits

Author SHA1 Message Date
Eric Liang
e5863d7914
Force tune tests to run in direct call mode (#6301)
* force tune direct mode

* force tune

* fix

* Update run_multi_node_tests.sh
2019-11-27 19:58:33 -08:00
Eric Liang
64a3a7239e
Set RAY_FORCE_DIRECT=1 for run_rllib_tests, test_basic (#6171) 2019-11-25 14:12:11 -08:00
daiyaanarfeen
8f6d73a93a
[sgd] Extend distributed pytorch functionality (#5675)
* raysgd

* apply fn

* double quotes

* removed duplicate TimerStat

* removed duplicate find_free_port

* imports in pytorch_trainer

* init doc

* ray.experimental

* remove resize example

* resnet example

* cifar

* Fix up after kwargs

* data_dir and dataloader_workers args

* formatting

* loss

* init

* update code

* lint

* smoketest

* better_configs

* fix

* fix

* fix

* train_loader

* fixdocs

* ok

* ok

* fix

* fix_update

* fix

* fix

* done

* fix

* fix

* fix

* small

* lint

* fix

* fix

* fix_test

* fix

* validate

* fix

* fi
2019-11-05 11:16:46 -08:00
Richard Liaw
d52a4983af
Update TF documentation (#5918) 2019-10-16 01:31:27 -07:00
Richard Liaw
10f21fa313
[docs] Convert Examples to Gallery (#5414) 2019-09-24 15:46:56 -07:00
jichan3751
1711e202a3
[training] Tensorflow interface for MultiNode SGD (#5440) 2019-09-03 15:35:42 -07:00
Richard Liaw
411f30c125
[docs] Second push of changes (#5391) 2019-08-28 17:54:15 -07:00
jichan3751
de95117e96
[sgd] Tune interface for Pytorch MultiNode SGD (#5350) 2019-08-10 13:51:44 -07:00
Richard Liaw
7e715520e5
[sgd] Example for Training (#5292) 2019-07-27 01:10:25 -07:00
Peter Schafhalter
c2ade075a3
[sgd] Distributed Training via PyTorch (#4797)
Implements distributed SGD using distributed PyTorch.
2019-06-01 21:39:22 -07:00
Eric Liang
ce66a552bf
Move large mem test to end (#4664) 2019-04-19 11:43:22 -07:00
Robert Nishihara
fd2d8c2c06
Remove Jenkins backend tests and add new long running stress test. (#4288) 2019-03-08 15:29:39 -08:00
Eric Liang
437459f40a
[build] Make travis logs not as long (#4213)
* clean it up

* Update .travis.yml

* Update .travis.yml

* update

* fix example

* suppress

* timeout

* print periodic progress

* Update suppress_output

* Update run_silent.sh

* Update suppress_output

* Update suppress_output

* manually do timeout

* sleep 300

* fix test

* Update run_silent.sh

* Update suppress_output

* Update .travis.yml
2019-03-07 12:09:03 -08:00
Richard Liaw
a27cb225b6
Modularize Tune tests from multi-node tests (#4204) 2019-03-02 19:21:08 -08:00
Robert Nishihara
4b89eebfc7
Move test folders under rllib/tune from test -> tests. (#4214) 2019-03-02 13:37:16 -08:00
Eric Liang
b809ef0107
[rllib] Silent tests (#4151) 2019-02-28 16:32:22 -08:00
Philipp Moritz
9ca9691cdc
Fix mnist sgd jenkins tests on master (#4168) 2019-02-27 16:02:18 -08:00
Kristian Hartikainen
524e69a82d
[autoscaler] Change the get behavior of node providers' _get_node (#4132)
* Change the get behavior of GCPNodeProvider._get_node

* Add lock around the GCPNodeProvider._get_node call

* rename nodes

* lint

* Update GCPNodeProvider._get_node to match aws implementation

* assert

* log

* log highest heartbeats

* rename

* bringup to connected

* prune heartbeat times

* fix bringup
2019-02-24 18:43:35 -08:00
Eric Liang
d9da183c7d
[rllib] Custom supervised loss API (#4083) 2019-02-24 15:36:13 -08:00
William Ma
c7a4c74f55
Moving tests from test/ to python/ray/tests/ (#3950) 2019-02-21 11:09:08 -08:00
Renamed from test/jenkins_tests/run_multi_node_tests.sh