ray/release/ml_user_tests/tune_rllib/driver_requirements.txt

# Make sure the driver's package versions match the cluster's versions.
# The cluster uses the ray-ml Docker image.
# The ray-ml Docker image installs dependencies from the ray/python/requirements/ml/ directory.
# We constrain on that directory's requirements file so that the same versions are installed.
-c ../../../python/requirements/ml/requirements_dl.txt

tensorflow
torch
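
# Usage sketch (an assumption, not part of the original file): running
#   pip install -r driver_requirements.txt
# installs tensorflow and torch, held to any versions that requirements_dl.txt
# pins for them. The -c line only constrains versions; it installs nothing by
# itself, and pip resolves its relative path against this file's location.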

# Blame:
# - Comment lines, tensorflow, torch: [RLlib] Add an RLlib Tune experiment to UserTest suite. (#19807), Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>, 2021-11-03 17:04:27 -07:00
# - Constraint line (-c ../../../python/requirements/ml/requirements_dl.txt): [Release] Refactor User Tests (#20028), 2021-11-05 17:28:37 -07:00