
* Add an RLlib Tune experiment to UserTest suite. * Add ray.init() * Move example script to example/tune/, so it can be imported as module. * add __init__.py so our new module will get included in python wheel. * Add block device to RLlib test instances. * Reduce disk size a little bit. * Add metrics reporting * Allow max of 5 workers to accommodate all the worker tasks. * revert disk size change. * Minor updates * Trigger build * set max num workers * Add a compute cfg for autoscaled cpu and gpu nodes. * use 1gpu instance. * install tblib for debugging worker crashes. * Manually upgrade to pytorch 1.9.0 * -y * torch=1.9.0 * install torch on driver * bump timeout * Write a more informational result dict. * Revert changes to compute config files that are not used. * add smoke test * update * reduce timeout * Reduce the # of env per worker to 1. * Small fix for getting trial_states * Trigger build * simplify result dict * lint * more lint * fix smoke test Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
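The commit message above mentions writing "a more informational result dict" from the Tune trial states. As a rough illustration of that idea (this is a hypothetical sketch, not the actual release-test code; the function and field names here are invented for the example):

```python
from collections import Counter


def build_result_dict(trial_states, last_update, passed):
    """Summarize a Tune run for a release-test reporter.

    trial_states: list of terminal trial states, e.g. ["TERMINATED", "ERROR"].
    All names in this sketch are illustrative, not the real harness API.
    """
    counts = Counter(trial_states)
    return {
        # Per-state counts, e.g. {"TERMINATED": 4, "ERROR": 1}.
        "trial_states": dict(counts),
        # Timestamp of the last metrics update seen.
        "last_update": last_update,
        # Only pass if the caller says so AND no trial errored.
        "passed": passed and counts.get("ERROR", 0) == 0,
    }


result = build_result_dict(["TERMINATED", "TERMINATED"], 1631000000.0, True)
# result["passed"] is True here because no trial is in the ERROR state.
```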
27 lines
1.1 KiB
YAML
Executable file
base_image: "anyscale/ray-ml:pinned-nightly-py37-gpu"
env_vars: {}
debian_packages:
  - unzip
  - zip

python:
  # These dependencies should be handled by requirements_rllib.txt and
  # requirements_ml_docker.txt.
  pip_packages:
    - torch==1.9.0  # TODO(amogkam): Remove after nightly images are available.
  conda_packages: []

post_build_cmds:
  # Create a couple of soft links so tf 2.4.3 works with cuda 11.2.
  # TODO(jungong): remove these once the product ray-ml docker gets upgraded to use tf 2.5.0.
  - sudo ln -s /usr/local/cuda /usr/local/nvidia
  - sudo ln -s /usr/local/cuda/lib64/libcusolver.so.11 /usr/local/cuda/lib64/libcusolver.so.10
  - pip install tensorflow==2.5.0
  # END: TODO

  - pip uninstall -y ray || true
  - pip3 install -U {{ env["RAY_WHEELS"] | default("ray") }}
  - {{ env["RAY_WHEELS_SANITY_CHECK"] | default("echo No Ray wheels sanity check") }}

  # Clone the rl-experiments repo for offline-RL files.
  - git clone https://github.com/ray-project/rl-experiments.git
  - cp rl-experiments/halfcheetah-sac/2021-09-06/halfcheetah_expert_sac.zip ~/.
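The `{{ env["RAY_WHEELS"] | default("ray") }}` lines in `post_build_cmds` are Jinja2 template expressions: if the `RAY_WHEELS` environment variable is not set, Jinja's `default` filter substitutes the literal string `ray`, so the command falls back to installing Ray from PyPI. A minimal sketch approximating that expansion in plain Python (the function name is invented for this example; the real rendering is done by the test harness's templating):

```python
import os


def render_install_cmd(env=None):
    """Approximate what the templated post_build_cmds line expands to:

        pip3 install -U {{ env["RAY_WHEELS"] | default("ray") }}

    When RAY_WHEELS is unset, Jinja's `default` filter kicks in and the
    command installs the "ray" package from PyPI instead of a wheel URL.
    """
    if env is None:
        env = os.environ
    wheel = env.get("RAY_WHEELS", "ray")
    return f"pip3 install -U {wheel}"


print(render_install_cmd({}))  # pip3 install -U ray
print(render_install_cmd({"RAY_WHEELS": "my-nightly.whl"}))  # pip3 install -U my-nightly.whl
```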