* Small comment
* WIP, not passing ray.init
* Fixed small problem
* wip
* Pseudo interrupt things
* Basic prototype operational
* correct proc title
* Mostly done
* Cleanup
* cleaner raylet error
* Cleaning up a few loose ends
* Fixing race conditions
* Prelim testing
* Fixing comments and adding second_check for kill
* Working_new_impl
* demo_ready
* Fixing my English
* Fixing a few problems
* Small problems
* Cleaning up
* Response to changes
* Fixing error passing
* Merged to master
* fixing lock
* Cleaning up print statements
* Format
* Fixing Unit test build failure
* mock_worker fix
* java_fix
* Cancel
* Switching to Cancel
* Responding to Review
* Fix formatting
* Lease cancellation
* Final comments?
* Moving exist check to CoreWorker
* Fix Actor Transport Test
* Fixing task manager test
* changing clock repr
* Fix build
* fix white space
* lint fix
* Updating to medium size
* Fixing Java test compilation issue
* lengthen bad timeouts
- Translate all vtrace functionality to torch and add torch to the framework_iterator loop in all existing vtrace test cases.
- Add learning test cases for APPO torch (both w/ and w/o v-trace).
- Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).
* Add validation to endpoint and backend creation to ensure that no duplicates of either are created. Also add validation to split_traffic to make sure that both the endpoint and the backends exist.
* Fix test to deal with removed serve.link
* Address PR feedback
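
The duplicate and existence checks described above can be illustrated with a minimal sketch. `RouteTable` and its method bodies are hypothetical and are not the Ray Serve implementation; only the names `create_endpoint`, `create_backend`, and `split_traffic` mirror the operations mentioned in this change, and their signatures here are assumptions.

```python
class RouteTable:
    """Hypothetical in-memory registry, used only to illustrate the checks."""

    def __init__(self):
        self.endpoints = set()
        self.backends = set()
        self.traffic = {}  # endpoint name -> {backend tag: weight}

    def create_endpoint(self, name):
        # Reject a second endpoint with the same name.
        if name in self.endpoints:
            raise ValueError(f"Endpoint '{name}' already exists.")
        self.endpoints.add(name)

    def create_backend(self, tag):
        # Reject a second backend with the same tag.
        if tag in self.backends:
            raise ValueError(f"Backend '{tag}' already exists.")
        self.backends.add(tag)

    def split_traffic(self, endpoint, weights):
        # Both the endpoint and every referenced backend must already exist.
        if endpoint not in self.endpoints:
            raise ValueError(f"Endpoint '{endpoint}' does not exist.")
        missing = [tag for tag in weights if tag not in self.backends]
        if missing:
            raise ValueError(f"Backends {missing} do not exist.")
        self.traffic[endpoint] = dict(weights)
```

Failing early with an explicit error keeps a typo in a backend tag from silently producing a traffic policy that routes to nothing.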
Co-authored-by: Max Fitton <max@semprehealth.com>
This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.
* reply to the owner only after the actor is successfully created.
* reply immediately if the actor is already created
* fix comment
* add test_actor_creation_task provided by @Stephanie Wang
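
The reply ordering described above (defer the reply until creation succeeds, reply at once if the actor already exists) can be sketched as follows. `ActorRegistry`, `register`, and `on_created` are hypothetical names chosen for illustration and do not correspond to the actual C++ classes touched by this change.

```python
class ActorRegistry:
    """Hypothetical stand-in for the component that tracks actor creation."""

    def __init__(self):
        self._created = set()
        self._pending = {}  # actor_id -> list of reply callbacks

    def register(self, actor_id, reply):
        # Actor already created: reply immediately.
        if actor_id in self._created:
            reply(actor_id)
            return
        # Otherwise hold the reply until creation succeeds.
        self._pending.setdefault(actor_id, []).append(reply)

    def on_created(self, actor_id):
        # Creation succeeded: mark the actor and flush deferred replies.
        self._created.add(actor_id)
        for reply in self._pending.pop(actor_id, []):
            reply(actor_id)
```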
Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>
SAC (both torch and tf versions) crashes due to numeric instabilities in the SquashedGaussian distribution (sampling and logp after extreme NN outputs).
This PR fixes these instabilities. Stable MuJoCo learning (HalfCheetah) has been confirmed on both the tf and torch versions. A distribution stability test (using extreme NN outputs) has been added for SquashedGaussian; it can be used for any other type of distribution as well.
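
To illustrate the kind of instability involved, here is a minimal scalar sketch of a tanh-squashed Gaussian log-density that stays finite for extreme inputs. It is not the code from this PR: the clipping bounds and the softplus rewrite of the Jacobian term are assumptions about one common way to stabilize the computation, and all names are hypothetical.

```python
import math

# Assumed clipping bounds for log_std; not taken from the PR.
LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def squashed_gaussian_logp(u: float, mean: float, log_std: float) -> float:
    """Log-density of a = tanh(u), where u ~ Normal(mean, exp(log_std))."""
    # Clip log_std so exp() can neither overflow nor collapse to zero.
    log_std = min(max(log_std, LOG_STD_MIN), LOG_STD_MAX)
    std = math.exp(log_std)
    normal_logp = (
        -0.5 * ((u - mean) / std) ** 2 - log_std - 0.5 * math.log(2.0 * math.pi)
    )
    # log(1 - tanh(u)^2) rewritten as 2 * (log 2 - u - softplus(-2u)),
    # which stays finite even when tanh(u) rounds to +/-1.
    log_det = 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))
    return normal_logp - log_det
```

The naive form `log(1 - tanh(u)**2)` evaluates `log(0)` once `tanh(u)` saturates in floating point, which is the failure mode a stability test with extreme network outputs is meant to catch.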
* Added small section on installation when using Anaconda. Also fixed an obsolete link to Anaconda.
* Delete more temporary directories when running "make clean" for the docs.
* Fine-tuning the core Ray API documentation
* Fix doc lines that were too long
Co-authored-by: Dean Wampler <dean@concurrentthought.com>