* Add the Delete function in GCS
* Unify BatchDelete and Delete
* Fix comment
* Lint
* Refine according to comments
* Unify test.
* Address comment
* C++ lint
* Update ray_redis_module.cc
## What do these changes do?
It saves checkpoint if needed regardless of what the scheduler have returned. Until now, it have not saved the checkpoint when scheduler returned TrialScheduler.PAUSE, which caused PopulationBasedTraining preventing to save any checkpoints in certain cases. See issue #4041 for more details.
## Related issue number
#4041
* Implement Actor checkpointing
* docs
* fix
* fix
* fix
* move restore-from-checkpoint to HandleActorStateTransition
* Revert "move restore-from-checkpoint to HandleActorStateTransition"
This reverts commit 9aa4447c1e3e321f42a1d895d72f17098b72de12.
* resubmit waiting tasks when actor frontier restored
* add doc about num_actor_checkpoints_to_keep=1
* add num_actor_checkpoints_to_keep to Cython
* add checkpoint_expired api
* check if actor class is abstract
* change checkpoint_ids to long string
* implement java
* Refactor to delay actor creation publish until checkpoint is resumed
* debug, lint
* Erase from checkpoints to restore if task fails
* fix lint
* update comments
* avoid duplicated actor notification log
* fix unintended change
* add actor_id to checkpoint_expired
* small java updates
* make checkpoint info per actor
* lint
* Remove logging
* Remove old actor checkpointing Python code, move new checkpointing code to FunctionActionManager
* Replace old actor checkpointing tests
* Fix test and lint
* address comments
* consolidate kill_actor
* Remove __ray_checkpoint__
* fix non-ascii char
* Loosen test checks
* fix java
* fix sphinx-build
* Stream logs to driver by default.
* Fix from rebase
* Redirect raylet output independently of worker output.
* Fix.
* Create redis client with services.create_redis_client.
* Suppress Redis connection error at exit.
* Remove thread_safe_client from redis.
* Shutdown driver threads in ray.shutdown().
* Add warning for too many log messages.
* Only stop threads if worker is connected.
* Only stop threads if they exist.
* Remove unnecessary try/excepts.
* Fix
* Only add new logging handler once.
* Increase timeout.
* Fix tempfile test.
* Fix logging in cluster_utils.
* Revert "Increase timeout."
This reverts commit b3846b89040bcd8e583b2e18cb513cb040e71d95.
* Retry longer when connecting to plasma store from node manager and object manager.
* Close pubsub channels to avoid leaking file descriptors.
* Limit log monitor open files to 200.
* Increase plasma connect retries.
* Add comment.