Dmitri Gekhtman
|
8971422d8f
|
[autoscaler] Use drain node api in autoscaler before terminating nodes (#20013)
* wip
* Draft
* Use bytes for node id
* remove stray helm change
* fix autoscaler init arg
* don't forget to instantiate new load metrics dict
* remove extraneous diff
* Timeout, comments, function signature.
* typo
* another comment
* tweak
* docstring
* shorter timeout
* Use a better error code
* missing self
* Dedent example
* Add drain node prometheus metric.
* comment
* Update tests part 1: test_autoscaler.py
* Update tests part 2: test_resource_demand_scheduler
* lint
* Update tests part 3: test_autoscaling_policy
* Unit tests for new Prometheus metric and DrainNode error handling.
* comment
* removed unused function
* Try adding ability to mock out process termination to fake node provider
* Add integration test.
* fix
* fix
* lint
* Improve log message
* fix
* Simplify test
* Fix doc example
* remove unused dict
* Mock out process termination in a subclass
* Add docstring and comment explaining prune active ips.
* Comment: wtf is use_node_id_as_ip
* one more comment
* more explanation
* period
* tweak
|
2021-11-11 08:31:40 -08:00 |
|