This PR is an edit pass on the Performance Tuning page after reading it with fresh eyes. None of the content was out of date so it's mostly nits and rewording some parts that were slightly confusing.
- Move autoscaling architecture from autoscaling page to architecture page
- Update architecture page
- Remove "Router" actor
- Update description of ServeHandle
- Update defaults about HTTPproxy (default one on each node -> default just one per cluster, on the head node)
- Add note about fault tolerance in different failure scenarios
- Assorted typos/usage nits
Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
- new section of doc for autoscaling (introduction of serve autoscaling and config parameter)
- Remove the version requirement note inside the doc
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
Co-authored-by: Archit Kulkarni <architkulkarni@users.noreply.github.com>