hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-10 05:16:49 -04:00

Author	SHA1	Message	Date
Kai Yang	1d5bceddf0	fix java UT about multi-threading (#8014 )	2020-04-27 15:11:22 +08:00
Sven Mika	7ec2223c84	[RLlib] DDPG PyTorch actor-model was missing sigmoid layer (#8188 ) Fix DDPG PyTorch (missing sigmoid layer (to squash action outputs) after deterministic action outputs).	2020-04-26 23:08:13 +02:00
mehrdadn	b9de9dadd7	Fix Windows build (#8186 ) Co-authored-by: Mehrdad <noreply@github.com>	2020-04-26 13:07:25 -07:00
chaokunyang	5cf49d5edd	Fix streaming ci (#8159 )	2020-04-26 20:56:58 +08:00
fangfengbin	5bff707d20	[GCS]Add in-memory store client (#8144 )	2020-04-26 19:09:26 +08:00
ZhuSenlin	9255fcd516	[GCS] Add node failure detector (#8119 )	2020-04-26 19:08:27 +08:00
fangfengbin	c5d181e3d9	gcs adapts to worker table pub sub (#8182 ) Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>	2020-04-26 17:58:55 +08:00
Richard Liaw	5bc6e32c0a	[autoscaler] latest_dlami update (#8178 )	2020-04-26 00:25:46 -07:00
fangfengbin	f17bea2de5	Fix get gcs server address block bug (#8126 )	2020-04-26 10:01:06 +08:00
Tomasz Wrona	b508166419	Copy initial state of an RNN to a CPU before converting it to a NumPy array (#8097 )	2020-04-25 18:49:09 -07:00
Richard Liaw	b506f87117	[tune] New Doc edits, add Concepts page (#8083 ) Co-Authored-By: Sven Mika <sven@anyscale.io>	2020-04-25 18:25:56 -07:00
ijrsvt	69ff7e3e35	TaskCancellation (#7669 ) * Smol comment * WIP, not passing ray.init * Fixed small problem * wip * Pseudo interrupt things * Basic prototype operational * correct proc title * Mostly done * Cleanup * cleaner raylet error * Cleaning up a few loose ends * Fixing Race Conds * Prelim testing * Fixing comments and adding second_check for kill * Working_new_impl * demo_ready * Fixing my english * Fixing a few problems * Small problems * Cleaning up * Response to changes * Fixing error passing * Merged to master * fixing lock * Cleaning up print statements * Format * Fixing Unit test build failure * mock_worker fix * java_fix * Canel * Switching to Cancel * Responding to Review * FixFormatting * Lease cancellation * FInal comments? * Moving exist check to CoreWorker * Fix Actor Transport Test * Fixing task manager test * chaning clock repr * Fix build * fix white space * lint fix * Updating to medium size * Fixing Java test compilation issue * lengthen bad timeouts	2020-04-25 16:04:52 -07:00
Richard Liaw	9dd3490c38	[tune] Safer try-catch for TensorboardX (#8174 ) Co-Authored-By: Kristian Hartikainen <kristian.hartikainen@gmail.com>	2020-04-25 13:08:37 -07:00
Simon Mo	13c14eac07	[Asyncio] Remove async init legacy code (#8177 ) * [Asyncio] Remove async init legacy code * Fix places that call async_init	2020-04-25 09:32:38 -07:00
Edward Oakes	9dc625318f	[serve] Add basic test for specifying the method in a serve call (#8172 )	2020-04-24 20:15:27 -05:00
Scott Graham	0dc01d8c1e	[autoscaler] Azure versioning (#8168 )	2020-04-24 17:03:55 -07:00
fangfengbin	38dfe5db86	remove store client template (#8160 ) Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>	2020-04-24 21:19:12 +08:00
fangfengbin	713e375d50	[GCS]GCS adapts to job table pub sub (#8145 )	2020-04-24 16:33:25 +08:00
Eric Liang	2298f6fb40	[rllib] Port DQN/Ape-X to training workflow api (#8077 )	2020-04-23 12:39:19 -07:00
Sven Mika	499ad5fbe4	[RLlib] PyTorch version of APPO. (#8120 ) - Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases. - Add learning test cases for APPO torch (both w/ and w/o v-trace). - Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).	2020-04-23 09:11:12 +02:00
Sven Mika	e9ee5c4e5f	[RLlib] Nested action space PR (minimally invasive; torch only + test). (#8101 ) - Add TorchMultiActionDistribution class. - Add framework-agnostic test cases for TorchMultiActionDistribution.	2020-04-23 09:09:22 +02:00
Nick Matthews	a9d8d16b6b	Change memory monitor warning to a logging call (#8137 )	2020-04-22 21:29:18 -07:00
yncxcw	51559c08b9	Fix mis-memory counting in memory monitor for contaienr environment (#8113 ) Co-authored-by: weich <weich@nvidia.com>	2020-04-22 14:32:35 -07:00
Edward Oakes	0bb918f2b1	Disable eager execution to fix test_tensorflow (#8133 )	2020-04-22 15:54:42 -05:00
Edward Oakes	f9f41e5a1a	[serve] Fix nonblocking serve.init() (#8068 )	2020-04-22 11:51:27 -05:00
Tianyi Chen	0204dff1e9	[streaming]Add master and scheduler. (#8044 )	2020-04-22 14:43:56 +08:00
Max Fitton	c486b56c58	Improve Serve API Input Validations (#8124 ) * Add additional validation to endpoint and backend creation that ensures there are not duplicates created of either of these. In addition, adds additional validation to split_traffic to make sure both the endpoint and backends exist. * Fix test to deal with removed serve.link * Address PR feedback Co-authored-by: Max Fitton <max@semprehealth.com>	2020-04-21 19:45:29 -07:00
Simon Mo	95e8ec8c47	[CI] Dashboard+ Tensorboard Lint Hotfix (#8125 )	2020-04-21 16:52:58 -07:00
Edward Oakes	505f3a8714	[serve] Remove serve.link(), rename serve.split() -> serve.set_traffic() (#8072 )	2020-04-21 14:26:07 -05:00
Richard Liaw	6799fbbd5e	[dashboard] Temporarily disable tensorboard (#8121 )	2020-04-21 10:40:46 -07:00
mehrdadn	0a54407961	[CI] Factor out more Travis code and update GitHub Actions (#8085 )	2020-04-21 09:53:08 -07:00
Richard Liaw	fa7eecf48a	[sgd] Avoid parameter "gotcha" for learning rate scheduler (#8107 ) * with-scheduler-creator * none * add_freq * runner * torch	2020-04-21 01:01:04 -07:00
Sven Mika	d15609ba2a	[RLlib] PyTorch version of ARS (Augmented Random Search). (#8106 ) This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.	2020-04-21 09:47:52 +02:00
Qing Wang	d66d12661b	Improve the perf of constructing actor task specs. (#8093 )	2020-04-21 11:54:09 +08:00
Stephanie Wang	eefea4e29c	[core] Post task submission to IO loop (#8090 ) * Post to IO loop * Unused * Fix build	2020-04-20 19:13:50 -07:00
Ujval Misra	708dff6d8f	[tune] Stop-gap fix for PBT checkpointing (#7794 ) * Fix PBT * lint * reset * rm * tests Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-04-20 15:10:36 -07:00
Edward Oakes	213d3894ca	Remove serve.route decorator (#8108 )	2020-04-20 16:22:25 -05:00
Stephanie Wang	1323e1753d	[core] When reconstruction is enabled, pin objects created by ray.put() (#8021 ) * Unit test and pin ray.put objects until they have no more lineage references * c++ tests * lint * Mark ray.put objects as pinned	2020-04-20 13:09:54 -07:00
Eric Liang	17e3c545d9	[rllib] Fix truncate episodes mode in central critic example (#8073 )	2020-04-20 12:58:01 -07:00
Sven Mika	3812bfedda	[RLlib] PyTorch version of ES (Evolution Strategies). (#8104 ) PyTorch version of Evolution Strategies (ES) Algo.	2020-04-20 21:47:28 +02:00
Richard Liaw	9f3e9e7e9f	[tune] Add more intensive tests (#7667 ) * make_heavier_tests * help	2020-04-20 11:14:44 -07:00
Edward Oakes	793e616a2d	Fix job table parsing (#8070 )	2020-04-20 12:56:43 -05:00
Bill Chambers	77655749fb	[RayServe] RayServe Introduction and Overview (#8038 )	2020-04-20 12:05:59 -05:00
Sven Mika	d6cb7d865e	[RLlib] Torch DQN (APEX) TD-Error/prio. replay fixes. (#8082 ) PyTorch APEX_DQN with Prioritized Replay enabled would not work properly due to the td_error not being retrievable by the AsyncReplayOptimizer.	2020-04-20 10:03:25 +02:00
mehrdadn	c8b9a357f2	Try to fix dependency issue (#8065 ) Co-authored-by: Mehrdad <noreply@github.com>	2020-04-19 16:09:29 -07:00
ZhuSenlin	3f28a8a229	[GCS] reply to the owner only after the actor has been successfully created. (#8079 ) * reply to the owner only after the actor is successfully created. * reply immediately if the actor is already created * fix comment * add test_actor_creation_task provided by @Stephanie Wang Co-authored-by: senlin.zsl <senlin.zsl@antfin.com>	2020-04-19 09:53:02 -07:00
Edward Oakes	da296bf8c5	[serve] Router fault tolerance (#8008 )	2020-04-19 11:04:06 -05:00
Sven Mika	165a86f1ab	[RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063 ) SAC (both torch and tf versions) are showing issues (crashes) due to numeric instabilities in the SquashedGaussian distribution (sampling + logp after extreme NN outputs). This PR fixes these. Stable MuJoCo learning (HalfCheetah) has been confirmed on both tf and torch versions. A Distribution stability test (using extreme NN outputs) has been added for SquashedGaussian (can be used for any other type of distribution as well).	2020-04-19 10:20:23 +02:00
Sumanth Ratna	bdb03a0544	[tune] Update dragonfly installation instructions (#8086 ) Closes #8084	2020-04-18 20:25:38 -07:00
Dean Wampler	5d2885c609	Minor Ray API doc refinements (#8060 ) * Added small section on installation when using Anaconda. Also fixed an obsolete link to Anaconda. * Delete more temporary directories when running the doc "make clean". * Fine-tuning the core Ray API documentation * Fix doc lines that were too long Co-authored-by: Dean Wampler <dean@concurrentthought.com>	2020-04-18 15:19:35 -07:00

4488 commits